A Microsoft Office (Excel, Word) forum. OfficeFrustration

If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

Go Back   Home » OfficeFrustration forum » Microsoft Word » General Discussion
Site Map Home Register Authors List Search Today's Posts Mark Forums Read  

Unique Wordcount



 
 
Thread Tools Display Modes
  #1  
Old June 1st, 2004, 11:56 AM
Frank
external usenet poster
 
Posts: n/a
Default Unique Wordcount

How can Word count the unique number of words. eg: "apple pear apple" should count only two words since apple appears twice.
Thanks a lot, Frank
  #2  
Old June 1st, 2004, 12:44 PM
Jezebel
external usenet poster
 
Posts: n/a
Default Unique Wordcount

Word can't do this easily. If the document is not too long, you can do it by
converting the document to a single word list, copy and paste into Excel,
then do a unique filter.

To convert the document to a wordlist, use find and replace to convert all
spaces to paragraph marks, and to remove all other non-alpha characters.


"Frank" wrote in message
...
How can Word count the unique number of words. eg: "apple pear apple"

should count only two words since apple appears twice.
Thanks a lot, Frank



  #3  
Old June 1st, 2004, 01:31 PM
Frank
external usenet poster
 
Posts: n/a
Default Unique Wordcount

That is precious information .
Abundance of thanks Jezebel,
Frank
  #4  
Old June 1st, 2004, 01:54 PM
Greg
external usenet poster
 
Posts: n/a
Default Unique Wordcount

This macro, bulk of which is the fruits of another's
labor, will do a nice job as well:

Sub WordFrequency()

Dim SingleWord As String 'Raw word pulled from
doc
Const maxwords = 9000 'Maximum unique words
allowed
Dim Words(maxwords) As String 'Array to hold unique
words
Dim Freq(maxwords) As Integer 'Frequency counter for
Unique Words
Dim WordNum As Integer 'Number of unique words
Dim ByFreq As Boolean 'Flag for sorting order
Dim ttlwds As Long 'Total words in the
document
Dim Excludes As String 'Words to be excluded
Dim Found As Boolean 'Temporary flag
Dim j, k, l, Temp As Integer 'Temporary variables
Dim IngWordCount As Long 'Total non-excluded
words in document
Dim NonWordObjects As Long
Dim AllWordOjects As Long
Dim TotalWords As Long
Dim tword As String '

'Set up excluded words
'Excludes = "[pickleloaf][gruntbutter]"
'Excludes = Excludes & InputBox$("The following words are
excluded by default: " & Excludes & ". Enter additional
words that you wish to exclude, surrounding each word with
[ ].", "Excluded Words", "")

Excludes = InputBox$("Enter words that you wish to
exclude. Place each word within square brackets [ ].
Example: [is][a].", "Excluded Words", "")

'Find out how to sort

ByFreq = True
Ans = InputBox$("Default sort order is word freqeuncy. To
sort alphabetically by word, type Word in the field
below.", "Sort order", "FREQ")
If Ans = "" Then End
If UCase(Ans) = "WORD" Then
ByFreq = False
End If
Selection.HomeKey Unit:=wdStory
System.Cursor = wdCursorWait
WordNum = 0
ttlwds = ActiveDocument.Words.Count
'AllWordObjects = ActiveDocument.Words.Count
'TotalWords = NonWordObjects
'Control the repeat
For Each aword In ActiveDocument.Words
SingleWord = Trim(LCase(aword))
If SingleWord "a" Or SingleWord "z" Then SingleWord
= "" 'Out of range?
If SingleWord "a" Or SingleWord "z" Then
NonWordObjects = NonWordObjects + 1
'SingleWord = Trim(aword)
'If SingleWord "A" Or SingleWord "z" Then SingleWord
= "" 'Out of range?
If InStr(Excludes, "[" & SingleWord & "]") Then SingleWord
= "" 'On exclude list?
If Len(SingleWord) 0 Then
IngWordCount = IngWordCount + 1
Found = False
For j = 1 To WordNum
If Words(j) = SingleWord Then
Freq(j) = Freq(j) + 1
Found = True
Exit For
End If
Next j
If Not Found Then
WordNum = WordNum + 1
Words(WordNum) = SingleWord
Freq(WordNum) = 1
End If
If WordNum maxwords - 1 Then
j = MsgBox("The maximum array size has been exceeded.
Increase maxwords.", vbOKOnly)
Exit For
End If
End If
ttlwds = ttlwds - 1
StatusBar = "Remaining: " & ttlwds & " Unique: " &
WordNum
Next aword
'Now sort it into word order
For j = 1 To WordNum - 1
k = j
For l = j + 1 To WordNum
If (Not ByFreq And Words(l) Words(k)) Or (ByFreq And Freq
(l) Freq(k)) Then k = l
Next l
If k j Then
tword = Words(j)
Words(j) = Words(k)
Words(k) = tword
Temp = Freq(j)
Freq(j) = Freq(k)
Freq(k) = Temp
End If
StatusBar = "Sorting: " & WordNum - j
Next j
AllWordObjects = ActiveDocument.Words.Count
NonWordObjects = NonWordObjects
TotalWords = AllWordObjects - NonWordObjects
'Now write out the results
tmpName = ActiveDocument.AttachedTemplate.FullName
Documents.Add Template:=tmpName, NewTemplate:=False
Selection.ParagraphFormat.TabStops.ClearAll
With Selection
For j = 1 To WordNum
..TypeText Text:=Words(j) & vbTab & Trim(Str(Freq(j))) &
vbCrLf
Next j
End With
ActiveDocument.Range.Select
Selection.ConvertToTable
Selection.Collapse wdCollapseStart
ActiveDocument.Tables(1).Rows.Add BeforeRow:=Selection.Rows
(1)
ActiveDocument.Tables(1).Cell(1,
1).Range.InsertBefore "Unique Words"
ActiveDocument.Tables(1).Cell(1,
2).Range.InsertBefore "Number of Occurrences"
ActiveDocument.Tables(1).Columns(2).Select
Selection.ParagraphFormat.Alignment = wdAlignParagraphRight
Selection.Collapse wdCollapseStart
ActiveDocument.Tables(1).Rows
(1).Shading.BackgroundPatternColor = wdColorGray20
ActiveDocument.Tables(1).Columns(1).PreferredWidth =
InchesToPoints(4.75)
ActiveDocument.Tables(1).Columns(2).PreferredWidth =
InchesToPoints(1.9)
ActiveDocument.Tables(1).Rows.Add
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 1).Range.InsertBefore "Summary"
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 2).Range.InsertBefore "Total"
ActiveDocument.Tables(1).Rows(ActiveDocument.Table s
(1).Rows.Count).Shading.BackgroundPatternColor =
wdColorGray20
ActiveDocument.Tables(1).Rows.Add
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 1).Range.InsertBefore "Number of Unique
Words in Document"
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 2).Range.InsertBefore Trim(Str(WordNum))
ActiveDocument.Tables(1).Rows(ActiveDocument.Table s
(1).Rows.Count).Shading.BackgroundPatternColor =
wdColorAutomatic
ActiveDocument.Tables(1).Rows.Add
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 1).Range.InsertBefore "Number of Non-
Excluded Words in Document"
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 2).Range.InsertBefore (IngWordCount)
ActiveDocument.Tables(1).Rows.Add
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 1).Range.InsertBefore "Number of Words
(Excluded and Non-Excluded) in Document"
ActiveDocument.Tables(1).Cell(ActiveDocument.Table s
(1).Rows.Count, 2).Range.InsertBefore (TotalWords)
System.Cursor = wdCursorNormal
MsgBox "This document contains " & Trim(Str(WordNum)) & "
unique words. "
MsgBox "This document contains " & IngWordCount & " non-
excluded words. "
MsgBox "This document contains a total of " & TotalWords
& " (excluded and non-excluded) words. "
MsgBox "For more statistics on this document, use
ToolsWord Count in the original document. "
Selection.HomeKey wdStory
End Sub

-----Original Message-----
Word can't do this easily. If the document is not too

long, you can do it by
converting the document to a single word list, copy and

paste into Excel,
then do a unique filter.

To convert the document to a wordlist, use find and

replace to convert all
spaces to paragraph marks, and to remove all other non-

alpha characters.


"Frank" wrote in

message
...
How can Word count the unique number of words.

eg: "apple pear apple"
should count only two words since apple appears twice.
Thanks a lot, Frank



.

 




Thread Tools
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

vB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Forum Jump


All times are GMT +1. The time now is 04:16 PM.


Powered by vBulletin® Version 3.6.4
Copyright ©2000 - 2024, Jelsoft Enterprises Ltd.
Copyright ©2004-2024 OfficeFrustration.
The comments are property of their posters.