Prototype: | (String pS, Byte pCount=True, Byte pCaseSensitive=False, Byte pSort=True),Long,PROC |
pS | String to parse into words |
pCount | Reserved |
pCaseSensitive | If true, the case of each word is preserved. If false, all words are converted to lower case. Defaults to False. |
pSort | If true the words are sorted alphabetically. If false the words are not sorted and are added in order so the words in the queue are in the same order as they are in the string. Defaults to True. |
pAllowDigits | If true then words made up of digits are counted as words. If false words made up of digits are not counted. Defaults to False. (Added: January 19, 2013) |
Returns | The number of words in the string |
This method parses a string into words. All punctuation is first removed using the DepunctuateString method. Note that ONLY unique words are stored, each with a word counter, so the result is not suitable for putting the original text back together. It is useful for extracting the words from a document and counting each occurrence of every word.
NOTE: On September 18, 2009 the 4th parameter was added to make it possible to extract a non-sorted list of the words in the string.
On January 19, 2013 the 5th parameter was added to allow digit words to be counted.
Example:
ITS  ITStringClass
S    String(1024)
I    Long
  Code
  S = 'This and that is this 123'
  ITS.StringToWords(S)          !! Returns 5, excludes the '123'
  ITS.StringToWords(S,,,,True)  !! Returns 6, includes the '123'
  Loop I = 1 To Records(ITS.Words)
    Message('Word ' & I & ': ' & ITS.GetWord(I))
  End
The first call returns 5 and leaves four unique words in the Words queue: "this", "and", "that" and "is". The word "this" has a count of 2, while the others have a count of 1. The second call returns 6 and also adds the numeric word "123" to the queue (see pAllowDigits).
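The following fragment is a sketch of how the return value and the pSort parameter might be used together. It is written in the same style as the example above and is not taken from the class documentation. It assumes an ITS object declared as before; the variable name Total is illustrative only.

```clarion
ITS    ITStringClass
S      String(1024)
Total  Long
I      Long
  Code
  S = 'This and that is this'
  Total = ITS.StringToWords(S,,,False)   ! 4th parameter (pSort) = False: words are not sorted
  Message('Words counted: ' & Total)     ! return value is the number of words in the string
  Loop I = 1 To Records(ITS.Words)       ! the queue holds one entry per unique word
    Message('Word ' & I & ': ' & ITS.GetWord(I))
  End
```

Going by the pSort description, the queue entries here should appear in the order the words first occur in the string ("this", "and", "that", "is") rather than alphabetically.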