Regular Expressions (RegEx)
Count word in one line and <enter>
Badrul Anuar, 07/08/09 07:30 AM

Hi. I have a simple question. I would like to count how many words are in a single line. For example:

<text> I have an apple. The apple is green. I have an apple. </text>

Let's say I use a simple regex, (apple): I will receive 3 matches. Since there are only two lines, how do I get the counter to count only 2? The <enter> is the delimiter.

Thank you.

Peter Boughton, 07/08/09 08:42 AM

Huh? You seem to be asking two questions? Do you want the number of words in the text (12), or the number of non-empty lines (2)?

Because CF's List functions ignore empty elements, you don't need regex to count the lines; you can simply do:

    NumberOfLines = ListLen( Text , Chr(10) )

For the number of words, it is a little more complex - you need to treat any whitespace as a delimiter:

    NumberOfWords = ListLen( Text , ' ' & Chr(13) & Chr(10) & Chr(9) )

This is not perfect (it will count standalone hyphens, ampersands, etc. as words). If you do want to get actual words only, you could try something like this:

    <cfset Words = rematch("\w[\w'-]*", Text) />
    <cfoutput>#ArrayLen(Words)# words</cfoutput>

The \w means "word character" whilst the [\w'-]* means "zero or more word characters, apostrophes or dashes", so it will match things like "half-past" and "o'clock" as single words, but will treat "like - this" or "this & that" as just 2 words.

Anyway, if this isn't what you're after, you'll need to clarify your question.

Badrul Anuar, 07/08/09 10:06 AM

Thank you for replying. Actually, I want to count how many lines contain that word. It does not matter whether the word is repeated, as long as the word is found in the line.

Here is another example. I have 5 lines of sentences.

<text> I have a green apple. I have two apple. Green apple and red apple. I have an orange. My brother has red apple and green apple. I have an orange but my friend give me red apple. I do have orange and two books. The two books are given by my friend. </text>

In this case, I have 4 lines of words.

If I want to count how many lines contain 'apple', I will receive 4 lines.
If I want to count how many lines contain 'red apple', I will receive 2 lines.
If I want to count how many lines contain 'two', I will receive 2 lines.
If I want to count how many lines contain 'two' or 'green', I will receive 4 lines.
If I want to count how many lines contain 'red apple' or 'green apple', I will receive 4 lines.

Badrul Anuar, 07/08/09 10:08 AM

The text contains 5 lines. Sorry for the typing error.

Peter Boughton, 07/08/09 01:33 PM

Ok, the main regex that will do this is:

    (?m-s)^.*?(?:WordToCheck).*$

Except CF8 doesn't support the '-s' flag, which would prevent '.' from matching newlines, so instead I had to do:

    (?m)^.*?(?:WordToCheck)[^\n]*$

And here's the full code example/test to show it in action:

    <!--- Returns the number of lines in InputText that match WordRegex. --->
    <cffunction name="countLinesWithWord" returntype="Numeric" output="false">
        <cfargument name="InputText" type="String"/>
        <cfargument name="WordRegex" type="String"/>
        <cfset var Regex = '(?m)^.*?(?:#Arguments.WordRegex#)[^\n]*$' />
        <cfset var Matches = rematch(Regex,Arguments.InputText) />
        <cfreturn ArrayLen(Matches)/>
    </cffunction>

    <cfsavecontent variable="Text">
    I have a green apple. I have two apple. Green apple and red apple. I have an orange. My brother has red apple and green apple. I have an orange but my friend give me red apple. I do have orange and two books. The two books are given by my friend.
    </cfsavecontent>
    <cfset Text = trim(Text) />

    <!--- Each list item is a regex, so alternation such as two|green works. --->
    <cfloop index="CurWord" list="apple,red apple,two,two|green,(?:red|green) apple">
        <cfoutput><br/>'#CurWord#'=#countLinesWithWord(Text,CurWord)#</cfoutput>
    </cfloop>

Got to rush off now, but let me know if that works as desired, and/or if you want any bits explained.
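
For comparison, here is a minimal sketch of a different approach that is not from the thread: split the text into lines with CF's list handling and test each line on its own with REFind, which avoids any dependence on multiline or dot-matches-newline behaviour. The function name countLinesWithWordByLoop and the Sample variable are hypothetical, chosen only for this illustration. REFind is case-sensitive; REFindNoCase could be swapped in if 'Green' should also count as 'green'.

```cfml
<!--- Hypothetical helper: counts the lines of InputText that match WordRegex.
      Chr(13) and Chr(10) are both list delimiters, and CF list loops skip
      empty elements, so blank lines are ignored. --->
<cffunction name="countLinesWithWordByLoop" returntype="numeric" output="false">
    <cfargument name="InputText" type="string" required="true"/>
    <cfargument name="WordRegex" type="string" required="true"/>
    <cfset var Count = 0 />
    <cfset var CurLine = "" />
    <cfloop index="CurLine" list="#Arguments.InputText#" delimiters="#Chr(13)##Chr(10)#">
        <!--- REFind returns the position of the first match, or 0 if none. --->
        <cfif REFind(Arguments.WordRegex, CurLine) GT 0>
            <cfset Count = Count + 1 />
        </cfif>
    </cfloop>
    <cfreturn Count />
</cffunction>

<!--- Made-up three-line sample: two of the lines contain "apple". --->
<cfset Sample = "red apple" & Chr(10) & "an orange" & Chr(10) & "green apple and red apple" />
<cfoutput>#countLinesWithWordByLoop(Sample, "apple")#</cfoutput> <!--- outputs 2 --->
```

Because each line is tested separately, a repeated word on one line still counts that line only once, which is the behaviour asked for in the thread.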
June 19, 2013