House of Fusion
Home of the ColdFusion Community

Mailing list: ColdFusion Talk (CF-Talk)

Strip multiple words from string

05/16/2004 06:29 PM
Author: Claude Schneegans
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163317

>> Sometimes. But it's seldom that you'll find a solution to a problem that
>> you're unable to define.

Maybe, but sometimes it is much easier to find the solution first, then the problem ;-))

--
_______________________________________
See some cool custom tags here: http://www.contentbox.com/claude/customtags/tagstore.cfm
Please send any spam to this address: piegeacon@internetique.com
Thanks.

05/16/2004 06:20 PM
Author: DougF
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163316

> Just consider that there may be a big difference in the algorithm and the
> processing time between the two approaches of a) stripping duplicates and
> b) not adding duplicates to the assembled string in the first place.

Duplicates result from the assembly of strings. They need to be removed after they are assembled.

> What are you calling a phrase?

A phrase in this case would be two or three words separated from other phrases and words by a comma, for example "word1, phrase one, word2, phrase two, phrase three".

----- Excess quoted text cut - see Original Post for more -----

I feel this solution is too complex for what is needed and would also be a processing time concern.

> If you want to call any string of words between punctuation marks a
> "phrase", then loop over your original string as a list, but don't include
> white space characters among the delimiters.

Will try the loop approach.

Thanks,
Doug

05/16/2004 05:27 PM
Author: Jim McAtee
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163312

> Part of the difficulty is describing the problem... sometimes
> the description evolves as unanticipated results materialize.

Sometimes. But it's seldom that you'll find a solution to a problem that you're unable to define.

> Better description of problem:
> Assemble a number of different strings with the final result
> being a single string of words and phrases that are delimited
> by commas. Strip out duplicate words or phrases in the result.

Just consider that there may be a big difference in the algorithm and the processing time between the two approaches of a) stripping duplicates and b) not adding duplicates to the assembled string in the first place.

> Singular words are allowed in the phrases. Could not the comma
> be used to distinguish between words and phrases?

In the original material? No. In the final list a comma is as good a delimiter as any. What are you calling a phrase? Check out the types and examples of English phrases at the link below and note that commas seldom delineate a phrase.

http://grammar.uoregon.edu/phrases/phrases.html

> Would be difficult to create a dictionary of all words that may
> be in original string.

No doubt. But words are fairly easy to parse - generally anything delimited by white space or punctuation. Short of creating a parser for English grammar, though, I'm not sure how you'd pull out phrases.

If you want to call any string of words between punctuation marks a "phrase", then loop over your original string as a list, but don't include white space characters among the delimiters.

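A minimal sketch of the loop described in the last paragraph, assuming the assembled string looks like Doug's example and that only punctuation (no white space) separates items. The sample string and the variable names `s`, `phrase`, and `result` are illustrative, not from the thread.

```cfml
<!--- Illustrative input: words and multi-word phrases separated by punctuation --->
<cfset s = "word1, phrase one, word2, phrase one, Word1, phrase two">
<cfset result = "">

<!--- White space is deliberately NOT a delimiter, so "phrase one" stays one item --->
<cfloop index="item" list="#s#" delimiters=".,?!;:">
  <!--- Trim the spaces left around each item after splitting --->
  <cfset phrase = Trim(item)>
  <!--- Keep only the first occurrence, ignoring case --->
  <cfif Len(phrase) AND NOT ListFindNoCase(result, phrase)>
    <cfset result = ListAppend(result, phrase)>
  </cfif>
</cfloop>

<!--- Outputs: word1,phrase one,word2,phrase two --->
<cfoutput>#result#</cfoutput>
```

Because the comma is one of the split delimiters, no item can contain a comma, so the default comma-delimited result list is safe to search with ListFindNoCase.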
05/16/2004 04:56 PM
Author: DougF
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163308

Part of the difficulty is describing the problem... sometimes the description evolves as unanticipated results materialize.

Better description of problem:
Assemble a number of different strings with the final result being a single string of words and phrases that are delimited by commas. Strip out duplicate words or phrases in the result.

Singular words are allowed in the phrases. Could not the comma be used to distinguish between words and phrases?

Would be difficult to create a dictionary of all words that may be in original string.

Will play with both Claude's and your suggestions.

Thanks,
Doug

----- Excess quoted text cut - see Original Post for more -----

05/16/2004 04:05 PM
Author: Jim McAtee
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163307

If, as you said, you're just putting together a keyword list, then take Claude's last suggestion. However, distinguishing between a 'word' and a 'phrase', without knowing what constitutes a phrase (that is, without already having a dictionary of what you want to consider to be phrases) is going to be difficult or impossible.

To answer your question... yes. You'd first put together a keyword list and then remove duplicates of those keywords from your string. But from the sound of what you're trying to do, once you've put together the keyword list you're finished.

Try this:

<cfset s = "A string, or maybe not a string. Who knows? And who cares?">
<cfset keywords = "">
<cfloop index="w" list="#s#" delimiters=" .,?!;:%$&""'/|[]{}()">
  <cfif not ListFindNoCase(keywords, w)>
    <cfset keywords = ListAppend(keywords, LCase(w))>
  </cfif>
</cfloop>
<cfoutput>
<pre>
s:        #s#
keywords: #keywords#
</pre>
</cfoutput>

> Sorry should have said:
> "Is there a way to do this with 'OUT' having to specify the words/phrases to
> search for"
> -Doug
> > Thanks Claude,
> > Taking your suggestion I put this function together. My effort works in a
> > simple way, but it is not a total solution to my problem...
> > What I'm attempting to do is stitch together a number of different strings
> > to create a keyword list. After the strings are assembled I need to strip
> > out any duplicate words or phrases delimited by commas. Is there a way to
----- Excess quoted text cut - see Original Post for more -----
front
> > str = rereplaceNoCase(str,"[^[:alnum:]]*$", "");// trims space at rear
> > str = ReplaceNoCase(str,"Word_1","1x1x1x1");// replace first occurrence of
----- Excess quoted text cut - see Original Post for more -----
additional
> > occurrences of word.
> > str = ReplaceNoCase(str,"1x1x1x1","Word_1");// restore original word at
----- Excess quoted text cut - see Original Post for more -----
text,
----- Excess quoted text cut - see Original Post for more -----

05/16/2004 03:31 PM
Author: DougF
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163306

Sorry should have said:
"Is there a way to do this with 'OUT' having to specify the words/phrases to search for"
-Doug

> Thanks Claude,
> Taking your suggestion I put this function together. My effort works in a
> simple way, but it is not a total solution to my problem...
> What I'm attempting to do is stitch together a number of different strings
> to create a keyword list. After the strings are assembled I need to strip
> out any duplicate words or phrases delimited by commas. Is there a way to do
> this with having to specify the words/phrases to search for. RegExp's maybe?
----- Excess quoted text cut - see Original Post for more -----
first
> occurrence of the word by some string unlikely to be found in the text,
> like say "%*%*%*", then replace all remaining words by nothing, using "all"
> , then put back the word at the place %*%*%* is.
>
> If you don't have too many words, the solution is workable.

05/16/2004 03:14 PM
Author: Claude Schneegans
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163305

>> What I'm attempting to do is stitch together a number of different strings
>> to create a keyword list.

Ah ah! Now this is a bit different.
Hmmm, to do this, I would
1) remove all punctuation marks, CR, LF etc.,
2) consider the text as a space-delimited list,
3) in a loop, create a new list by adding any word from the first list which is not already in the new one and which has more than 3 chars or so.

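The three steps above might be sketched as follows. This is only an illustration under stated assumptions: the sample text and the names `rawText`, `cleaned`, and `keywords` are made up here, and "punctuation marks, CR, LF etc." is interpreted as "anything that is not a letter or digit", using the same `[^[:alnum:]]` class that appears elsewhere in this thread.

```cfml
<!--- Illustrative input, not from the thread --->
<cfset rawText = "The parser reads the text, then the parser emits keywords.">

<!--- Step 1: turn every run of non-alphanumeric characters
      (punctuation, CR, LF, spaces) into a single space --->
<cfset cleaned = REReplace(rawText, "[^[:alnum:]]+", " ", "all")>

<!--- Steps 2 and 3: walk the text as a space-delimited list and keep
      each word that is longer than 3 characters and not yet in the new list --->
<cfset keywords = "">
<cfloop index="w" list="#cleaned#" delimiters=" ">
  <cfif Len(w) GT 3 AND NOT ListFindNoCase(keywords, w)>
    <cfset keywords = ListAppend(keywords, w)>
  </cfif>
</cfloop>

<!--- Outputs: parser,reads,text,then,emits,keywords --->
<cfoutput>#keywords#</cfoutput>
```

Note that treating every non-alphanumeric character as a separator also splits hyphenated words and contractions. Whether that is acceptable depends on the source text.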
05/16/2004 03:04 PM
Author: DougF
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163304

Thanks Claude,

Taking your suggestion I put this function together. My effort works in a simple way, but it is not a total solution to my problem...

What I'm attempting to do is stitch together a number of different strings to create a keyword list. After the strings are assembled I need to strip out any duplicate words or phrases delimited by commas. Is there a way to do this with having to specify the words/phrases to search for. RegExp's maybe?

Any suggestions?
-Doug

-----------------------
<cfscript>
function CleanKeywords(str){
  str = REReplaceNoCase(str,"^[^[:alnum:]]*", ""); // trims non-alphanumeric characters at front
  str = REReplaceNoCase(str,"[^[:alnum:]]*$", ""); // trims non-alphanumeric characters at rear
  str = ReplaceNoCase(str,"Word_1","1x1x1x1"); // replace first occurrence of word with placeholder word.
  str = ReplaceNoCase(str,"Word_2","2x2x2x2");
  str = ReplaceNoCase(str,"Word_3","3x3x3x3");
  // add additional words to find.
  str = ReplaceList(str,"Word_1,Word_2,Word_3",", , ,"); // delete additional occurrences of word.
  str = ReplaceNoCase(str,"1x1x1x1","Word_1"); // restore original word at placeholder word (must match the placeholder set above).
  str = ReplaceNoCase(str,"2x2x2x2","Word_2");
  str = ReplaceNoCase(str,"3x3x3x3","Word_3");
  // add additional words to restore.
  return str;
}
</cfscript>

> Hmmmm, I think I would use the replace function to first replace the first
> occurrence of the word by some string unlikely to be found in the text,
> like say "%*%*%*", then replace all remaining words by nothing, using "all"
> , then put back the word at the place %*%*%* is.
> If you don't have too many words, the solution is workable.

05/15/2004 11:29 PM
Author: Claude Schneegans
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163302

Hmmmm, I think I would use the replace function to first replace the first occurrence of the word by some string unlikely to be found in the text, like say "%*%*%*", then replace all remaining occurrences of the word by nothing, using "all", then put back the word at the place where %*%*%* is.

If you don't have too many words, the solution is workable.

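A rough sketch of the placeholder idea above, written as a CF5-style UDF that loops over a short word list. The function name `KeepFirstOccurrence` and its arguments are hypothetical, and the marker is assumed never to occur in the text or in any of the words.

```cfml
<cfscript>
// Hypothetical helper, not from the thread: keeps the first occurrence of
// each word in wordList (comma-delimited) and removes all later occurrences.
function KeepFirstOccurrence(str, wordList) {
  var i = 1;
  var word = "";
  var marker = "%*%*%*"; // assumed not to appear in str or in any word
  for (i = 1; i LTE ListLen(wordList); i = i + 1) {
    word = ListGetAt(wordList, i);
    str = ReplaceNoCase(str, word, marker, "one"); // protect the first occurrence
    str = ReplaceNoCase(str, word, "", "all");     // remove every remaining occurrence
    str = Replace(str, marker, word, "one");       // put the word back at the marker
  }
  return str;
}
</cfscript>

<cfoutput>#KeepFirstOccurrence("red, blue, red, green, blue", "red,blue")#</cfoutput>
```

Some caveats with this sketch: the restored word takes the casing used in `wordList`, not the casing found in the text. The match is on substrings, so a word such as "cat" would also be stripped out of "category". The separators that surrounded a removed word are left in place, so a cleanup pass for stray commas and spaces may still be needed.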
05/15/2004 10:38 PM
Author: Douglas Fentiman
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:32533#163300

Any suggestions on how to strip multiple occurrences of a short list (4-8) of words from a string? The first occurrence of each word must be preserved at its position.

Using CF5.

Thanks,
Doug