ColdFusion Talk (CF-Talk)
More Complicated RegEx Replace
Robert Harrison, 06/28/12 09:57 AM

This is the replace statement a regex guru gave me to wrap a variable found in a string in a span tag.

    #REReplaceNoCase(answer, '(#search_string#)', '<span class="keyword">\1</span>', 'all')#

It works great, but the variable contains HTML and it's also replacing stuff inside of HTML tags.

Example: if the string was "pool" I'm getting results like

    <a href="more_on_pools.cfm">
becomes
    <a href="more_on_<span class="keyword">pool</span>s.cfm">

or

    <img src="pool_picture.jpg">
becomes
    <img src="<span class="keyword">pool</span>_picture.jpg">

Is there anything I can do to EITHER NOT do the replace when it's part of an href or img src, OR UNDO the replace if it's inside an href or img src tag? Either would achieve the same result.

Any help is appreciated. This is truly beyond my regex skill level.

Thanks,
Robert

Robert Harrison
Director of Interactive Services
Austin & Williams
http://www.austin-williams.com


Claude_Schnéegans, 06/28/12 10:26 AM

I doubt you can achieve this just using regex. Regex are great for doing things, but not for "not doing".


Matt Quackenbush, 06/28/12 10:39 AM

I disagree with that statement. I don't have time at the moment to play with it, but you'll want to look at negative lookahead (e.g. (?!...)).

http://help.adobe.com/en_US/ColdFusion/9.0/Developing/WSc3ff6d0ea77859461172e0811cbec0a38f-7ffb.html#WSc3ff6d0ea77859461172e0811cbec0a38f-7fee

HTH

On Thu, Jun 28, 2012 at 9:26 AM, <> wrote:
> I doubt you can achieve this just using regex.
> Regex are great for doing things, but not for "not doing".


Peter Boughton, 07/02/12 09:12 AM

> This is the replace statement a regex guru gave me
> to wrap a variable found in a string in a span tag.

Not sure you can call them a "guru" when the only piece of regex used is a pair of parentheses which are entirely unnecessary. *shrug*

Here's a simpler version that does exactly the same thing:

    REReplaceNoCase
        ( answer
        , search_string
        , '<span class="keyword">\0</span>'
        , 'all'
        )

However, what that isn't doing is escaping potential regex metacharacters inside search_string (which could then result in unexpected behaviour).

If you can't guarantee there will not be any metacharacters present, you need to do this:

    REReplaceNoCase
        ( answer
        , search_string.replaceAll('[$^*()+\[\]{}.?\\|]','\\$0')
        , '<span class="keyword">\0</span>'
        , 'all'
        )

(Which prefixes the relevant characters with a backslash to escape them.)

Anyhow, as for your actual problem, regex is not a good tool for parsing HTML (which is what you're asking to be done by excluding tag attributes from matching).

What you need to do is use an HTML parsing library, such as jSoup, to isolate the text segments within HTML tags, and loop through performing your replace operation on each of those in turn (recursing down through any child tags as required).

Using jSoup, this can be achieved with the textNodes() method, to access the individual segments of text and child nodes:

http://jsoup.org/apidocs/org/jsoup/nodes/Element.html#textNodes()

If you're unfamiliar with using JARs in CF, Ben Nadel has a post on using jSoup with CF10:

http://www.bennadel.com/blog/2358-Parsing-Traversing-And-Mutating-HTML-With-ColdFusion-And-jSoup.htm
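
For anyone who wants to see the two regex ideas from this thread side by side, here is a minimal sketch in plain Java (CF runs on the JVM, and the replaceAll call in Peter's reply is already the Java String method). It combines the metacharacter escaping Peter describes with the negative lookahead Matt mentions. The class name KeywordHighlighter and the methods escapeMeta and highlight are hypothetical names chosen for illustration, and the lookahead (?![^<>]*>) is an assumption about how the lookahead could be applied; nobody in the thread posted it. It is not a substitute for the HTML parser approach: it only checks whether the next angle bracket after a match is a closing ">", so a stray ">" in ordinary text or inside an attribute value will confuse it.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only.
class KeywordHighlighter {

    // Same escaping idea as in Peter's reply: prefix each regex
    // metacharacter in the search string with a backslash.
    static String escapeMeta(String term) {
        return term.replaceAll("[$^*()+\\[\\]{}.?\\\\|]", "\\\\$0");
    }

    // Wraps case-insensitive matches of term in a keyword span, but skips
    // any match where the next angle bracket after it is '>', which is
    // taken to mean the match sits inside a tag. This is a heuristic.
    static String highlight(String html, String term) {
        if (term.isEmpty()) {
            return html; // nothing to search for
        }
        Pattern p = Pattern.compile(
                escapeMeta(term) + "(?![^<>]*>)",
                Pattern.CASE_INSENSITIVE);
        Matcher m = p.matcher(html);
        // $0 is the whole match, like \0 in the CF replacement string.
        return m.replaceAll("<span class=\"keyword\">$0</span>");
    }

    public static void main(String[] args) {
        String answer = "<a href=\"more_on_pools.cfm\">Pools</a> and a pool "
                + "<img src=\"pool_picture.jpg\">";
        System.out.println(highlight(answer, "pool"));
    }
}
```

Run against Robert's examples, the href and img src values are left alone and only the visible text is wrapped. If the HTML can contain ">" outside of tags, the jSoup route Peter describes is the safer choice.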
May 23, 2013