UDF inspection, please!

Rick Root
11/28/2006 05:20 PM

Ben Nadel wrote:
>
> Looks good. Nice idea for a UDF by the way. The only concern I have is
> the .* for selecting the "rest" of a tag. I don't know off hand, but I
> think that by default "." does NOT match on line breaks (I could be way
> off here though). This will not allow for tags that wrap lines. You
> might want to try [^>]* instead. But, this will break if the ">" appears
> in one of the tag attribute values.

Actually, in ColdFusion regular expressions, "." does match *ANY* character, including linefeeds and other control chars, printable or otherwise. There is no "multi line mode" like in Perl regular expressions (or is it vi?) .. I just tested to verify that, and it does work for code like this:

This is a test <a   href=""><b>this site</b></a> is cool.

If you're stripping anchors, it *does* work.

> Also, I think it will break with nested comments (the non-greedy search
> would find: <!-- bla bhal bha <!-- more blah -->  which will leave the
> rest of the comment in place (I think). This is a huge issue with
> patterns. Just like handling nested quote attributes. I have not got a
> good handle on how to solve this problem in the least.

Excellent point. It would still strip unwanted tags inside the comment that came after the first closing comment -->, but obviously that's an undesirable effect.

I don't know how to solve it other than documenting that stripping comments might not provide desirable results.

I can actually think of a solution *NOT* using regular expressions that would probably work but might take a considerable amount of code :)

So, for now I'll just document =)  or just not strip comments.

Rick
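
For what it's worth, here is a rough sketch of what a non-regex approach to the nested comment problem might look like. It is written in Python rather than CFML purely for illustration, and the function name strip_comments is made up for this example; it is not part of the UDF under discussion. The idea is to scan the string once and keep a depth counter, so that text is only copied to the output while the scan is outside every comment. It assumes that an unclosed comment should swallow the rest of the input and that a stray --> outside any comment is ordinary text.

```python
def strip_comments(text: str) -> str:
    """Remove <!-- ... --> comments, treating nested openers as nested comments."""
    out = []      # characters that sit outside every comment
    depth = 0     # how many comments are currently open
    i = 0
    while i < len(text):
        if text.startswith("<!--", i):
            # Every opener increases the depth, even inside another comment.
            depth += 1
            i += 4
        elif depth > 0 and text.startswith("-->", i):
            # A closer only counts while at least one comment is open.
            depth -= 1
            i += 3
        else:
            # Copy the character only when no comment is open.
            if depth == 0:
                out.append(text[i])
            i += 1
    return "".join(out)
```

With the nested example from the quoted message, a non-greedy regex stops at the first -->, while the counter above keeps going until the outer comment is closed. The same loop should translate to a cfscript function using Mid() and Len(), though that port is untested here.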

