Regular Expressions (RegEx)
better http referer parser
Michael Dinowitz 08/02/06 12:34 AM

I'm redoing my logging and one of the things I have to do is slice up the http_referer into useful parts. I can do the operation in multiple lines and multiple expressions, but I'd rather do it in one. Here's the code:

<CFSET referer="http://www.googlecom.com/search?hl=en&lr=&q=html+javax+parse+plain+text+example&btnG=Search">
<CFSET qReferer=REFindNoCase('^https?://([^/]+)/?(.+?([&?][pq]=([^&]+)).*?|.+)?$',referer&'&',1,1)>

Positions:
2 = domain
3 = referer string
5 = search term

This will basically try to find a search term first and, if that totally fails (a non-Google/Yahoo referer), it'll fall back to the alternate. This just doesn't 'feel' as tight as it could be. I wanted to do

(.+?([&?][pq]=([^&]+))?.*?)?$

but it would never recognise the internal expression that would be the search. Is there something better I can do?

Thanks

Michael Dinowitz
President: House of Fusion http://www.houseoffusion.com
Publisher: Fusion Authority http://www.fusionauthority.com
Adobe Community Expert

Steve L 02/10/07 08:20 PM

> I wanted to do (.+?([&?][pq]=([^&]+))?.*?)?$
> but it would never recognise the internal expression that would be the search.

These regexes have several issues, but the root cause of the problem with the regex above is that the inner group (intended to capture the search terms) is optional. Since the regex will successfully match even when that group is skipped, it does just that: the lazy .+? takes a single character, the optional group fails at that position and is skipped, and the trailing .*? then expands to the end of the string.

As far as improving the full regex's efficiency goes, you could do:

<cfset qReferer = REFindNoCase("^https?://([^/]+)/?([^?]+(?:[?&][pq]=([^&]+)|.)*)?$", referer, 1, TRUE)>

Positions:
2 = Domain
3 = Relative URL excluding the leading slash ("referer string")
4 = Search term

On a related note, and as a shameless plug, check out my parseUri UDF for splitting any URI: http://badassery.blogspot.com/2007/01/parsing-uris-in-coldfusion.html
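
To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the sample referer from the thread and pulls the pieces out of the pos/len arrays. It is an illustration under stated assumptions, not code from either poster: getSubexpression, optionalPattern and suggestedPattern are hypothetical names, and the "optional group" pattern assumes the same ^https?://([^/]+)/? prefix as the original, since only its tail was quoted.

```cfm
<!--- Hypothetical helper: returns the text of one subexpression, or "" if it did not take part --->
<cffunction name="getSubexpression" returntype="string" output="false">
    <cfargument name="subject" type="string" required="true">
    <cfargument name="result" type="struct" required="true">
    <cfargument name="position" type="numeric" required="true">
    <!--- A position of 0 means "no match"; Mid() does not accept a start of 0, so guard first --->
    <cfif ArrayLen(arguments.result.pos) GTE arguments.position
          AND arguments.result.pos[arguments.position] GT 0>
        <cfreturn Mid(arguments.subject,
                      arguments.result.pos[arguments.position],
                      arguments.result.len[arguments.position])>
    </cfif>
    <cfreturn "">
</cffunction>

<!--- Sample referer taken verbatim from the original post --->
<cfset referer = "http://www.googlecom.com/search?hl=en&lr=&q=html+javax+parse+plain+text+example&btnG=Search">

<!--- Assumed full form of the pattern with the optional search-term group --->
<cfset optionalPattern = "^https?://([^/]+)/?(.+?([&?][pq]=([^&]+))?.*?)?$">
<!--- The pattern suggested in the reply --->
<cfset suggestedPattern = "^https?://([^/]+)/?([^?]+(?:[?&][pq]=([^&]+)|.)*)?$">

<!--- The fourth argument asks for the pos/len arrays; element 1 is the whole match --->
<cfset optionalMatch = REFindNoCase(optionalPattern, referer, 1, true)>
<cfset suggestedMatch = REFindNoCase(suggestedPattern, referer, 1, true)>

<cfoutput>
Optional group, search term (position 5): [#HTMLEditFormat(getSubexpression(referer, optionalMatch, 5))#]<br>
Suggested pattern, domain (position 2): [#HTMLEditFormat(getSubexpression(referer, suggestedMatch, 2))#]<br>
Suggested pattern, relative URL (position 3): [#HTMLEditFormat(getSubexpression(referer, suggestedMatch, 3))#]<br>
Suggested pattern, search term (position 4): [#HTMLEditFormat(getSubexpression(referer, suggestedMatch, 4))#]
</cfoutput>
```

If the engine behaves as described in the thread, the first line should show empty brackets (the optional group never takes part), while position 4 of the suggested pattern should hold the q= value. The search term comes back still URL-encoded, so it would need decoding before being logged as readable text.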
May 25, 2012