better http referer parser

Author:
Michael Dinowitz
08/02/2006 12:34 AM

I'm redoing my logging, and one of the things I have to do is slice up the http_referer into useful parts. I can do the operation in multiple lines with multiple expressions, but I'd rather do it in one. Here's the code:

<CFSET referer="http://www.googlecom.com/search?hl=en&lr=&q=html+javax+parse+plain+text+example&btnG=Search">
<CFSET qReferer=REFindNoCase('^https?://([^/]+)/?(.+?([&?][pq]=([^&]+)).*?|.+)?$',referer&'&',1,1)>

Positions:
2 = domain
3 = referer string
5 = search term

This will basically try to find a search term first, and if that totally fails (a non-Google/Yahoo referer) it'll fall through to the alternate. It just doesn't 'feel' as tight as it could be. I wanted to do

(.+?([&?][pq]=([^&]+))?.*?)?$

but it would never recognise the internal expression that would be the search. Is there something better I can do?

Thanks

Michael Dinowitz
President: House of Fusion     http://www.houseoffusion.com
Publisher: Fusion Authority     http://www.fusionauthority.com
Adobe Community Expert

Author:
Steve L
02/10/2007 08:20 PM

> I wanted to do (.+?([&?][pq]=([^&]+))?.*?)?$
> but it would never recognise the internal expression that would be the search.

These regexes have several issues, but the root cause of the problem with the regex above is that the inner group (intended to capture the search terms) is optional. Since the regex will successfully match even when the grouping is excluded, it does just that.

As far as improving the full regex's efficiency, you could do:

<cfset qReferer = REFindNoCase("^https?://([^/]+)/?([^?]+(?:[?&][pq]=([^&]+)|.)*)?$", referer, 1, TRUE)>

Positions:
2 = Domain
3 = Relative URL excluding the leading slash ("referer string")
4 = Search term

On a related note, and as a shameless plug, check out my parseUri UDF for splitting any URI: http://badassery.blogspot.com/2007/01/parsing-uris-in-coldfusion.html
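
For anyone who wants to see the difference between the two patterns, here is a rough sketch in Python rather than CFML. It assumes Python's `re` module behaves like ColdFusion's engine for these two patterns, which I have not checked against ColdFusion itself. The helper name `parse_referer` is made up for illustration. Python group numbers are one lower than the REFindNoCase positions listed above, because position 1 in ColdFusion is the whole match.

```python
import re

# Sample referer taken from the original post.
REFERER = ("http://www.googlecom.com/search?hl=en&lr=&q="
           "html+javax+parse+plain+text+example&btnG=Search")

# The attempt quoted in the thread: the search-term group is optional,
# so the lazy quantifiers around it can reach the end of the string
# without the group ever taking part in the match.
OPTIONAL_GROUP = re.compile(
    r"^https?://([^/]+)/?(.+?([&?][pq]=([^&]+))?.*?)?$", re.IGNORECASE)

# The pattern suggested in the reply: at every position the search-term
# branch is tried first, and "." only consumes a character when it fails.
ALTERNATION = re.compile(
    r"^https?://([^/]+)/?([^?]+(?:[?&][pq]=([^&]+)|.)*)?$", re.IGNORECASE)


def parse_referer(referer):
    """Hypothetical helper: return (domain, relative_url, search_term).

    search_term is None when no p= or q= parameter is found, and the
    whole result is None when the string is not an http(s) URL.
    """
    match = ALTERNATION.match(referer)
    if match is None:
        return None
    return match.group(1), match.group(2), match.group(3)


if __name__ == "__main__":
    # Group 4 is the search term in the optional-group attempt.
    print(OPTIONAL_GROUP.match(REFERER).group(4))
    print(parse_referer(REFERER))
    print(parse_referer("http://www.houseoffusion.com/groups/regex/"))
```

With the sample referer, the optional-group pattern still matches but leaves its search-term group empty (`None` in Python), while the alternation pattern captures `html+javax+parse+plain+text+example`. One caveat: the capture survives the later `.` iterations in Python, but some engines (JavaScript, for example) reset captures on each repetition, so test this in the engine you actually deploy on.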

