House of Fusion
ColdFusion Talk (CF-Talk) mailing list

Extracting an email address out of a text file

Author:
Ben Densmore
05/21/2004 11:14 AM

I have about 1,800 text files, exported from Outlook, that are all bounced emails, and I need to extract the addresses that bounced. Unfortunately, not all mail servers format their bounce messages the same way.

I was thinking I could read through each text file and find the email address, but many of the files contain three addresses: the original sender, the server that bounced the message, and the original recipient. The files are only consistent when they were bounced by the same mail server software.

So my plan is to loop through the files that share a format, e.g. those where the bad address looks like (someemail@somedomain.com), and strip out everything but the email address. That's the only method I can come up with at the moment. Has anyone done something similar, or does anyone have a better solution? I really don't want to go through each text file manually and copy and paste the email address.

Thanks,
Ben
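The format-by-format plan described in this post can be sketched roughly as follows. This is a Python illustration rather than CFML, and the names `FORMAT_PATTERNS`, `bounced_address` and `scan_folder` are hypothetical. Only the "(address)" pattern comes from the post; the assumption is that further patterns would be added as new bounce formats turn up.

```python
import re
from pathlib import Path

# Hypothetical per-format patterns; each captures the bounced address in group 1.
# Only the "(address)" form is taken from the post.
FORMAT_PATTERNS = [
    re.compile(r"\(([\w.+-]+@[\w-]+(?:\.[\w-]+)+)\)"),
]

def bounced_address(text, patterns=FORMAT_PATTERNS):
    """Return the address captured by the first pattern that matches, else None."""
    for pattern in patterns:
        match = pattern.search(text)
        if match:
            return match.group(1)
    return None

def scan_folder(folder):
    """Split a folder of exported .txt bounces into found addresses and leftovers."""
    found, unmatched = {}, []
    for path in sorted(Path(folder).glob("*.txt")):
        # errors="replace" keeps odd bytes in exported mail from aborting the scan.
        text = path.read_text(errors="replace")
        address = bounced_address(text)
        if address:
            found[path.name] = address
        else:
            unmatched.append(path.name)
    return found, unmatched
```

Files that land in `unmatched` are the ones whose format has no pattern yet, so they either get a new pattern or are handled by hand.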

Author:
Claude Schneegans
05/21/2004 11:21 AM

>> I really don't want to manually go through each text file and copy and paste the email

For this I would try CF_REextract to extract the address from its context. The address is variable, but the context is not.

See CF_REextract at http://www.contentbox.com/claude/customtags/tagstore.cfm

--
_______________________________________
See some cool custom tags here: http://www.contentbox.com/claude/customtags/tagstore.cfm
Please send any spam to this address: piegeacon@internetique.com
Thanks.

Author:
Bryan F. Hogan
05/21/2004 11:23 AM

I can vouch for cf_reextract. I use it as a spider for a web cvs I wrote and it works wonderfully.

Author:
Jochem van Dieten
05/21/2004 11:52 AM

Working through the files one bounce format at a time is indeed the way to go. In some cases it is easy. For instance, to get all AOL bounces you can do a simple regex replace on your email:

    REReplaceNoCase(email,"^.*----- The following addresses had permanent fatal errors -----(.+)----- Transcript of session follows -----.*$","\1")

Simple patterns like these should find most of the addresses; just look for 5xx (permanent) error codes next to an email address. But it is a case of diminishing returns, and at some point it is easier to do the rest by hand.

Jochem
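As a rough illustration of both ideas in this reply, here is a Python sketch (not CFML) that first looks for the section between the two marker lines quoted above and, failing that, falls back to any address on a line that also carries a 5xx code. The function name `failed_addresses` and the address pattern are assumptions made for this example. The fallback is a heuristic that can produce false positives.

```python
import re

# Loose address pattern, assumed for this sketch; it is not a full RFC parser.
ADDRESS = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Section markers quoted in the post above.
SECTION = re.compile(
    r"----- The following addresses had permanent fatal errors -----"
    r"(.+?)"
    r"----- Transcript of session follows -----",
    re.IGNORECASE | re.DOTALL,  # DOTALL lets "." span the line breaks in between
)

# A standalone three-digit number starting with 5, e.g. "550".
PERMANENT_ERROR = re.compile(r"\b5\d\d\b")

def failed_addresses(email_text):
    """Return the addresses reported as permanently failed in one bounce message."""
    section = SECTION.search(email_text)
    if section:
        return ADDRESS.findall(section.group(1))
    # Fallback: any address on a line that also carries a 5xx code.
    hits = []
    for line in email_text.splitlines():
        if PERMANENT_ERROR.search(line):
            hits.extend(ADDRESS.findall(line))
    return hits
```

Messages for which this returns an empty list are the "rest" that may be quicker to handle by hand.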

Author:
Scott Weikert
05/21/2004 12:11 PM

There's also a tag on the DevEx, "StripEmail" - you pass it a string and it'll pull out email addresses and return them as a list. Works great for me.

Author:
Ben Densmore
05/21/2004 12:19 PM

Thanks Claude, I bought the tag and it seems to work OK. I'm wondering if someone can help me with a regex. Just playing around with the tag, I was trying:

    <CF_REextract
        INPUTMODE="File"
        INPUT="c:\cfusionmx\wwwroot\email_blast\#name#"
        OUTPUTMODE="output"
        RE1="([[:alnum:]_\.\-]+@([[:alnum:]_\.\-]))"
        RE2="\.+[[:alpha:]]{2,4}"
        >

I thought this would extract email addresses, but it doesn't seem to be working. Since I suck at regular expressions, I'm sure I'm doing something wrong with it. Can anyone modify this regex a little for me?

Thanks,
Ben
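One thing that stands out in the posted RE1 is that the character class after the "@" has no quantifier, so on its own it matches only a single character of the domain. The Python sketch below (not CFML) spells the POSIX classes out as explicit ranges, since Python's `re` module does not support `[[:alnum:]]`, and shows a single pattern covering the whole address. The names `RE1_AS_POSTED`, `EMAIL` and `find_emails` are made up for this example, and it says nothing about how CF_REextract itself uses RE1 and RE2.

```python
import re

# The posted RE1 with POSIX classes written out: the class after "@" has no
# quantifier, so it matches exactly one character of the domain.
RE1_AS_POSTED = re.compile(r"([A-Za-z0-9_.\-]+@([A-Za-z0-9_.\-]))")

# One pattern for the whole address: local part, "@", one or more
# dot-terminated domain labels, then a 2-4 letter top-level domain
# (the {2,4} limit is taken from the posted RE2).
EMAIL = re.compile(r"[A-Za-z0-9_.\-]+@(?:[A-Za-z0-9_\-]+\.)+[A-Za-z]{2,4}")

def find_emails(text):
    """Return every substring of text that looks like an email address."""
    return EMAIL.findall(text)

# The posted RE1 stops one character into the domain.
print(RE1_AS_POSTED.search("bad: someemail@somedomain.com").group(0))  # someemail@s
print(find_emails("bad: someemail@somedomain.com"))  # ['someemail@somedomain.com']
```

The `{2,4}` limit truncates longer top-level domains, so it may need widening for real data.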

Author:
Claude Schneegans
05/21/2004 12:38 PM

>> I bought the tag

Congratulations: you're on the right track! ;-)

>> if someone can help me with a reg ex. Just playing around with the tag I was trying:

Sure, try the REWizard: same author, same address, and this one is online and free. Make sure you work with the CF syntax, not the JavaScript one.

Author:
Ben Densmore
05/21/2004 02:04 PM

Thanks Claude. According to the REWizard my regular expression works, but in the REExtract tag it doesn't. Is there any way to not use the RE2 option?

Thanks,
Ben

Author:
Claude Schneegans
05/21/2004 02:29 PM

>> According to the REWizard my regular expression works

You mean RE1 works? Now what about RE2?

>> Is there any way to not use the RE2 option?

No, because without RE2 the tag does not know where the zone it has to extract stops. You have to make sure that both RE1 and RE2 recognize something in your file. Once they do, REExtract will return what's between the two occurrences.
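The start/stop behaviour described in this reply can be illustrated with a small Python sketch (not CFML). It mirrors only the description given here, with one pattern marking where the zone starts and a second marking where it stops. It is not the tag's actual implementation, and `extract_between` is a hypothetical name.

```python
import re

def extract_between(text, start_re, end_re):
    """Return each stretch of text between a start_re match and the next end_re match."""
    start = re.compile(start_re)
    end = re.compile(end_re)
    zones, pos = [], 0
    while pos <= len(text):
        opened = start.search(text, pos)
        if not opened:
            break
        closed = end.search(text, opened.end())
        if not closed:
            break  # a start with no matching stop yields nothing
        zones.append(text[opened.end():closed.start()])
        # Always move forward, even if both patterns matched the empty string.
        pos = max(closed.end(), pos + 1)
    return zones

# With the "(address)" format from the first post, the two patterns are just
# the parentheses around the address.
print(extract_between("user unknown (someemail@somedomain.com) rejected", r"\(", r"\)"))
# ['someemail@somedomain.com']
```

Under this reading, the two patterns describe the fixed context around the address rather than the address itself.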

