House of Fusion
Search over 2,500 ColdFusion resources here
  
Home of the ColdFusion Community

Mailing Lists
Home /  Groups /  ColdFusion Talk (CF-Talk)

CSV Upload Note

  << Previous Post |  RSS |  Sort Oldest First |  Sort Latest First |  Subscribe to this Group Next >> 
Top  |   Reply  |   Original Post  |   RSS Feed  |   Subscribe to this Group
Author:
Andy Ousterhout
06/27/2003 05:09 PM

I've tested using OBDC to upload CSV files into queries and have found that they do not handle the following situation properly: Small Heart, with ""Red" decorations is translated as Small Heart, with "Red and Small Heart, with PINK "decorations" becomes Small Heart, with PINK The following code handles this correctly: <!--- Object:   ParseCSV Purpose: Include to parse a CSV string named lstCSV and return in array named aryCSV Author:  Andy Ousterhout(ACO) Date:  3/3/2002 Version: V1.0 Usage: CSV files have fields in a record separated by commas, ",".  However, if a text field contains an embededd comma, the entire field is placed in double quotes.  For example, 00007,"Bear, Swimming",Test 00008,"Frank ""The Animal"" Hooter", testing 00009, Testing, "Frank ""The Animal"" Hooter" Should each result in 3 fields being parsed. Change Record: Who:  When:  Why: ---> <cfscript>   aryCSV = ArrayNew(1);   // Only do stuff if something passed ....   if (len(lstCSV) GT 0) {    // First, add a comma onto the end so that the search for ", and not "", always finds something unless bad CSV string.    // Then, replace any ",," with ", ," so that individual array elements will be found    lstWork = Trim(lstCSV) & ",";    lstWork = replace(lstWork, ",", ", ", "ALL");    // Then loop thru string, parsing off to next comma and checking if complete field    // Note that if the field has imbedded quotes, this routine will create bad data.    // For example, [This is a test, "test"] will come through as ["This is a test, ""test""]    // and will be parsed into two fields, [This is a test, "] and ["test"]. The calling routine will    // need to handle the possibility that more fields will be returned then expected.    Do {     // If Line starts with quote, next field is next quote+comma that is not part of a quote+quote+comma     if (left(lstWork, 1) EQ '"') {      tmpStr = Right(lstWork, len(lstWork)-1);      EndPos = REFInd('([^"]",) | ("*"",)', tmpStr);      // If nothing found, look for "", ending      if (EndPos EQ 0) {       EndPos=REFInd('([^"]",) | (["]*["",])', tmpStr);       EndPos=REFInd('("",)', Mid(tmpStr, EndPos+1, len(tmpStr)))+EndPos;      }      // If still nothing found, Error.  Stop work immediately      if (EndPos EQ 0) {       lstwork = "";       NextField = "";      }      Else {       NextField = mid(trim(lstWork), 2, EndPos);       if (len(lstWork)-(EndPos+2) EQ 0)        lstwork = "";       else        lstWork = ltrim(right(lstWork, len(lstWork)-(EndPos+3)));      }     }     else if (left(lstwork, 1) EQ ",") {      lstWork = ltrim(right(lstWork, len(lstWork)-1));      NextField = " ";     }     else {      NextField = Trim(ListFirst(lstWork, ","));      lstWork = LTrim(ListDeleteAt(lstWork, 1, ","));     }     //Append the new field, converting CSV's double quotes to single quotes     ArrayAppend(aryCSV, Replace(NextField, '""', '"', "all")); //writeoutput(",[#lstwork#], [#NextField#]");    }    while (len(lstWork) GT 1);   } </cfscript>

Top  |   Parent  |   Reply  |   Original Post  |   RSS Feed  |   Subscribe to this Group
Author:
jon hall
06/27/2003 06:22 PM

If you were using a the correct custom delimiter for your file ...that's quite a bug, and quite incredible that it has gone undetected for all these years. If you left the DSN on the default of CSV delimited...expect crazy things like that to happen, because the example you give is _not_ a valid csv file. -- jon mailto:jonhall@ozline.net Friday, June 27, 2003, 5:06:46 PM, you wrote: AO> I've tested using OBDC to upload CSV files into queries and have found that AO> they do not handle the following situation properly: AO> Small Heart, with ""Red" decorations AO>  is translated as AO> Small Heart, with "Red AO> and AO> Small Heart, with PINK "decorations" AO> becomes AO> Small Heart, with PINK AO> The following code handles this correctly: AO> <!--- AO> Object:   ParseCSV AO> Purpose: Include to parse a CSV string named lstCSV and return in array named AO> aryCSV AO> Author:  Andy Ousterhout(ACO) AO> Date:  3/3/2002 AO> Version: V1.0 AO> Usage: CSV files have fields in a record separated by commas, ",".   However, AO> if a text field contains an embededd comma, the entire field is placed in AO> double quotes.  For example, AO>  00007,"Bear, Swimming",Test AO>  00008,"Frank ""The Animal"" Hooter", testing AO>  00009, Testing, "Frank ""The Animal"" Hooter" AO>  Should each result in 3 fields being parsed. AO> Change Record: AO> Who:  When:  Why: AO>  ---> AO>  <cfscript> AO>   aryCSV = ArrayNew(1); AO>   // Only do stuff if something passed .... AO>   if (len(lstCSV) GT 0) { AO>    // First, add a comma onto the end so that the search for ", and not "", AO> always finds something unless bad CSV string. AO>    // Then, replace any ",," with ", ," so that individual array elements will AO> be found AO>    lstWork = Trim(lstCSV) & ","; AO>    lstWork = replace(lstWork, ",", ", ", "ALL"); AO>    // Then loop thru string, parsing off to next comma and checking if AO> complete field AO>    // Note that if the field has imbedded quotes, this routine will create bad AO> data. AO>    // For example, [This is a test, "test"] will come through as ["This is a AO> test, ""test""] AO>    // and will be parsed into two fields, [This is a test, "] and ["test"]. AO> The calling routine will AO>    // need to handle the possibility that more fields will be returned then AO> expected. AO>    Do { AO>     // If Line starts with quote, next field is next quote+comma that is not AO> part of a quote+quote+comma AO>     if (left(lstWork, 1) EQ '"') { AO>      tmpStr = Right(lstWork, len(lstWork)-1); AO>      EndPos = REFInd('([^"]",) | ("*"",)', tmpStr); AO>      // If nothing found, look for "", ending AO>      if (EndPos EQ 0) { AO>       EndPos=REFInd('([^"]",) | (["]*["",])', tmpStr); AO>       EndPos=REFInd('("",)', Mid(tmpStr, EndPos+1, len(tmpStr)))+EndPos; AO>      } AO>      // If still nothing found, Error.  Stop work immediately AO>      if (EndPos EQ 0) { AO>       lstwork = ""; AO>       NextField = ""; AO>      } AO>      Else { AO>       NextField = mid(trim(lstWork), 2, EndPos); AO>       if (len(lstWork)-(EndPos+2) EQ 0) AO>        lstwork = ""; AO>       else AO>        lstWork = ltrim(right(lstWork, len(lstWork)-(EndPos+3))); AO>      } AO>     } AO>     else if (left(lstwork, 1) EQ ",") { AO>      lstWork = ltrim(right(lstWork, len(lstWork)-1)); AO>      NextField = " "; AO>     } AO>     else { AO>      NextField = Trim(ListFirst(lstWork, ",")); AO>      lstWork = LTrim(ListDeleteAt(lstWork, 1, ",")); AO>     } AO>     //Append the new field, converting CSV's double quotes to single quotes AO>     ArrayAppend(aryCSV, Replace(NextField, '""', '"', "all")); AO> //writeoutput(",[#lstwork#], [#NextField#]"); AO>    } AO>    while (len(lstWork) GT 1); AO>   } AO>  </cfscript> AO>

Top  |   Parent  |   Reply  |   Original Post  |   RSS Feed  |   Subscribe to this Group
Author:
Andy Ousterhout
06/27/2003 07:03 PM

However, it is the CSV file format output by Peachtree, which is what I need to interface with and represents real data input by Users.  So I am stuck with my "work around".  I was hoping for an easier and faster solution. Andy If you were using a the correct custom delimiter for your file ...that's quite a bug, and quite incredible that it has gone undetected for all these years. If you left the DSN on the default of CSV delimited...expect crazy things like that to happen, because the example you give is _not_ a valid csv file. -- jon mailto:jonhall@ozline.net Friday, June 27, 2003, 5:06:46 PM, you wrote: AO> I've tested using OBDC to upload CSV files into queries and have found that AO> they do not handle the following situation properly: AO> Small Heart, with ""Red" decorations AO>  is translated as AO> Small Heart, with "Red AO> and AO> Small Heart, with PINK "decorations" AO> becomes AO> Small Heart, with PINK AO> The following code handles this correctly: AO> <!--- AO> Object:   ParseCSV AO> Purpose: Include to parse a CSV string named lstCSV and return in array named AO> aryCSV AO> Author:  Andy Ousterhout(ACO) AO> Date:  3/3/2002 AO> Version: V1.0 AO> Usage: CSV files have fields in a record separated by commas, ",". However, AO> if a text field contains an embededd comma, the entire field is placed in AO> double quotes.  For example, AO>  00007,"Bear, Swimming",Test AO>  00008,"Frank ""The Animal"" Hooter", testing AO>  00009, Testing, "Frank ""The Animal"" Hooter" AO>  Should each result in 3 fields being parsed. AO> Change Record: AO> Who:  When:  Why: AO>  ---> AO>  <cfscript> AO>   aryCSV = ArrayNew(1); AO>   // Only do stuff if something passed .... AO>   if (len(lstCSV) GT 0) { AO>    // First, add a comma onto the end so that the search for ", and not "", AO> always finds something unless bad CSV string. AO>    // Then, replace any ",," with ", ," so that individual array elements will AO> be found AO>    lstWork = Trim(lstCSV) & ","; AO>    lstWork = replace(lstWork, ",", ", ", "ALL"); AO>    // Then loop thru string, parsing off to next comma and checking if AO> complete field AO>    // Note that if the field has imbedded quotes, this routine will create bad AO> data. AO>    // For example, [This is a test, "test"] will come through as ["This is a AO> test, ""test""] AO>    // and will be parsed into two fields, [This is a test, "] and ["test"]. AO> The calling routine will AO>    // need to handle the possibility that more fields will be returned then AO> expected. AO>    Do { AO>     // If Line starts with quote, next field is next quote+comma that is not AO> part of a quote+quote+comma AO>     if (left(lstWork, 1) EQ '"') { AO>      tmpStr = Right(lstWork, len(lstWork)-1); AO>      EndPos = REFInd('([^"]",) | ("*"",)', tmpStr); AO>      // If nothing found, look for "", ending AO>      if (EndPos EQ 0) { AO>       EndPos=REFInd('([^"]",) | (["]*["",])', tmpStr); AO>       EndPos=REFInd('("",)', Mid(tmpStr, EndPos+1, len(tmpStr)))+EndPos; AO>      } AO>      // If still nothing found, Error.  Stop work immediately AO>      if (EndPos EQ 0) { AO>       lstwork = ""; AO>       NextField = ""; AO>      } AO>      Else { AO>       NextField = mid(trim(lstWork), 2, EndPos); AO>       if (len(lstWork)-(EndPos+2) EQ 0) AO>        lstwork = ""; AO>       else AO>        lstWork = ltrim(right(lstWork, len(lstWork)-(EndPos+3))); AO>      } AO>     } AO>     else if (left(lstwork, 1) EQ ",") { AO>      lstWork = ltrim(right(lstWork, len(lstWork)-1)); AO>      NextField = " "; AO>     } AO>     else { AO>      NextField = Trim(ListFirst(lstWork, ",")); AO>      lstWork = LTrim(ListDeleteAt(lstWork, 1, ",")); AO>     } AO>     //Append the new field, converting CSV's double quotes to single quotes AO>     ArrayAppend(aryCSV, Replace(NextField, '""', '"', "all")); AO> //writeoutput(",[#lstwork#], [#NextField#]"); AO>    } AO>    while (len(lstWork) GT 1); AO>   } AO>  </cfscript> AO>


<< Previous Thread Today's Threads Next Thread >>

Search cf-talk

May 24, 2012

<<   <   Today   >   >>
Su Mo Tu We Th Fr Sa
     1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31     

Designer, Developer and mobile workflow conference