House of Fusion
CSV Generation MEMORY SUCK

Author:
Rick Root
06/02/2008 10:05 AM

So I've got a problem with generating large CSV files: it's a memory suck.

I do this in an event gateway so that these file drops are generated "in the background". Here's the gateway code: http://cfm.pastebin.org/40043

The larger the file drop, the worse the memory suck is. A relatively small drop of about 7200 rows and 138 columns (roughly 1 million pieces of data) took 68 seconds. In my production environment, I've estimated that I can generate between 15,000 and 20,000 pieces of data per second using the code above.

The problem is that this drop (which only generates a 5MB file) causes a memory suck of about 100MB. Take a look at this output from the server monitor: www.it.dev.duke.edu/public/temp.rtf

It shows the memory graph generated from two file drops, at 9:38 and 9:45 am. The first one spiked the memory from 70MB to 170MB; the second one dropped the memory back to about 90MB and then spiked it to 140MB.

Of course, a drop of this size is not what concerns me. It's when people are dropping 10x that amount, say 80,000 rows at 130 columns. That's over 10 million pieces of data, which would take nearly 9 minutes ASSUMING you had no memory issues, which I would. Such a drop would basically crash the instance.

-- Rick Root
New Brian Vander Ark Album, songs in the music player and cool behind the scenes video at www.myspace.com/brianvanderark

Author:
Mark Kruger
06/02/2008 10:15 AM

Rick,

What's your DB platform? Are you sure there is not a better "non-cf" way to do it?

Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com

Author:
Rick Root
06/02/2008 10:34 AM

SQL Server 2005.

I'm open to suggestion. This is part of an application that allows users to generate CSV files of their own based on their own criteria, so though I'm open to "non-CF" solutions, I'm not sure there really would be any way except maybe a homegrown Java class to handle the work and be more strict with memory consumption.

Rick

----- Excess quoted text cut - see Original Post for more -----

Author:
Phillip Duba
06/02/2008 10:45 AM

I know when I had to do this at a previous job I used ArrayAppend to build each line in the CSV, but I see you are using the string buffer. I had no performance diffs at the time, so I just stayed with the CF solution.

The one thing I would look at is not using list functions, but instead using Array functions and then one ArrayToList at the end. Also, make sure the queries being executed aren't intensive either. I found in our CSV generation, for every second the query took, it took one second on output, so my resources were essentially 50/50 between query and output.

Phil

----- Excess quoted text cut - see Original Post for more -----

Author:
Gaulin, Mark
06/02/2008 10:52 AM

We had a similar issue with an internal application and we made good improvements using the standard Java classes PrintStream, FileOutputStream, and BufferedOutputStream to handle the writing to your file. Something like this (shown in Java, so you'd have to wrap it properly with cfscript):

    PrintStream out = new PrintStream(new BufferedOutputStream(new FileOutputStream("filename", true), bufferSize)); // pick a decent-sized buffer, maybe 100k to start

This will let you do "out.print(...)" and "out.println(...)" from CF, which I think will be much more efficient than what CF can do. Be sure to do "out.close()" within the page (so use try/catch to be sure that the close happens).

Also, test your page with the user's query but with the output part (actually writing the file) commented out. If the page is still slow and a huge memory hog then the file stuff above won't help much and you'll have to look at running the query in Java too, but I bet you'll get something by handling the file better.

Thanks
Mark
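The PrintStream suggestion in this post could be sketched roughly as follows. This is only an illustration: the class name `BufferedCsvSink` and its two methods are hypothetical, and the explicit CRLF line ending is an assumption, not something the thread specifies.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.PrintStream;
import java.util.List;

/** Hypothetical helper: streams CSV lines through one buffered PrintStream. */
final class BufferedCsvSink {

    /** Writes each line followed by CRLF; the stream is closed when the block exits. */
    static void writeLines(OutputStream target, List<String> lines, int bufferSize) {
        // try-with-resources guarantees close(), which also flushes the buffer.
        try (PrintStream out = new PrintStream(new BufferedOutputStream(target, bufferSize))) {
            for (String line : lines) {
                out.print(line);
                out.print("\r\n"); // assumed line ending
            }
        }
    }

    /** Same thing against a file opened in append mode, as in the one-liner in the post. */
    static void appendToFile(String fileName, List<String> lines, int bufferSize) throws IOException {
        try (OutputStream file = new FileOutputStream(fileName, true)) {
            writeLines(file, lines, bufferSize);
        }
    }
}
```

Only a buffer's worth of data is held in memory at a time, so the full file never has to exist as one string. Note that PrintStream does not throw on write errors; `checkError()` would have to be polled if that matters.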

Author:
Rick Root
06/02/2008 11:20 AM

> Also, test your page with the user's query but with the output part
> (actually writing the file) commented out... If the page is still slow
> and a huge memory hog then the file stuff above won't help much and
> you'll have to look at running the query in java too, but I bet you'll
> get something by handling the file better.

I honestly don't think it's the file writing that's the problem.

I just commented out the fileWrite() statements inside the <cfloop> tags that would write each line to the file (the file still being opened with the "header" row being written to it), and it made zero difference at all in the length of time it took to complete.

The query itself runs quite fast. It returns a lot of rows but isn't a complex query.

Anyway, I put some cfoutput statements in my gateway (I'm calling it as a direct CFC call, not a gateway, for testing) that output now().gettime() to see the ms as the method call progresses.

The query returns its results in less than 4 seconds. The process of generating the csv (around line 330-340 of the sample code I posted earlier) took 62 of the 68 seconds. And that was without actually WRITING the file.

The GOOD news is that a TAB delimited file takes considerably less time (32 seconds vs. 68 seconds), cutting the "output time" from 62 seconds to 26 seconds. Which means the csvFormat() function is taking up a very large part of the processing time.

This is my csvFormat() function:

<cffunction name="csvFormat" output="false" access="public" returnType="string">
  <cfargument name="str" type="string" required="yes">
  <!--- quote non-empty, non-numeric values and double any embedded quotes --->
  <cfif arguments.str neq "" and not isNumeric(arguments.str)>
    <cfreturn "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#">
  <cfelse>
    <cfreturn arguments.str>
  </cfif>
</cffunction>

Author:
Gerald Guido
06/02/2008 11:41 AM

>> "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#"

There is your bottleneck. CF does not like string manipulation on a large scale. I have tried to parse large text files before, only to watch my dev box just keel over.

I see two options off the top of my head: let SQL Server do the work, or use Java for the string manipulation. Last time I had to parse a large text file like this I ended up writing an ActiveX script for DTS (long time ago).

G

----- Excess quoted text cut - see Original Post for more -----

Author:
Rick Root
06/02/2008 12:29 PM

----- Excess quoted text cut - see Original Post for more -----

This is only being done on a per-field basis, so it's not a manipulation being done on a large scale. At least not on a large string. It's being done on the individual fields. I suspect that the largest string of data being dealt with by the csvFormat() function is 50 characters.

-- Rick Root

Author:
Tom Chiverton
06/02/2008 11:44 AM

> generating the csv (around line 330-340 of the sample code I posted
> earlier) took 62 of the 68 seconds.

Why not output the file all at once, rather than a line at a time (scrap lines ~336 and just keep .append()'ing to a StringBuffer till you're done)? Also, have you benchmarked the EnquireLookupCFC (line 333)? What does that do?

-- Tom Chiverton

**************************************************** This email is sent for and on behalf of Halliwells LLP. Halliwells LLP is a limited liability partnership registered in England and Wales under registered number OC307980 whose registered office address is at Halliwells LLP, 3 Hardman Square, Spinningfields, Manchester, M3 3EB.  A list of members is available for inspection at the registered office. Any reference to a partner in relation to Halliwells LLP means a member of Halliwells LLP.  Regulated by The Solicitors Regulation Authority. CONFIDENTIALITY This email is intended only for the use of the addressee named above and may be confidential or legally privileged.  If you are not the addressee you must not read it and must not use any information contained in nor copy it nor inform any person other than Halliwells LLP or the addressee of its existence or contents.   If you have received this email in error please delete it and notify Halliwells LLP IT Department on 0870 365 2500. For more information about Halliwells LLP visit www.halliwells.com.

Author:
Rick Root
06/02/2008 12:30 PM

----- Excess quoted text cut - see Original Post for more ----- I could do that but as I mentioned, the file writing is not the problem.

Author:
Mark Kruger
06/02/2008 03:02 PM

Don't forget to turn off debugging (or remove the 127.0.0.1 IP).

-mark

Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com

Author:
Mark Kruger
06/02/2008 03:01 PM

Rick,

So... the file is a selection of columns and filter criteria, right? It always varies per user... You're right, it's a sticky problem :)

-Mark

Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com

Author:
Brian Kotek
06/02/2008 12:09 PM

Use a Java StringBuffer or StringBuilder. Concatenating large strings in CF is always a memory hog because every single concatenation creates a new String instance. Check RIAForge, there are CFC libraries that wrap using these Java classes for exactly this purpose. You'll find memory usage drops dramatically. ----- Excess quoted text cut - see Original Post for more -----

Author:
Brian Kotek
06/02/2008 12:36 PM

No, I opened it and saw that it was 400 lines long and didn't have time to go through it all. But sweeping through it quickly, the same advice applies. The difference is that you have to use the StringBuffer for everything. Since you aren't passing the StringBuffer into the CSVFormat method and I don't see the code for that method, I assume it is still suffering from the creation of large numbers of String instances. Try passing the StringBuffer into CSVFormat and use it within the method to append the data. ----- Excess quoted text cut - see Original Post for more -----
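A rough Java sketch of that idea, with the quoting rules of the csvFormat() function posted earlier in the thread applied directly to a shared buffer so no intermediate strings are built per field. The class and method names are hypothetical, StringBuilder is used in place of StringBuffer on the assumption that only one thread touches the buffer, and the simple decimal pattern is only an approximation of ColdFusion's isNumeric(), which accepts more formats.

```java
import java.util.regex.Pattern;

/** Hypothetical Java counterpart of csvFormat() that appends into a shared buffer. */
final class CsvFieldAppender {

    // Assumption: a plain decimal pattern stands in for ColdFusion's broader isNumeric().
    private static final Pattern NUMERIC = Pattern.compile("-?\\d+(\\.\\d+)?");

    /** Appends one CSV field to sb, quoting it unless it is empty or numeric. */
    static void appendField(StringBuilder sb, String value) {
        if (value.isEmpty() || NUMERIC.matcher(value).matches()) {
            sb.append(value);
            return;
        }
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                sb.append('"'); // double each embedded quote
            }
            sb.append(c);
        }
        sb.append('"');
    }

    /** Appends a whole record: fields separated by commas, terminated by CRLF. */
    static void appendRecord(StringBuilder sb, String[] fields) {
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                sb.append(',');
            }
            appendField(sb, fields[i]);
        }
        sb.append("\r\n");
    }
}
```

From CF the buffer would be created once with createObject("java", "java.lang.StringBuilder") and passed in for every field; whether that beats a dedicated CSV library is something to measure rather than assume.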

Author:
Rick Root
06/02/2008 12:39 PM

> The difference is that you have to use the StringBuffer for everything.
> Since you aren't passing the StringBuffer into the CSVFormat method and I
> don't see the code for that method, I assume it is still suffering from the
> creation of large numbers of String instances. Try passing the StringBuffer
> into CSVFormat and use it within the method to append the data.

Now *THAT* I hadn't thought of. Lemme give that a whirl.

Rick

Author:
Rick Root
06/02/2008 01:30 PM

I found a nice little Java class library called JavaCSV that handles all the file writing and dropped my time from 68 seconds to 18 seconds. That has potential!

It basically handles the writing of delimiters and the proper CSV formatting, so here's my code:

<cfset var fileOutput = createObject("java","com.csvreader.CsvWriter")>
<cfset fileOutput.init("#expandPath("..")#\drops\#filename#")>
<cfloop query="resultSet">
  <!--- write record --->
  <cfloop from="1" to="#numFields#" index="i" step="1">
    <cfset fileOutput.write( resultSet[fieldsArray[i]][resultSet.currentRow].toString() )>
  </cfloop>
  <!--- write end of record --->
  <cfset fileOutput.endRecord()>
</cfloop>
<cfset fileOutput.close()>

Author:
Gerald Guido
06/02/2008 01:46 PM

>> dropped my time from 68 seconds to 18 seconds

Nice. Is that the entirety of the code sans the query?

>> little java class library called JavaCSV that handles

The one from SourceForge? I am going to need something like this shortly.

G

-- "The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them." - Sir William Bragg

Author:
Rick Root
06/02/2008 02:36 PM

> >> dropped my time from 68 seconds to 18 seconds > > Nice. Is that entirety of the code sans the query? Entirety.  The query itself takes about 4 seconds to execute and return all its data. ----- Excess quoted text cut - see Original Post for more ----- Yeah, that's the one. Rick

Author:
Tom Chiverton
06/03/2008 04:35 AM

> I found a nice little java class library called JavaCSV that handles all
> the file writing and dropped my time from 68 seconds to 18 seconds.  That
> has potential!

Why CF can't translate '&' to a StringBuffer append I'll never know...

-- Tom Chiverton

Author:
Brian Kotek
06/03/2008 11:13 AM

Probably because it can't know if that's what you actually want to do. We probably need a new function StringAppend or something that would be able to do this. Might be time to hit the wish list! ;-) ----- Excess quoted text cut - see Original Post for more -----

Author:
Tom Chiverton
06/03/2008 11:32 AM

> Probably because it can't know if that's what you actually want to do. We
> probably need a new function StringAppend or something that would be able
> to do this. Might be time to hit the wish list! ;-)

I'm leaving for Scotch on the Rocks in ~12 hours, where a fair chunk of the CF9 team are hosting a BOF session :-)

-- Tom Chiverton

Author:
Wil Genovese
06/03/2008 12:23 PM

> where a fair chunk of the CF9 team are hosting a BOF session :-)

That was a fun and raucous BOF session at CF.Objective()!

Wil Genovese

Author:
Brad Wood
06/03/2008 11:18 AM

I wonder what Java string objects are used when you create a large string by outputting inside a cfsavecontent. I'm sure ColdFusion implements strings the way it does because it was found to be the most efficient method for the majority of programming needs.

~Brad

Author:
Brian Kotek
06/03/2008 12:12 PM

Building up strings in cfsavecontent also concatenates to the result variable so the problem is the same. ----- Excess quoted text cut - see Original Post for more -----

Author:
Brad Wood
06/03/2008 12:16 PM

Good to know. What is your source of this information?

~Brad

Author:
Brian Kotek
06/03/2008 04:34 PM

Just experience, since I've tried all three options (concatenation, cfsavecontent, and StringBuffer) and have had the first two generate out of memory errors while the StringBuffer worked correctly. So while cfsavecontent may indeed be faster and use less memory, I'm still pretty sure that the StringBuffer approach is better especially for very large strings (these were 100+ megabyte CSV files). ----- Excess quoted text cut - see Original Post for more -----

Author:
Gerald Guido
06/03/2008 05:12 PM

Just a 411: I found a nice little tute on generating CSVs using the StringBuffer class in ColdFusion:
http://www.stillnetstudios.com/2007/03/07/java-strings-in-coldfusion/

Author:
Brad Wood
06/03/2008 12:45 PM

Hmm, I don't think you are correct, Brian. I just whipped up a test of string concatenation. Please spare the "proper load test" flames. This is NOT a load test-- it is intended to make a process run long enough to capture a thread stack. Actually, in the context of large file generations I would call it quite appropriate.

Concatenating "Hello World. " together 30 thousand times on CF 7.0.2 Ent (Win) shows a vast difference between using & and simply outputting it inside a cfsavecontent. The & definitely spent all its time doing a java.lang.String.concat(). The cfsavecontent not only executed 211 times faster, I didn't see a single String.concat() happening.

Here are the results:

And here is the code:

<cfset string1 = "">
<cftimer label="string & string" type="outline">
  <cfloop from="1" to="30000" index="i">
    <cfset string1 = string1 & "Hello World. ">
  </cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput>

<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
  <cfsavecontent variable="string2">
    <cfloop from="1" to="30000" index="i">
      <cfoutput>Hello World. </cfoutput>
    </cfloop>
  </cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput>

Author:
Brad Wood
06/03/2008 01:08 PM

Update: I experienced the same behavior on CF 8 JVM 1.6 (Win). Well, almost the same-- the CF8 server was actually faster overall. I would like to point out it is actually a slower server too!

string & string: 9141ms
String Length: 390000
cfsavecontent: 31ms
String Length: 390000

> Here are the results:
> string & string: 17093ms
> String Length: 390000
> cfsavecontent: 125ms
> String Length: 390000

Author:
Gerald Guido
06/03/2008 01:12 PM

Did you compare the memory usage by chance? G ----- Excess quoted text cut - see Original Post for more -----

Author:
Brad Wood
06/03/2008 01:26 PM

No, but I would like to. The problem is I'm not sure how to get any exact numbers. I have SeeFusion installed which will tell me the overall heap size of my JVM, but it might be difficult to nail down how much was used by one thread.

Alternatively, there are the totalMemory() and maxMemory() methods in the runtime object available in java.lang.Runtime. I could run that before and after the code ran, but I'm not sure how garbage collection would affect that. Suggestions?

~Brad
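One way the before/after idea could look in plain Java, as a hedged sketch: `HeapProbe` and its methods are hypothetical names. Used heap is taken as totalMemory() minus freeMemory(), both read fresh each time because the heap can grow between readings. Runtime.gc() is only a request, so the numbers stay approximate and include whatever other threads allocate in the meantime.

```java
/** Hypothetical probe for a rough before/after heap measurement. */
final class HeapProbe {

    /** Used heap in bytes; totalMemory() is re-read on every call because the heap can grow. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    /** Runs the task and returns the change in used heap, in bytes (may be negative). */
    static long measure(Runnable task) {
        Runtime.getRuntime().gc(); // only a hint; the JVM may ignore it
        long before = usedBytes();
        task.run();
        return usedBytes() - before;
    }
}
```

A collection that happens to run inside the task can make the delta smaller or even negative, which is why repeated runs and a restart between trials are worth the trouble.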

Author:
Brad Wood
06/03/2008 01:57 PM

Ok, here are my memory usage stats on CF 7. Someone please correct me if my code is wrong. It's a little messy, and I apologize for that.

Memory Before: 83 Megs
string & string: 52795ms
String Length: 650000
Memory After: 101 Megs -- Increase of 17 Megs

Memory Before: 85 Megs
cfsavecontent: 172ms
String Length: 650000
Memory After: 97 Megs -- Increase of 12 Megs

As you can see, cfsavecontent used about 1/3 less memory than the other method. Not nearly as proportional a savings as the execution time...

Wow-- here are the numbers from my CF8 box:

Memory Before: 161 Megs
string & string: 26530ms
String Length: 650000
Memory After: 195 Megs -- Increase of 35 Megs

Memory Before: 158 Megs
cfsavecontent: 47ms
String Length: 650000
Memory After: 165 Megs -- Increase of 7 Megs

This time the cfsavecontent used 4/5ths less memory! Very interesting indeed... Of course, please understand there are many factors and JVM settings that go into this. I'm not trying to claim everyone else will get results like this.

Here is the latest version of my (slightly sloppy) code:

<cfset runtime = CreateObject("java", "java.lang.Runtime").getRuntime()>
<!--- total_memory is read once here; if the heap grows later, the figures below are only approximate --->
<cfset total_memory = runtime.totalMemory()>

<cfset runtime.gc()>
<cfset memory_before = (total_memory-runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory Before: #round(memory_before)# Megs<br>
<cfset string1 = "">
<cftimer label="string & string" type="outline">
  <cfloop from="1" to="50000" index="i">
    <cfset string1 = string1 & "Hello World. ">
  </cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput><br>
<cfset memory_after = (total_memory-runtime.freeMemory()) / 1024 / 1024>
Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br>
<br />
</cfoutput>

<cfset runtime.gc()>
<cfset memory_before = (total_memory-runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory Before: #round(memory_before)# Megs<br></cfoutput>
<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
  <cfsavecontent variable="string2">
    <cfloop from="1" to="50000" index="i">
      <cfoutput>Hello World. </cfoutput>
    </cfloop>
  </cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput><br>
<cfset memory_after = (total_memory-runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br></cfoutput>

Author:
Gerald Guido
06/03/2008 02:45 PM

I did a million loops - I don't know what possessed me to do that. Memory was "measured" using Task Manager. Totally unscientific. I did a restart on the service before each trial.

CF 8 Developer
2 gig RAM
Java v. 1.6.0_01

cfsavecontent
2281 ms
192,356 k start
260,872 k after
68,516 k difference

concatenation using cfset
timed out after 15 min
192,362 k start
580,952 k after
388,590 k difference

I find this very interesting in that my totally unscientific thought process was: since cfsavecontent is so damn easy to use it *must* be resource intensive.

G

Author:
Rick Root
06/03/2008 03:16 PM

Wow, I just came back to this thread. REALLY makes me wonder how they're handling cfsavecontent!

Author:
Brad Wood
06/03/2008 03:22 PM

> timed out after 15 min

lol

> I find this very interesting in that my totally unscientific thought process
> was: Since cfsavecontent is so damn easy to use it *must* be resource
> intensive.

Go figure-- ColdFusion strikes again.

If string.concat() creates a brand new string object in memory to hold the combination of the two original strings, then given a consistently average sized string being added each time I would expect memory consumption to be exponential. For instance, if you concatenated a string 1 byte in size 10 times over, you would consume 55 bytes of memory. 1+2+3+4+5+6+7+8+9+10=55

For the life of me I can't figure out what the Big-O notation would be for that though...

~Brad
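The sum written out above is a triangular number: appending a piece of length k a total of n times copies k(1 + 2 + ... + n) = k * n(n + 1) / 2 characters, so the cost grows quadratically, O(n^2), rather than exponentially. The sketch below checks the closed form against a step-by-step count. The class and method names are hypothetical, and it counts characters copied over the whole run, not memory held at any one moment, since old strings can be garbage collected.

```java
/** Counts characters copied when a string is grown by repeated concatenation. */
final class ConcatCost {

    /** Simulates result = result + piece, done reps times; each step copies the whole new string. */
    static long charsCopiedBySimulation(int pieceLength, int reps) {
        long copied = 0;
        long currentLength = 0;
        for (int i = 0; i < reps; i++) {
            currentLength += pieceLength; // length of the freshly allocated string
            copied += currentLength;      // every character is copied into it
        }
        return copied;
    }

    /** Closed form of the same sum: pieceLength * reps * (reps + 1) / 2. */
    static long charsCopiedByFormula(int pieceLength, int reps) {
        return (long) pieceLength * reps * (reps + 1L) / 2;
    }
}
```

For the 1-byte, 10-step example both methods give 55. For the 50,000-iteration test with the 13-character "Hello World. " they give 16,250,325,000 characters copied, which goes some way toward explaining the run times reported in this thread.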

Author:
Rick Root
06/03/2008 03:27 PM

Well, since we're all conducting our own little tests, here's MY test code. The cfset method took 64 seconds.  The cfsavecontent method only takes 203ms.

It has GOT to be using a stringbuffer then converting the result to a string at the end.

<cfsetting enablecfoutputonly="yes">
<cfsetting requesttimeout="600">
<cfset reps = 100000>
<!--- change the 1 to 0 to run the cfsavecontent branch instead --->
<cfif 1>
  <!--- cfset method: concatenate onto the result on every pass --->
  <cfset start = now().gettime()>
  <cfset result = "">
  <cfloop from="1" to="#reps#" step="1" index="i">
    <cfset result = result & i>
  </cfloop>
  <cfset end = now().gettime()>
  <cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput>
<cfelse>
  <!--- cfsavecontent method: capture the generated output in one variable --->
  <cfset start = now().gettime()>
  <cfsavecontent variable="result">
    <cfloop from="1" to="#reps#" step="1" index="i">
      <cfoutput>#i#</cfoutput>
    </cfloop>
  </cfsavecontent>
  <cfset end = now().gettime()>
  <cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput>
</cfif>
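Nobody in the thread can see how cfsavecontent is implemented, so the stringbuffer theory stays a guess. What can be shown is the JVM-level difference the guess relies on. The sketch below is plain Java, not CFML and not ColdFusion's actual code. It uses StringBuilder, the unsynchronized sibling of StringBuffer, and its class and method names are illustrative only.

```java
class ConcatVsBuilder {
    // Mirrors the cfset loop: each pass builds a new String from the old one.
    static String viaConcat(int reps) {
        String result = "";
        for (int i = 1; i <= reps; i++) {
            result = result.concat(Integer.toString(i));
        }
        return result;
    }

    // Appends into one growable buffer and converts to a String once at the end.
    static String viaBuilder(int reps) {
        StringBuilder buffer = new StringBuilder();
        for (int i = 1; i <= reps; i++) {
            buffer.append(i);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        int reps = 100000; // same repetition count as the CFML test above
        long t0 = System.nanoTime();
        String a = viaConcat(reps);
        long t1 = System.nanoTime();
        String b = viaBuilder(reps);
        long t2 = System.nanoTime();
        System.out.println("concat:  " + (t1 - t0) / 1_000_000 + " ms : " + a.length());
        System.out.println("builder: " + (t2 - t1) / 1_000_000 + " ms : " + b.length());
    }
}
```

Both methods return the same 488,895-character string. Timings will vary by machine and JVM, so none are quoted here.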

Author:
Brad Wood
06/03/2008 03:29 PM

Dang closed source apps-- if only we could just go look at the code!  :)

~Brad

> Well, since we're all conducting our own little tests, here's MY test
> code: the cfset method took 64 seconds.  The cfsavecontent method only
> takes 203ms. It has GOT to be using a stringbuffer then converting the
> result to a string at the end.

Author:
Brad Wood
06/03/2008 11:46 PM

(Sorry, I got a little CTRL-Enter happy and sent before I was ready...)

Building up strings in cfsavecontent also concatenates to the result variable so the problem is the same.
=============================

Hmm, I don't think you are correct Brian.  I just whipped up a test of string concatenation.

Please spare the "proper load test" flames.  This is NOT a load test-- it is intended to make a process run long enough to capture a thread stack.  Actually, in the context of large file generations I would call it quite appropriate.

Concatenating "Hello World. " together 30 thousand times on CF 7.0.2 Ent (Win) shows a vast difference between using & and simply outputting it inside a cfsavecontent. The & definitely spent all its time doing a java.lang.String.concat(). The cfsavecontent not only executed 211 times faster, I didn't see a single String.concat() happening.

Here are the results:

string & string: 17093ms
String Length: 390000

cfsavecontent: 125ms
String Length: 390000

And here is the code:

<cfset string1 = "">
<cftimer label="string & string" type="outline">
  <cfloop from="1" to="30000" index="i">
    <cfset string1 = string1 & "Hello World. ">
  </cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput>

<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
  <cfsavecontent variable="string2">
    <cfloop from="1" to="30000" index="i">
      <cfoutput>Hello World. </cfoutput>
    </cfloop>
  </cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput>

Author:
Larry Lyons
06/04/2008 11:46 AM

> This whole discussion prompted two blog entries...
>
> Regarding the javaCSV library:
> http://www.opensourcecf.com/1/2008/06/JavaCSV-for-creating-large-CSV-and-other-delmiited-files-with-Coldfusion.

----- Excess quoted text cut - see Original Post for more -----

Rick,

I thought I'd run your test code, cfset vs cfsavecontent, on a slightly different platform - BlueDragon for J2EE running on JBoss AS 4.22 on an XP box (Intel Core 2 Duo with 2 gig memory). Here are the results:

cfset
145861ms : 488895

cfsavecontent
766ms : 488895

regards,
larry

Author:
Brad Wood
06/04/2008 11:10 AM

Whoa, that's weird-- this message came in my inbox late last night as a duplicate of something I sent yesterday afternoon...  I wasn't even in front of a computer at 11:45 pm.  :)

~Brad

> (Sorry, I got a little CTRL-Enter happy and sent before I was ready...)
> Building up strings in cfsavecontent also concatenates to the result
> variable so the problem is the same.
> =============================
> Hmm, I don't think ...

Author:
Brad Wood
06/04/2008 11:54 AM

That's pretty cool, Larry.  I was wondering about BD and Smith.

Will J2EE BD let you create the java.lang.Runtime object to get memory usage etc?  If so, I would be interested in seeing the results of my version of the test, which reported the memory increase for each test. (I posted the code yesterday.  Let me know if the word-wraps trashed it.)

~Brad

=========================
Rick,

I thought I'd run your test code, cfset vs cfsavecontent, on a slightly different platform - BlueDragon for J2EE running on JBoss AS 4.22 on an XP box (Intel Core 2 Duo with 2 gig memory). Here are the results:

cfset
145861ms : 488895

cfsavecontent
766ms : 488895

regards,
larry

Author:
Larry Lyons
06/04/2008 02:15 PM

Here are the results of your code (again BD for J2EE running on JBoss AS 4.22):

string & string: 33251ms
String Length: 390000

cfsavecontent: 62ms
String Length: 570006

I ran the test several times, mainly because the results for cfsavecontent looked so much like an outlier, but I got similar results.

I'll have to dig up that code you posted, but in general, since BlueDragon for J2EE sits on top of a Java application server, it can access the java.lang.Runtime object no problem.

When I get home tonight I'll run these tests again using Open BlueDragon for J2EE, but I doubt there will be any differences.

regards,
larry

----- Excess quoted text cut - see Original Post for more -----

Author:
Larry Lyons
06/04/2008 02:35 PM

Here are the results of your code with java.lang.Runtime. Forgot to mention that the JVM is 1.5.0_15-b04.

Memory Before: 28 Megs
string & string: 99642ms
String Length: 650000
Memory After: 91 Megs -- Increase of 63 Megs

Memory Before: 29 Megs
cfsavecontent: 63ms
String Length: 850003
Memory After: 37 Megs -- Increase of 8 Megs

Basically what this tells me is that you should use cfsavecontent rather than concatenating strings. But I was not expecting such a difference.

larry

----- Excess quoted text cut - see Original Post for more -----
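The test that produced these Memory Before / Memory After lines was posted earlier in the thread and is not reproduced here. For anyone wondering how such figures are usually read from java.lang.Runtime, here is a minimal Java sketch. It is an assumption about the general approach, not Brad's actual code. The class name is made up, and the 50,000 repetitions were picked only so the string length lands on 650000.

```java
class MemoryProbe {
    // Heap currently in use by this JVM, in whole megabytes:
    // memory the JVM has claimed minus the part of it that is still free.
    static long usedMegs() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.gc(); // only a hint to the JVM, so readings stay approximate
        long before = usedMegs();

        // Stand-in workload (assumed): build a 650,000-character string.
        StringBuilder buffer = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            buffer.append("Hello World. ");
        }
        String result = buffer.toString();

        long after = usedMegs();
        System.out.println("Memory Before: " + before + " Megs");
        System.out.println("String Length: " + result.length());
        System.out.println("Memory After: " + after + " Megs -- Increase of "
                + (after - before) + " Megs");
    }
}
```

A reading like this depends on when the garbage collector last ran, so differences of a few megs between runs say little. The gap between 63 and 8 Megs reported above is large enough to be worth taking seriously anyway.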

Author:
Larry Lyons
06/04/2008 05:30 PM

Just ran the same code on Open BlueDragon. This test probably is not the equivalent of the previous tests: at home I'm running this app on a MacBook (Core Duo 2.16 GHz with 2 GB RAM), running OS X 10.4 Tiger, J2SE 5. But the results are similar:

Memory Before: 26 Megs
string & string: 553776ms
String Length: 650000
Memory After: 45 Megs -- Increase of 19 Megs

Memory Before: 32 Megs
cfsavecontent: 65ms
String Length: 1400010
Memory After: 39 Megs -- Increase of 8 Megs

----- Excess quoted text cut - see Original Post for more -----

Author:
Brad Wood
06/04/2008 05:32 PM

Thanks Larry.

~Brad

> Here are the results of your code with java.lang.runtime. Forgot to
> mention that the JVM is 1.5.0_15-b04.

