ColdFusion Talk (CF-Talk)
CSV Generation MEMORY SUCK
Rick Root, 06/02/08 10:05 AM: So I've got a problem with generating large csv files.. it's a memory suck.
Mark Kruger, 06/02/08 10:15 AM: Rick,
Rick Root, 06/02/08 10:34 AM: SQL Server 2005.
Phillip Duba, 06/02/08 10:45 AM: I know when I had to do this at a previous job I used ArrayAppend to build
Gaulin, Mark, 06/02/08 10:52 AM: We had a similar issue with an internal application and we made good
Rick Root, 06/02/08 11:20 AM: > Also, test you page with the user's query but with the output part
Gerald Guido, 06/02/08 11:41 AM: >> "#Chr(34)##replace(arguments
Rick Root, 06/02/08 12:29 PM: > >> "#Chr(34)##replace(arguments
Tom Chiverton, 06/02/08 11:44 AM: > generating the csv (around line 330-340 of the sample code I posted
Rick Root, 06/02/08 12:30 PM: > > generating the csv (around line 330-340 of the sample code I posted
Mark Kruger, 06/02/08 03:02 PM: Don't forget to turn off debugging (or remove the 127.0.0.1 ip)
Mark Kruger, 06/02/08 03:01 PM: Rick,
Brian Kotek, 06/02/08 12:09 PM: Use a Java StringBuffer or StringBuilder. Concatenating large strings in CF
Rick Root, 06/02/08 12:26 PM: Didn't look at the code, eh?
Brian Kotek, 06/02/08 12:36 PM: No, I opened it and saw that it was 400 lines long and didn't have time to
Rick Root, 06/02/08 12:39 PM: > The difference is that you have to use the StringBuffer for everything.
Rick Root, 06/02/08 01:30 PM: I found a nice little java class library called JavaCSV that handles all the
Gerald Guido, 06/02/08 01:46 PM: >> dropped my time from 68 seconds to 18 seconds
Rick Root, 06/02/08 02:36 PM: > >> dropped my time from 68 seconds to 18 seconds
Tom Chiverton, 06/03/08 04:35 AM: > I found a nice little java class library called JavaCSV that handles all
Brian Kotek, 06/03/08 11:13 AM: Probably because it can't know if that's what you actually want to do. We
Tom Chiverton, 06/03/08 11:32 AM: > Probably because it can't know if that's what you actually want to do. We
Wil Genovese, 06/03/08 12:23 PM: " where a fair chunk of the
Brad Wood, 06/03/08 11:18 AM: I wonder what Java string objects are used when you create a large
Brian Kotek, 06/03/08 12:12 PM: Building up strings in cfsavecontent also concatenates to the result
Brad Wood, 06/03/08 12:16 PM: Good to know.
Brian Kotek, 06/03/08 04:34 PM: Just experience, since I've tried all three options (concatenation,
Gerald Guido, 06/03/08 05:12 PM: Just a 411
Brad Wood, 06/03/08 12:45 PM: Hmm, I don't think you are correct Brian. I just whipped up a test of
Brad Wood, 06/03/08 01:08 PM: Update: I experienced the same behavior on CF 8 JVM 1.6 (Win).
Gerald Guido, 06/03/08 01:12 PM: Did you compare the memory usage by chance?
Brad Wood, 06/03/08 01:26 PM: No, but I would like to. The problem is I'm not sure how to get any
Brad Wood, 06/03/08 01:57 PM: Ok, here are my memory usage stats on CF 7. Someone please correct me
Gerald Guido, 06/03/08 02:45 PM: I did a million loops - I don't know what possessed me to do that.
Rick Root, 06/03/08 03:16 PM: Wow, I just came back to this thread.
Brad Wood, 06/03/08 03:22 PM: timed out after 15 min
Rick Root, 06/03/08 03:27 PM: Well, since we're all conducting our own little tests, here's MY test code:
Brad Wood, 06/03/08 03:29 PM: Dang closed source apps-- if only we could just go look at the code! :)
Brad Wood, 06/03/08 11:46 PM: (Sorry, I got a little CTRL-Enter happy and sent before I was ready...)
Rick Root, 06/04/08 10:27 AM: This whole discussion prompted two blog entries...
Larry Lyons, 06/04/08 11:46 AM: > This whole discussion prompted two blog entries...
Brad Wood, 06/04/08 11:10 AM: Who, that's weird-- this message came in my inbox late last night as a
Brad Wood, 06/04/08 11:54 AM: That's pretty cool, Larry. I was wondering about BD and Smith.
Larry Lyons, 06/04/08 02:15 PM: Here are the results of your code (again BD for J2EE running on JBoss AS 4.22):
Larry Lyons, 06/04/08 02:35 PM: Here are the results of your code with java.lang.runtime. Forgot to mention that the JVM is 1.5.0_15-b04.
Larry Lyons, 06/04/08 05:30 PM: Just ran the same code on Open BlueDragon. This test probably is not the equivalent of the previous tests, at home here I'm running this app on a MacBook (core duo 2.16 ghz with 2 gb RAM), running OSX 10.4 Tiger. J2SE 5. But the results are similar:
Brad Wood, 06/04/08 05:32 PM: Thanks Larry.

So I've got a problem with generating large csv files.. it's a memory suck. I do this in an event gateway so that these file drops are generated "in the background"... here's the gateway code: http://cfm.pastebin.org/40043

The larger the file drop, the worse the memory suck is. A relatively small drop of about 7200 rows and 138 columns (just under 1 million pieces of data) took 68 seconds. In my production environment, I've estimated that I can generate between 15,000 and 20,000 pieces of data per second using the code above.

The problem is this drop (which only generates a 5MB file) causes a memory suck of about 100MB... Take a look at this output from the server monitor: www.it.dev.duke.edu/public/temp.rtf

It shows the memory graph generated from two file drops, at 9:38 and 9:45 am... the first one spiked the memory from 70MB to 170MB... the second one dropped the memory back to about 90MB and then spiked it to 140MB.

Of course, this size drop is not what causes my concern, it's when people are dropping 10x that amount.. say 80,000 rows at 130 columns. Over 10 million pieces of data would take nearly 9 minutes ASSUMING you had no memory issues, which I would. Such a drop would basically crash the instance.

-- Rick Root
New Brian Vander Ark Album, songs in the music player and cool behind the scenes video at www.myspace.com/brianvanderark

Rick,

What's your DB platform? Are you sure there is not a better "non-cf" way to do it?

Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com

SQL Server 2005. I'm open to suggestion. This is part of an application that allows users to generate CSV files of their own based on their own criteria, so though I'm open to "non-CF" solutions, I'm not sure there really would be any, except maybe a homegrown java class to handle the work and be more strict with memory consumption....
Rick

I know when I had to do this at a previous job I used ArrayAppend to build each line in the CSV, but I see you are using the string buffer. I had no performance diffs at the time, so I just stayed with the CF solution. The one thing I would look at is not using list functions, but instead using Array functions and then one ArrayToList at the end. Also, make sure the queries being executed aren't intensive either. I found in our CSV generation, for every second the query took, it took one second on output, so my resources were essentially 50/50 between query and output.

Phil

We had a similar issue with an internal application and we made good improvements using the standard java classes PrintStream, FileOutputStream, and BufferedOutputStream to handle the writing to your file. Something like this (shown in java, so you'd have to wrap it properly with cfscript):

    // pick a decent-sized buffer.. maybe 100k to start
    PrintStream out = new PrintStream(new BufferedOutputStream(new FileOutputStream("filename", true), bufferSize));

This will let you do "out.print(...)" and "out.println(...)" from CF, which I think will be much more efficient than what CF can do. Be sure to do "out.close()" within the page (so use try/catch to be sure that the close happens).

Also, test your page with the user's query but with the output part (actually writing the file) commented out... If the page is still slow and a huge memory hog then the file stuff above won't help much and you'll have to look at running the query in java too, but I bet you'll get something by handling the file better.

Thanks
Mark

> Also, test your page with the user's query but with the output part
> (actually writing the file) commented out... If the page is still slow
> and a huge memory hog then the file stuff above won't help much and
> you'll have to look at running the query in java too, but I bet you'll
> get something by handling the file better.

I honestly don't think it's the file writing that's the problem. I just commented out the fileWrite() statements inside the <cfloop> tags that would write each line to the file (the file still being opened with the "header" row being written to it)... and it made zero difference at all in the length of time it took to complete.

The query itself runs quite fast. Returns a lot of rows but isn't a complex query.

Anyway, I put some cfoutput statements in my gateway (I'm calling it as a direct cfc call, not a gateway, for testing).. output now().gettime() to see the ms as the method call progresses.

The query returns its results in less than 4 seconds. The process of generating the csv (around line 330-340 of the sample code I posted earlier) took 62 of the 68 seconds. And that was without actually WRITING the file.

The GOOD news is that a TAB delimited file takes considerably less time (32 seconds vs. 68 seconds), cutting the "output time" from 62 seconds to 26 seconds. Which means the csvFormat() function is taking up a very large part of the processing time.
This is my csvFormat() function:

    <cffunction name="csvFormat" output="false" access="public" returnType="string">
        <cfargument name="str" type="string" required="yes">
        <cfif arguments.str neq "" and not isNumeric(arguments.str)>
            <cfreturn "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#">
        <cfelse>
            <cfreturn arguments.str>
        </cfif>
    </cffunction>

>> "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#"

There is your bottleneck. CF does not like string manipulation on a large scale. I have tried to parse large text files before only to watch my dev box just keel over.

I see two options off the top of my head: let SQL Server do the work or use Java for the string manipulation. Last time I had to parse a large text file like this I ended up writing an ActiveX script for DTS (long time ago).

G

This is only being done on a per field basis, so it's not a manipulation being done on a large scale. At least not on a large string. It's being done on the individual fields. I suspect that the largest string of data being dealt with by the csvFormat() function is 50 characters.

-- Rick Root

> generating the csv (around line 330-340 of the sample code I posted
> earlier) took 62 of the 68 seconds.

Why not output the file all at once, rather than a line at a time (scrap lines ~336, just keep .append()'ing to a StringBuffer till you're done)? Also, have you benchmarked the EnquireLookupCFC (line 333)? What does that do?

-- Tom Chiverton

****************************************************
This email is sent for and on behalf of Halliwells LLP. Halliwells LLP is a limited liability partnership registered in England and Wales under registered number OC307980 whose registered office address is at Halliwells LLP, 3 Hardman Square, Spinningfields, Manchester, M3 3EB. A list of members is available for inspection at the registered office. Any reference to a partner in relation to Halliwells LLP means a member of Halliwells LLP. Regulated by The Solicitors Regulation Authority.
CONFIDENTIALITY
This email is intended only for the use of the addressee named above and may be confidential or legally privileged. If you are not the addressee you must not read it and must not use any information contained in nor copy it nor inform any person other than Halliwells LLP or the addressee of its existence or contents. If you have received this email in error please delete it and notify Halliwells LLP IT Department on 0870 365 2500. For more information about Halliwells LLP visit www.halliwells.com.

I could do that but as I mentioned, the file writing is not the problem.

Don't forget to turn off debugging (or remove the 127.0.0.1 ip)

-mark

Rick,

So... The file is a selection of columns and filter criteria - right? It always varies per user.... You're right - it's a sticky problem :)

-Mark

Use a Java StringBuffer or StringBuilder.
Concatenating large strings in CF is always a memory hog because every single concatenation creates a new String instance. Check RIAForge, there are CFC libraries that wrap these Java classes for exactly this purpose. You'll find memory usage drops dramatically.

Didn't look at the code, eh?

No, I opened it and saw that it was 400 lines long and didn't have time to go through it all. But sweeping through it quickly, the same advice applies. The difference is that you have to use the StringBuffer for everything. Since you aren't passing the StringBuffer into the CSVFormat method and I don't see the code for that method, I assume it is still suffering from the creation of large numbers of String instances. Try passing the StringBuffer into CSVFormat and use it within the method to append the data.

> The difference is that you have to use the StringBuffer for everything.
> Since you aren't passing the StringBuffer into the CSVFormat method and I
> don't see the code for that method, I assume it is still suffering from the
> creation of large numbers of String instances. Try passing the StringBuffer
> into CSVFormat and use it within the method to append the data.

Now *THAT* I hadn't thought of. Lemme give that a whirl.

Rick

I found a nice little java class library called JavaCSV that handles all the file writing and dropped my time from 68 seconds to 18 seconds. That has potential!

It basically handles the writing of delimiters and the proper csv formatting.. so here's my code:

    <cfset var fileOutput = createObject("java","com.csvreader.CsvWriter")>
    <cfset fileOutput.init("#expandPath("..")#\drops\#filename#")>
    <cfloop query="resultSet">
        <!--- write record --->
        <cfloop from="1" to="#numFields#" index="i" step="1">
            <cfset fileOutput.write( resultSet[fieldsArray[i]][resultSet.currentRow].toString() )>
        </cfloop>
        <!--- write end of record --->
        <cfset fileOutput.endRecord()>
    </cfloop>
    <cfset fileOutput.close()>

>> dropped my time from 68 seconds to 18 seconds

Nice. Is that the entirety of the code sans the query?

>>> little java class library called JavaCSV that handles

The one from SourceForge? I am going to need something like this shortly.

G

--
"The important thing in science is not so much to obtain new facts as to discover new ways of thinking about them." - Sir William Bragg

> >> dropped my time from 68 seconds to 18 seconds
>
> Nice. Is that the entirety of the code sans the query?

Entirety. The query itself takes about 4 seconds to execute and return all its data.

Yeah, that's the one.

Rick

> I found a nice little java class library called JavaCSV that handles all
> the file writing and dropped my time from 68 seconds to 18 seconds. That
> has potential!

Why CF can't translate '&' to a StringBuffer append I'll never know...

-- Tom Chiverton

Probably because it can't know if that's what you actually want to do. We probably need a new function StringAppend or something that would be able to do this. Might be time to hit the wish list! ;-)

> Probably because it can't know if that's what you actually want to do. We
> probably need a new function StringAppend or something that would be able
> to do this. Might be time to hit the wish list! ;-)

I'm leaving for Scotch on the Rocks in ~12 hours, where a fair chunk of the CF9 team are hosting a BOF session :-)

-- Tom Chiverton

> where a fair chunk of the CF9 team are hosting a BOF session :-)

That was a fun and raucous BOF session at CF.Objective()!

Wil Genovese

I wonder what Java string objects are used when you create a large string by outputting inside a cfsavecontent. I'm sure ColdFusion implements strings the way it does because it was found to be the most efficient method for the majority of programming needs.

~Brad

Building up strings in cfsavecontent also concatenates to the result variable so the problem is the same.

Good to know. What is your source of this information?

~Brad

> I wonder what Java string objects are used when you create a large
> string by outputting inside a cfsavecontent.

Just experience, since I've tried all three options (concatenation, cfsavecontent, and StringBuffer) and have had the first two generate out of memory errors while the StringBuffer worked correctly. So while cfsavecontent may indeed be faster and use less memory, I'm still pretty sure that the StringBuffer approach is better, especially for very large strings (these were 100+ megabyte CSV files).
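
For anyone following along, the "pass the buffer into the formatting routine" advice can be sketched in plain Java roughly as below. This is only an illustration under stated assumptions, not code from the gateway: the class name `CsvBuilderSketch`, the helpers `appendCsvField` and `buildRow`, and the regex used as a stand-in for CFML's isNumeric() are all made up for the example. The quoting rule mirrors the csvFormat() function posted earlier in the thread. The point is that every `&`/`+` on a String copies everything accumulated so far, while `StringBuilder.append()` writes into one growing buffer.

```java
import java.util.Arrays;
import java.util.List;

class CsvBuilderSketch {

    // Hypothetical stand-in for CFML's isNumeric(): plain integers and decimals only.
    static boolean looksNumeric(String s) {
        return s.matches("-?\\d+(\\.\\d+)?");
    }

    // Appends one field directly to the shared buffer instead of returning a new String.
    // Empty and numeric values pass through; everything else is wrapped in double
    // quotes with embedded double quotes doubled (same rule as csvFormat()).
    static void appendCsvField(StringBuilder sb, String value) {
        if (value.isEmpty() || looksNumeric(value)) {
            sb.append(value);
            return;
        }
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                sb.append('"'); // double the embedded quote
            }
            sb.append(c);
        }
        sb.append('"');
    }

    // Builds one comma-separated row using a single StringBuilder.
    static String buildRow(List<String> fields) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < fields.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            appendCsvField(sb, fields.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(buildRow(Arrays.asList("Root, Rick", "42", "", "say \"hi\"")));
    }
}
```

Run as-is, the main method prints `"Root, Rick",42,,"say ""hi"""`. In a real export you would likely flush the buffer to a buffered writer every so many rows rather than hold the whole file in memory.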
Just a 411

I found a nice little tute on generating csv's using the StringBuffer class in ColdFusion:
http://www.stillnetstudios.com/2007/03/07/java-strings-in-coldfusion/

Hmm, I don't think you are correct Brian. I just whipped up a test of string concatenation. Please spare the "proper load test" flames. This is NOT a load test-- it is intended to make a process run long enough to capture a thread stack. Actually, in the context of large file generations I would call it quite appropriate.

Concatenating "Hello World. " together 30 thousand times on CF 7.0.2 Ent (Win) shows a vast difference between using & and simply outputting it inside a cfsavecontent.

The & definitely spent all its time doing a java.lang.String.concat(). The cfsavecontent not only executed 211 times faster, I didn't see a single String.concat() happening.

Here are the results:

    string & string: 17093ms    String Length: 390000
    cfsavecontent:     125ms    String Length: 390000

And here is the code:

    <cfset string1 = "">
    <cftimer label="string & string" type="outline">
        <cfloop from="1" to="30000" index="i">
            <cfset string1 = string1 & "Hello World. ">
        </cfloop>
    </cftimer>
    <cfoutput>String Length: #len(string1)#</cfoutput>

    <cfsetting enablecfoutputonly="true">
    <cftimer label="cfsavecontent" type="outline">
        <cfsavecontent variable="string2">
            <cfloop from="1" to="30000" index="i">
                <cfoutput>Hello World.</cfoutput>
            </cfloop>
        </cfsavecontent>
    </cftimer>
    <cfsetting enablecfoutputonly="false">
    <cfoutput>String Length: #len(string2)#</cfoutput>

Update: I experienced the same behavior on CF 8 JVM 1.6 (Win). Well, almost the same-- the CF8 server was actually faster overall. I would like to point out it is actually a slower server too!

    string & string: 9141ms    String Length: 390000
    cfsavecontent:     31ms    String Length: 390000

Did you compare the memory usage by chance?

G

No, but I would like to. The problem is I'm not sure how to get any exact numbers. I have SeeFusion installed which will tell me the overall heap size of my JVM, but it might be difficult to nail down how much was used by one thread.

Alternatively, there are the totalMemory() and maxMemory() methods in the runtime object available in java.lang.Runtime. I could run that before and after the code ran, but I'm not sure how garbage collection would affect that.

Suggestions?

~Brad

Ok, here are my memory usage stats on CF 7. Someone please correct me if my code is wrong. It's a little messy, and I apologize for that.

    Memory Before: 83 Megs
    string & string: 52795ms    String Length: 650000
    Memory After: 101 Megs -- Increase of 17 Megs

    Memory Before: 85 Megs
    cfsavecontent: 172ms    String Length: 650000
    Memory After: 97 Megs -- Increase of 12 Megs

As you can see, cfsavecontent used about 1/3 less memory than the other method. Not nearly as proportional a savings as the execution time...

Wow-- here are the numbers from my CF8 box:

    Memory Before: 161 Megs
    string & string: 26530ms    String Length: 650000
    Memory After: 195 Megs -- Increase of 35 Megs

    Memory Before: 158 Megs
    cfsavecontent: 47ms    String Length: 650000
    Memory After: 165 Megs -- Increase of 7 Megs

This time the cfsavecontent used 4/5ths less memory! Very interesting indeed...

Of course, please understand there are many factors and JVM settings that go into this. I'm not trying to claim everyone else will get results like this.
Here's the latest version of my (slightly sloppy) code:

    <cfset runtime = CreateObject("java", "java.lang.Runtime").getRuntime()>
    <!--- total heap is sampled once; used memory = total - free --->
    <cfset total_memory = runtime.totalMemory()>

    <cfset runtime.gc()>
    <cfset memory_before = (total_memory-runtime.freeMemory()) / 1024 / 1024>
    <cfoutput>Memory Before: #round(memory_before)# Megs<br>
    <cfset string1 = "">
    <cftimer label="string & string" type="outline">
        <cfloop from="1" to="50000" index="i">
            <cfset string1 = string1 & "Hello World. ">
        </cfloop>
    </cftimer>
    <cfoutput>String Length: #len(string1)#</cfoutput><br>
    <cfset memory_after = (total_memory-runtime.freeMemory()) / 1024 / 1024>
    Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br>
    <br />
    </cfoutput>

    <cfset runtime.gc()>
    <cfset memory_before = (total_memory-runtime.freeMemory()) / 1024 / 1024>
    <cfoutput>Memory Before: #round(memory_before)# Megs<br></cfoutput>
    <cfsetting enablecfoutputonly="true">
    <cftimer label="cfsavecontent" type="outline">
        <cfsavecontent variable="string2">
            <cfloop from="1" to="50000" index="i">
                <cfoutput>Hello World.</cfoutput>
            </cfloop>
        </cfsavecontent>
    </cftimer>
    <cfsetting enablecfoutputonly="false">
    <cfoutput>String Length: #len(string2)#</cfoutput><br>
    <cfset memory_after = (total_memory-runtime.freeMemory()) / 1024 / 1024>
    <cfoutput>Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br></cfoutput>

I did a million loops - I don't know what possessed me to do that. Memory was "measured" using task manager. Totally unscientific. I did a restart on the service before each trial.

CF 8 developer, 2 gig ram, Java v. 1.6.0_01

    cfsavecontent: 2281 ms
    192,356 k start
    260,872 k after
    68,516 k difference

    concatenation using cfset: timed out after 15 min
    192,362 k start
    580,952 k after
    388,590 k difference

I find this very interesting in that my totally unscientific thought process was: Since cfsavecontent is so damn easy to use it *must* be resource intensive.

G

Wow, I just came back to this thread.

REALLY makes me wonder how they're handling cfsavecontent!

> timed out after 15 min

lol

> I find this very interesting in that my totally unscientific thought
> process was: Since cfsavecontent is so damn easy to use it *must* be
> resource intensive.

Go figure-- ColdFusion strikes again.

If string.concat() creates a brand new string object in memory to hold the combination of the two original strings, then given a consistently average sized string being added each time I would expect memory consumption to be exponential. For instance, if you concatenated a string 1 byte in size 10 times over, you would consume 55 bytes of memory.

1+2+3+4+5+6+7+8+9+10=55

For the life of me I can't figure out what the Big-O notation would be for that though...

~Brad

Well, since we're all conducting our own little tests, here's MY test code: the cfset method took 64 seconds. The cfsavecontent method only takes 203ms. It has GOT to be using a stringbuffer then converting the result to a string at the end.
<cfsetting enablecfoutputonly="yes"> <cfsetting requesttimeout="600"> <cfset reps = 100000> <cfif 1> <cfset start = now().gettime()> <cfset result = ""> <cfloop from="1" to="#reps#" step="1" index="i"> <cfset result = result & i> </cfloop> <cfset end = now().gettime()> <cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput> <cfelse> <cfset start = now().gettime()> <cfsavecontent variable="result"> <cfloop from="1" to="#reps#" step="1" index="i"> <cfoutput>#i#</cfoutput> </cfloop> </cfsavecontent> <cfset end = now().gettime()> <cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput> </cfif> Dang closed source apps-- if only we could just go look at the code! :) ~Brad Well, since we're all conducting our own little tests, here's MY test code: the cfset method took 64 seconds. The cfsavecontent method only takes 203ms. It has GOT to be using a stringbuffer then converting the result to a string at the end. (Sorry, I got a little CTRL-Enter happy and sent before I was ready...) Building up strings in cfsavecontent also concatenates to the result variable so the problem is the same. ============================= Hmm, I don't think you are correct Brian. I just whipped up a test of string concatenation. Please spare the "proper load test" flames. This is NOT a load test-- it is intended to make a process run long enough to capture a thread stack. Actually, in the context of large file generations I would call it quite appropriate. Concatenation "Hello World. " together 30 thousand times on CF 7.0.2 Ent (Win) shows a vast difference between using & and simply outputting it inside a cfsavecontent. The & definitely spent all its time doing a Java.lang.String.concat(). The cfsavecontent not only executed 211 times faster, I didn't see a single String.contat() happening. 
Here are the results:

string & string: 17093ms
String Length: 390000

cfsavecontent: 125ms
String Length: 390000

And here is the code:

    <!--- Test 1: concatenate with the & operator --->
    <cfset string1 = "">
    <cftimer label="string & string" type="outline">
        <cfloop from="1" to="30000" index="i">
            <cfset string1 = string1 & "Hello World. ">
        </cfloop>
    </cftimer>
    <cfoutput>String Length: #len(string1)#</cfoutput>

    <!--- Test 2: output the same text inside cfsavecontent --->
    <cfsetting enablecfoutputonly="true">
    <cftimer label="cfsavecontent" type="outline">
        <cfsavecontent variable="string2">
            <cfloop from="1" to="30000" index="i">
                <cfoutput>Hello World. </cfoutput>
            </cfloop>
        </cfsavecontent>
    </cftimer>
    <cfsetting enablecfoutputonly="false">
    <cfoutput>String Length: #len(string2)#</cfoutput>

This whole discussion prompted two blog entries...

Regarding the javaCSV library:
http://www.opensourcecf.com/1/2008/06/JavaCSV-for-creating-large-CSV-and-other-delmiited-files-with-Coldfusion.cfm
or http://tinyurl.com/58o7ox

Regarding my cfsavecontent performance tests:
http://www.opensourcecf.com/1/2008/06/cfsavecontent-vs-cfset-for-performance-improvement.cfm
or http://tinyurl.com/6cafst

Rick

> This whole discussion prompted two blog entries...

Rick,

I thought I'd run your code test of cfset vs cfsavecontent on a slightly different platform - BlueDragon for J2EE running on JBoss AS 4.22 on an XP box (Intel Core 2 Duo with 2 gig memory). Here are the results:

cfset 145861ms : 488895
cfsavecontent 766ms : 488895

regards,
larry

Whoa, that's weird-- this message came in my inbox late last night as a duplicate of something I sent yesterday afternoon... I wasn't even in front of a computer at 11:45 pm. :)

~Brad
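On the Big-O question asked earlier in the thread: the 1+2+...+10=55 example generalizes to n(n+1)/2, which is quadratic growth, O(n^2), rather than exponential. A minimal Java sketch of that arithmetic follows. The class and method names are made up for illustration, and it only counts characters written if every append allocates a fresh string (the behaviour the thread assumes for String.concat()). It says nothing about what cfsavecontent does internally, which nobody in the thread could confirm, and it is not a measure of live heap, since the garbage collector can reclaim the intermediate strings.

```java
public class ConcatCost {
    // Total characters written when every append allocates a brand new
    // string holding the whole result so far: piece, 2*piece, ..., n*piece.
    public static long charsCopiedByConcat(int appends, int pieceLength) {
        long total = 0;
        long currentLength = 0;
        for (int i = 0; i < appends; i++) {
            currentLength += pieceLength; // the new string is one piece longer
            total += currentLength;       // and all of it is written again
        }
        return total;
    }

    // Closed form of the same sum: pieceLength * n * (n + 1) / 2, i.e. O(n^2).
    public static long closedForm(int appends, int pieceLength) {
        return (long) pieceLength * appends * (appends + 1L) / 2;
    }

    // For contrast: a StringBuilder appends into a growable buffer and
    // produces the final String once, at the end.
    public static String buildWithBuilder(int appends, String piece) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < appends; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The 10 x 1-byte example from the thread.
        System.out.println(charsCopiedByConcat(10, 1));      // 55
        // 30,000 appends of the 13-character "Hello World. ".
        System.out.println(charsCopiedByConcat(30000, 13));  // 5850195000
        System.out.println(buildWithBuilder(30000, "Hello World. ").length()); // 390000
    }
}
```

For the 30,000-iteration test that works out to roughly 5.85 billion characters written to produce a 390,000-character result, which is at least consistent with the & loop taking seconds while the buffered approach takes milliseconds.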
That's pretty cool, Larry. I was wondering about BD and Smith. Will J2EE BD let you create the java.lang.Runtime object to get memory usage etc? If so, I would be interested in seeing the results of my version of the test which reported the memory increase for each test. (I posted the code yesterday. Let me know if the word-wraps trashed it).

~Brad

=========================

> Rick,
> I thought I'd run your code test of cfset vs cfsavecontent on a slightly
> different platform - BlueDragon for J2EE running on JBoss AS 4.22 on an
> XP box (Intel Core 2 Duo with 2 gig memory). Here are the results:
>
> cfset 145861ms : 488895
> cfsavecontent 766ms : 488895
>
> regards,
> larry

Here are the results of your code (again BD for J2EE running on JBoss AS 4.22):

string & string: 33251ms
String Length: 390000

cfsavecontent: 62ms
String Length: 570006

I ran the test several times, mainly because the results for cfsavecontent looked so much like an outlier, but I got similar results.

I'll have to dig up that code you posted, but in general, since BlueDragon for J2EE sits on top of a Java application server, it can access the java.lang.Runtime object no problem. When I get home tonight I'll run these tests again using Open BlueDragon for J2EE, but I doubt there will be any differences.

regards,
larry

Here are the results of your code with java.lang.Runtime. Forgot to mention that the JVM is 1.5.0_15-b04.

Memory Before: 28 Megs
string & string: 99642ms
String Length: 650000
Memory After: 91 Megs -- Increase of 63 Megs

Memory Before: 29 Megs
cfsavecontent: 63ms
String Length: 850003
Memory After: 37 Megs -- Increase of 8 Megs

Basically what this tells me is that you should use cfsavecontent rather than concatenating strings. But I was not expecting such a difference.

larry

Just ran the same code on Open BlueDragon.
This test probably is not the equivalent of the previous tests; at home here I'm running this app on a MacBook (Core Duo 2.16 GHz with 2 GB RAM), running OS X 10.4 Tiger, J2SE 5. But the results are similar:

Memory Before: 26 Megs
string & string: 553776ms
String Length: 650000
Memory After: 45 Megs -- Increase of 19 Megs

Memory Before: 32 Megs
cfsavecontent: 65ms
String Length: 1400010
Memory After: 39 Megs -- Increase of 8 Megs

Thanks Larry.

~Brad

> Here are the results of your code with java.lang.runtime. Forgot to
> mention that the JVM is 1.5.0_15-b04.
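The "Memory Before / Memory After" figures above come from java.lang.Runtime. Brad's actual test code is not reproduced in this part of the thread, so the following is only a guess at the general shape of such a measurement, written in plain Java with invented names (MemoryDelta, usedBytes, toMegs). Heap readings taken this way are rough: the garbage collector can run at any point between the two readings, so the delta is an indication rather than a precise measurement.

```java
public class MemoryDelta {
    // Heap currently in use by the JVM: allocated heap minus the free part.
    public static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Whole megabytes, matching the "Megs" style used in the posted results.
    public static long toMegs(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        long before = usedBytes();

        // Work under test: here, a buffered build of 50,000 pieces.
        // Swap in whatever string-building approach is being compared.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            sb.append("Hello World. ");
        }
        String result = sb.toString();

        long after = usedBytes();

        System.out.println("Memory Before: " + toMegs(before) + " Megs");
        System.out.println("String Length: " + result.length());
        System.out.println("Memory After: " + toMegs(after)
                + " Megs -- Increase of " + toMegs(after - before) + " Megs");
    }
}
```

From CFML the same Runtime object would presumably be obtained through createObject("java", "java.lang.Runtime"), which is what the question about BlueDragon being able to "create the java.lang.runtime object" refers to.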