ColdFusion Talk (CF-Talk)
CSV Generation MEMORY SUCK
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306845
Thanks Larry.
~Brad
Here are the results of your code with java.lang.runtime. Forgot to
mention that the JVM is 1.5.0_15-b04.
Author: Larry Lyons
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306844
Just ran the same code on Open BlueDragon. This test probably is not the
equivalent of the previous tests: at home here I'm running this app on a MacBook
(Core Duo 2.16 GHz with 2 GB RAM) running OS X 10.4 Tiger and J2SE 5. But the
results are similar:
Memory Before: 26 Megs
string & string: 553776ms
String Length: 650000
Memory After: 45 Megs -- Increase of 19 Megs
Memory Before: 32 Megs
cfsavecontent: 65ms
String Length: 1400010
Memory After: 39 Megs -- Increase of 8 Megs
----- Excess quoted text cut - see Original Post for more -----
Author: Larry Lyons
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306832
Here are the results of your code with java.lang.runtime. Forgot to mention that
the JVM is 1.5.0_15-b04.
Memory Before: 28 Megs
string & string: 99642ms
String Length: 650000
Memory After: 91 Megs -- Increase of 63 Megs
Memory Before: 29 Megs
cfsavecontent: 63ms
String Length: 850003
Memory After: 37 Megs -- Increase of 8 Megs
Basically what this tells me is that you should use cfsavecontent rather than
concatenating strings. But I was not expecting such a difference.
larry
----- Excess quoted text cut - see Original Post for more -----
Author: Larry Lyons
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306831
Here are the results of your code (again BD for J2EE running on JBoss AS 4.22):
string & string: 33251ms
String Length: 390000
cfsavecontent: 62ms
String Length: 570006
I ran the test several times, mainly because the results for cfsavecontent looked
so much like an outlier, but I got similar results.
I'll have to dig up that code you posted, but in general since BlueDragon for
J2EE sits on top of a java application server it can access the java.lang.runtime
object no problem. When I get home tonight I'll run these tests again using open
BlueDragon for J2EE, but I doubt there will be any differences.
regards,
larry
----- Excess quoted text cut - see Original Post for more -----
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306819
That's pretty cool, Larry. I was wondering about BD and Smith.
Will J2EE BD let you create the java.lang.runtime object to get memory
usage etc? If so, I would be interested in seeing the results of my
version of the test which reported the memory increase for each test.
(I posted the code yesterday. Let me know if the word-wraps trashed
it).
~Brad
Author: Larry Lyons
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306818
> This whole discussion prompted two blog entries...
>
> Regarding the javaCSV library:
> http://www.opensourcecf.com/1/2008/06/JavaCSV-for-creating-large-CSV-and-other-delmiited-files-with-Coldfusion.cfm
----- Excess quoted text cut - see Original Post for more -----
Rick,
I thought I'd run your cfset vs cfsavecontent test code on a slightly different
platform: BlueDragon for J2EE running on JBoss AS 4.22 on an XP box (Intel Core
2 Duo with 2 GB of memory). Here are the results:
cfset 145861ms : 488895
cfsavecontent 766ms : 488895
regards,
larry
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306810
Whoa, that's weird-- this message came in my inbox late last night as a
duplicate of something I sent yesterday afternoon... I wasn't even in
front of a computer at 11:45 pm. :)
~Brad
(Sorry, I got a little CTRL-Enter happy and sent before I was ready...)
Building up strings in cfsavecontent also concatenates to the result
variable so the problem is the same.
=============================
Hmm, I don't think ...
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306809
This whole discussion prompted two blog entries...
Regarding the javaCSV library:
http://www.opensourcecf.com/1/2008/06/JavaCSV-for-creating-large-CSV-and-other-delmiited-files-with-Coldfusion.cfm
or http://tinyurl.com/58o7ox
Regarding my cfsavecontent performance tests:
http://www.opensourcecf.com/1/2008/06/cfsavecontent-vs-cfset-for-performance-improvement.cfm
or http://tinyurl.com/6cafst
Rick
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306789
(Sorry, I got a little CTRL-Enter happy and sent before I was ready...)
Building up strings in cfsavecontent also concatenates to the result
variable so the problem is the same.
=============================
Hmm, I don't think you are correct Brian. I just whipped up a test of
string concatenation.
Please spare the "proper load test" flames. This is NOT a load test--
it is intended to make a process run long enough to capture a thread
stack. Actually, in the context of large file generations I would call
it quite appropriate.
Concatenating "Hello World. " together 30 thousand times on CF 7.0.2 Ent
(Win) shows a vast difference between using & and simply outputting it
inside a cfsavecontent.
The & definitely spent all its time doing a java.lang.String.concat().
The cfsavecontent not only executed 211 times faster, I didn't see a
single String.concat() happening.
Here are the results:
string & string: 17093ms
String Length: 390000
cfsavecontent: 125ms
String Length: 390000
And here is the code:
<cfset string1 = "">
<cftimer label="string & string" type="outline">
<cfloop from="1" to="30000" index="i">
<cfset string1 = string1 & "Hello World. ">
</cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput>
<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
<cfsavecontent variable="string2">
<cfloop from="1" to="30000" index="i">
<cfoutput>Hello World. </cfoutput>
</cfloop>
</cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput>
Author: Gerald Guido
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306764
Just a 411
I found a nice little tute on generating csv's using the StringBuffer class
in ColdFusion
http://www.stillnetstudios.com/2007/03/07/java-strings-in-coldfusion/
--
"The important thing in science is not so much to obtain new facts as to
discover new ways of thinking about them."
- Sir William Bragg
Author: Brian Kotek
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306751
Just experience, since I've tried all three options (concatenation,
cfsavecontent, and StringBuffer) and have had the first two generate out of
memory errors while the StringBuffer worked correctly. So while
cfsavecontent may indeed be faster and use less memory, I'm still pretty
sure that the StringBuffer approach is better especially for very large
strings (these were 100+ megabyte CSV files).
----- Excess quoted text cut - see Original Post for more -----
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306732
Dang closed source apps-- if only we could just go look at the code! :)
~Brad
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306731
Well, since we're all conducting our own little tests, here's MY test code:
the cfset method took 64 seconds. The cfsavecontent method only takes
203ms.
It has GOT to be using a stringbuffer then converting the result to a string
at the end.
<cfsetting enablecfoutputonly="yes">
<cfsetting requesttimeout="600">
<cfset reps = 100000>
<cfif 1>
<cfset start = now().gettime()>
<cfset result = "">
<cfloop from="1" to="#reps#" step="1" index="i">
<cfset result = result & i>
</cfloop>
<cfset end = now().gettime()>
<cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput>
<cfelse>
<cfset start = now().gettime()>
<cfsavecontent variable="result">
<cfloop from="1" to="#reps#" step="1" index="i">
<cfoutput>#i#</cfoutput>
</cfloop>
</cfsavecontent>
<cfset end = now().gettime()>
<cfoutput><p>#end-start#ms : #len(result)#</p></cfoutput>
</cfif>
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306728
timed out after 15 min
lol
I find this very interesting in that my totally unscientific thought process
was: Since cfsavecontent is so damn easy to use it *must* be resource
intensive.
Go figure-- ColdFusion strikes again.
If string.concat() creates a brand new string object in memory to hold
the combination of the two original strings, then given a consistently
average sized string being added each time I would expect memory
consumption to be exponential.
For instance, if you concatenated a string 1 byte in size 10 times over,
you would consume 55 bytes of memory. 1+2+3+4+5+6+7+8+9+10=55
For the life of me I can't figure out what the Big-O notation would be
for that though...
~Brad
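Brad's arithmetic above can be written in closed form. This is a sketch under two simplifying assumptions that the thread does not confirm: every concatenation copies the whole accumulated string into a new one, and each appended piece has a fixed size of $s$ bytes. It counts total bytes allocated over the loop, not the live heap at any one moment, since the garbage collector can reclaim the intermediate strings.

```latex
\[
\sum_{k=1}^{n} k\,s \;=\; s\,\frac{n(n+1)}{2} \;=\; O(n^{2}),
\qquad\text{e.g. } n = 10,\ s = 1:\quad \frac{10 \cdot 11}{2} = 55 .
\]
```

So the growth is quadratic rather than exponential: doubling the loop count roughly quadruples the bytes copied. The CF 7 timings in this thread (17093ms for 30,000 appends, 52795ms for 50,000) grow at roughly that rate, though timing also depends on GC and JVM settings.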
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306726
Wow, I just came back to this thread.
REALLY makes me wonder how they're handling cfsavecontent!
Author: Gerald Guido
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306722
I did a million loops - I don't know what possessed me to do that.
Memory was "measured" using task manager. Totally unscientific.
I did a restart on the service before each trial.
CF 8 developer
2 gig ram
Java v. 1.6.0_01
cfsavecontent
2281 ms
192,356 k start
260,872 k after
68,516 k difference
concatenation using cfset
timed out after 15 min
192,362 k start
580,952 k after
388,590 k difference
I find this very interesting in that my totally unscientific thought process
was: Since cfsavecontent is so damn easy to use it *must* be resource
intensive.
G
> Ok, here are my memory usage stats on CF 7. Someone please correct me
> if my code is wrong. It's a little messy, and I apologize for that.
>
>
--
"The important thing in science is not so much to obtain new facts as to
discover new ways of thinking about them."
- Sir William Bragg
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306715
Ok, here are my memory usage stats on CF 7. Someone please correct me
if my code is wrong. It's a little messy, and I apologize for that.
Memory Before: 83 Megs
string & string: 52795ms
String Length: 650000
Memory After: 101 Megs -- Increase of 17 Megs
Memory Before: 85 Megs
cfsavecontent: 172ms
String Length: 650000
Memory After: 97 Megs -- Increase of 12 Megs
As you can see, cfsavecontent used about 1/3 less memory than the other
method. Not nearly as large a saving as in execution time...
Wow-- here are the numbers from my CF8 box:
Memory Before: 161 Megs
string & string: 26530ms
String Length: 650000
Memory After: 195 Megs -- Increase of 35 Megs
Memory Before: 158 Megs
cfsavecontent: 47ms
String Length: 650000
Memory After: 165 Megs -- Increase of 7 Megs
This time the cfsavecontent used 4/5ths less memory!
Very interesting indeed... Of course, please understand there are many
factors and JVM settings that go into this. I'm not trying to claim
everyone else will get results like this.
Here is the latest version of my (slightly sloppy) code:
<cfset runtime = CreateObject("java", "java.lang.Runtime").getRuntime()>
<cfset runtime.gc()>
<!--- Re-read totalMemory() for every measurement: the JVM can grow the heap mid-test --->
<cfset memory_before = (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory Before: #round(memory_before)# Megs<br>
<cfset string1 = "">
<cftimer label="string & string" type="outline">
<cfloop from="1" to="50000" index="i">
<cfset string1 = string1 & "Hello World. ">
</cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput><br>
<cfset memory_after = (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024>
Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br>
<br />
</cfoutput>
<cfset runtime.gc()>
<cfset memory_before = (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory Before: #round(memory_before)# Megs<br></cfoutput>
<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
<cfsavecontent variable="string2">
<cfloop from="1" to="50000" index="i">
<cfoutput>Hello World. </cfoutput>
</cfloop>
</cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput><br>
<cfset memory_after = (runtime.totalMemory() - runtime.freeMemory()) / 1024 / 1024>
<cfoutput>Memory After: #round(memory_after)# Megs -- Increase of #round(memory_after - memory_before)# Megs<br></cfoutput>
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306714
No, but I would like to. The problem is I'm not sure how to get any
exact numbers. I have SeeFusion installed which will tell me the
overall heap size of my JVM, but it might be difficult to nail down how
much was used by one thread.
Alternatively, there are the totalMemory() and maxMemory() methods in
the runtime object available in java.lang.Runtime. I could run that
before and after the code ran, but I'm not sure how garbage collection
would affect that.
Suggestions?
~Brad
Did you compare the memory usage by chance?
G
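For anyone wanting to try the Runtime approach Brad mentions, here is a minimal sketch in plain Java. The class and method names (`HeapProbe`, `usedBytes`, `usedBytesAfterGc`) are made up for illustration. It reports used heap as `totalMemory() - freeMemory()`, and it treats `gc()` as what it is, a request the JVM may ignore, so the numbers are only a rough indication for the whole JVM, not for one thread.

```java
final class HeapProbe {
    private static final Runtime RT = Runtime.getRuntime();

    /** Bytes currently in use on the heap: reserved heap minus free heap. */
    static long usedBytes() {
        // totalMemory() is read on every call because the JVM may grow the heap.
        return RT.totalMemory() - RT.freeMemory();
    }

    /** Asks for a collection (only a hint to the JVM), then samples used heap. */
    static long usedBytesAfterGc() {
        RT.gc();
        return usedBytes();
    }

    public static void main(String[] args) {
        long before = usedBytesAfterGc();

        // Workload under test: same shape as the loops posted in this thread.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 50000; i++) {
            sb.append("Hello World. ");
        }

        long after = usedBytes();
        System.out.println("String Length: " + sb.length());
        System.out.println("Increase (bytes, approximate): " + (after - before));
    }
}
```

Other requests running in the same JVM will show up in these figures too, so it is best run on an otherwise idle server and repeated a few times.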
Author: Gerald Guido
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306711
Did you compare the memory usage by chance?
G
----- Excess quoted text cut - see Original Post for more -----
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306710
Update: I experienced the same behavior on CF 8 JVM 1.6 (Win).
Well, almost the same-- the CF8 server was actually faster overall. I
would like to point out it is actually a slower server too!
string & string: 9141ms
String Length: 390000
cfsavecontent: 31ms
String Length: 390000
Here are the results:
string & string: 17093ms
String Length: 390000
cfsavecontent: 125ms
String Length: 390000
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306707
Hmm, I don't think you are correct Brian. I just whipped up a test of
string concatenation.
Please spare the "proper load test" flames. This is NOT a load test--
it is intended to make a process run long enough to capture a thread
stack. Actually, in the context of large file generations I would call
it quite appropriate.
Concatenating "Hello World. " together 30 thousand times on CF 7.0.2 Ent
(Win) shows a vast difference between using & and simply outputting it
inside a cfsavecontent.
The & definitely spent all its time doing a java.lang.String.concat().
The cfsavecontent not only executed 211 times faster, I didn't see a
single String.concat() happening.
Here are the results:
string & string: 17093ms
String Length: 390000
cfsavecontent: 125ms
String Length: 390000
And here is the code:
<cfset string1 = "">
<cftimer label="string & string" type="outline">
<cfloop from="1" to="30000" index="i">
<cfset string1 = string1 & "Hello World. ">
</cfloop>
</cftimer>
<cfoutput>String Length: #len(string1)#</cfoutput>
<cfsetting enablecfoutputonly="true">
<cftimer label="cfsavecontent" type="outline">
<cfsavecontent variable="string2">
<cfloop from="1" to="30000" index="i">
<cfoutput>Hello World. </cfoutput>
</cfloop>
</cfsavecontent>
</cftimer>
<cfsetting enablecfoutputonly="false">
<cfoutput>String Length: #len(string2)#</cfoutput>
Author: Wil Genovese
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306705
"where a fair chunk of the CF9 team are hosting a BOF session :-)"
That was a fun and raucous BOF session at CF.Objective()!
Wil Genovese
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306704
Good to know.
What is your source of this information?
~Brad
Sent: Tuesday, June 03, 2008 11:11 AM
Building up strings in cfsavecontent also concatenates to the result
variable so the problem is the same.
> I wonder what Java string objects are used when you create a large
> string by outputting inside a cfsavecontent.
Author: Brian Kotek
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306703
Building up strings in cfsavecontent also concatenates to the result
variable so the problem is the same.
----- Excess quoted text cut - see Original Post for more -----
Author: Tom Chiverton
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306699
> Probably because it can't know if that's what you actually want to do. We
> probably need a new function StringAppend or something that would be able
> to do this. Might be time to hit the wish list! ;-)
I'm leaving for Scotch on the Rocks in ~12 hours, where a fair chunk of the
CF9 team are hosting a BOF session :-)
--
Tom Chiverton
****************************************************
This email is sent for and on behalf of Halliwells LLP.
Halliwells LLP is a limited liability partnership registered in England and Wales
under registered number OC307980 whose registered office address is at Halliwells
LLP, 3 Hardman Square, Spinningfields, Manchester, M3 3EB. A list of members is
available for inspection at the registered office. Any reference to a partner in
relation to Halliwells LLP means a member of Halliwells LLP. Regulated by The
Solicitors Regulation Authority.
CONFIDENTIALITY
This email is intended only for the use of the addressee named above and may be
confidential or legally privileged. If you are not the addressee you must not
read it and must not use any information contained in nor copy it nor inform any
person other than Halliwells LLP or the addressee of its existence or contents.
If you have received this email in error please delete it and notify Halliwells
LLP IT Department on 0870 365 2500.
For more information about Halliwells LLP visit
www.halliwells.com.
Author: Brad Wood
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306696
I wonder what Java string objects are used when you create a large
string by outputting inside a cfsavecontent.
I'm sure ColdFusion implements strings the way it does because it was
found to be the most efficient method for the majority of programming
needs.
~Brad
Probably because it can't know if that's what you actually want to do. We
probably need a new function StringAppend or something that would be able to
do this. Might be time to hit the wish list! ;-)
Author: Brian Kotek
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306695
Probably because it can't know if that's what you actually want to do. We
probably need a new function StringAppend or something that would be able to
do this. Might be time to hit the wish list! ;-)
----- Excess quoted text cut - see Original Post for more -----
Author: Tom Chiverton
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306685
> I found a nice little java class library called JavaCSV that handles all
> the file writing and dropped my time from 68 seconds to 18 seconds. That
> has potential!
Why CF can't translate '&' to a StringBuffer append I'll never know...
--
Tom Chiverton
Author: Mark Kruger
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306628
Don't forget to turn off debugging (or remove the 127.0.0.1 ip)
-mark
Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com
Author: Mark Kruger
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306627
Rick,
So... the file is a selection of columns and filter criteria, right? It
always varies per user... You're right, it's a sticky problem :)
-Mark
Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com
SQL Server 2005.
I'm open to suggestion. This is part of an application that allows users to
generate CSV files of their own based on their own criteria, so though I'm
open to "non-CF" solutions, I'm not sure there really would be anyway except
maybe a homegrown java class to handle the work and be more strict with
memory consumption....
Rick
----- Excess quoted text cut - see Original Post for more -----
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306621
> >> dropped my time from 68 seconds to 18 seconds
>
> Nice. Is that entirety of the code sans the query?
Entirety. The query itself takes about 4 seconds to execute and return all
its data.
----- Excess quoted text cut - see Original Post for more -----
Yeah, that's the one.
Rick
Author: Gerald Guido
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306608
>> dropped my time from 68 seconds to 18 seconds
Nice. Is that entirety of the code sans the query?
>>> little java class library called JavaCSV that handles
The one from SourceForge? I am going to need something like this
shortly.
G
--
"The important thing in science is not so much to obtain new facts as to
discover new ways of thinking about them."
- Sir William Bragg
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306602
I found a nice little java class library called JavaCSV that handles all the
file writing and dropped my time from 68 seconds to 18 seconds. That has
potential!
It basically handles the writing of delimiters and the proper csv
formatting.. so here's my code:
<cfset var fileOutput = createObject("java","com.csvreader.CsvWriter")>
<cfset fileOutput.init("#expandPath("..")#\drops\#filename#")>
<cfloop query="resultSet">
<!--- write record --->
<cfloop from="1" to="#numFields#" index="i" step="1">
<cfset fileOutput.write(
resultSet[fieldsArray[i]][resultSet.currentRow].toString() )>
</cfloop>
<!--- write end of record --->
<cfset fileOutput.endRecord()>
</cfloop>
<cfset fileOutput.close()>
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306594
> The difference is that you have to use the StringBuffer for everything.
> Since you aren't passing the StringBuffer into the CSVFormat method and I
> don't see the code for that method, I assume it is still suffering from the
> creation of large numbers of String instances. Try passing the StringBuffer
> into CSVFormat and use it within the method to append the data.
Now *THAT* I hadn't thought of. Lemme give that a whirl.
Rick
Author: Brian Kotek
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306592
No, I opened it and saw that it was 400 lines long and didn't have time to
go through it all. But sweeping through it quickly, the same advice applies.
The difference is that you have to use the StringBuffer for everything.
Since you aren't passing the StringBuffer into the CSVFormat method and I
don't see the code for that method, I assume it is still suffering from the
creation of large numbers of String instances. Try passing the StringBuffer
into CSVFormat and use it within the method to append the data.
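To make Brian's suggestion concrete, here is a minimal sketch in plain Java of a formatter that appends straight into a shared buffer instead of returning a new String per field. It is an illustration only, not Rick's code: `CsvAppend`, `appendField`, `appendRow` and `isNumeric` are hypothetical names, and `isNumeric` only approximates ColdFusion's isNumeric() by delegating to `Double.parseDouble`, which accepts some inputs CF would reject. The quoting rule mirrors the csvFormat() function posted in this thread: empty and numeric values pass through, everything else is wrapped in quotes with embedded quotes doubled.

```java
final class CsvAppend {

    /** Rough stand-in for CF's isNumeric(); an approximation, not an exact match. */
    static boolean isNumeric(String s) {
        try {
            Double.parseDouble(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    /** Appends one field to buf without creating an intermediate String. */
    static void appendField(StringBuilder buf, String value) {
        if (value.length() == 0 || isNumeric(value)) {
            buf.append(value);
            return;
        }
        buf.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                buf.append('"'); // double every embedded quote
            }
            buf.append(c);
        }
        buf.append('"');
    }

    /** Appends a comma-separated record followed by a newline. */
    static void appendRow(StringBuilder buf, String[] fields) {
        for (int i = 0; i < fields.length; i++) {
            if (i > 0) {
                buf.append(',');
            }
            appendField(buf, fields[i]);
        }
        buf.append('\n');
    }
}
```

From CFML the same idea would mean creating one java.lang.StringBuffer (or StringBuilder) up front and passing it into the formatting function as an argument.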
----- Excess quoted text cut - see Original Post for more -----
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306591
----- Excess quoted text cut - see Original Post for more -----
I could do that but as I mentioned, the file writing is not the problem.
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306589
----- Excess quoted text cut - see Original Post for more -----
This is only being done on a per field basis, so it's not a manipulation
being done on a large scale. At least not on a large string. It's being
done on the individual fields.
I suspect that the largest string of data being dealt with by the
csvFormat() function is 50 characters.
--
Rick Root
New Brian Vander Ark Album, songs in the music player and cool behind the
scenes video at www.myspace.com/brianvanderark
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306587
Didn't look at the code, eh?
----- Excess quoted text cut - see Original Post for more -----
Author: Brian Kotek
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306583
Use a Java StringBuffer or StringBuilder. Concatenating large strings in CF
is always a memory hog because every single concatenation creates a new
String instance. Check RIAForge, there are CFC libraries that wrap using
these Java classes for exactly this purpose. You'll find memory usage drops
dramatically.
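A minimal sketch of the difference Brian describes, written in plain Java rather than CFML. It is not his code, and it makes no claim about how ColdFusion implements & or cfsavecontent internally, which this thread never establishes. The names `BuildDemo`, `withConcat` and `withBuilder` are made up for the example.

```java
final class BuildDemo {

    /** Naive version: each += copies the whole accumulated text into a new String. */
    static String withConcat(String piece, int reps) {
        String s = "";
        for (int i = 0; i < reps; i++) {
            s += piece;
        }
        return s;
    }

    /** Builder version: appends into one growable buffer, converts once at the end. */
    static String withBuilder(String piece, int reps) {
        StringBuilder sb = new StringBuilder(piece.length() * reps);
        for (int i = 0; i < reps; i++) {
            sb.append(piece);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String piece = "Hello World. ";
        int reps = 30000;

        long t0 = System.nanoTime();
        String a = withConcat(piece, reps);
        long t1 = System.nanoTime();
        String b = withBuilder(piece, reps);
        long t2 = System.nanoTime();

        System.out.println("concat:  " + (t1 - t0) / 1000000 + "ms, length " + a.length());
        System.out.println("builder: " + (t2 - t1) / 1000000 + "ms, length " + b.length());
    }
}
```

Both methods return identical text; only the amount of copying differs. StringBuffer has the same append API as StringBuilder with synchronized methods, and was the only choice before Java 5.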
----- Excess quoted text cut - see Original Post for more -----
Author: Tom Chiverton
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306580
> generating the csv (around line 330-340 of the sample code I posted
> earlier) took 62 of the 68 seconds.
Why not output the file all at once, rather than a line at a time (scrap lines
~336 and just keep .append()'ing to a StringBuffer till you're done)?
Also, have you benchmarked the EnquireLookupCFC (line 333)? What does that
do?
--
Tom Chiverton
Author: Gerald Guido
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306579
> "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#"
There is your bottleneck. CF does not like string manipulation on a large
scale. I have tried to parse large text files before, only to watch my dev
box just keel over.
I see two options off the top of my head: let SQL Server do the work or use
Java for the string manipulation.
Last time I had to parse a large text file like this I ended up writing an
ActiveX script for DTS (long time ago).
G
----- Excess quoted text cut - see Original Post for more -----
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306576
> Also, test your page with the user's query but with the output part
> (actually writing the file) commented out... If the page is still slow
> and a huge memory hog then the file stuff above won't help much and
> you'll have to look at running the query in java too, but I bet you'll
> get something by handling the file better.
>
I honestly don't think it's the file writing that's the problem.
I just commented out the fileWrite() statements inside the <cfloop> tags
that would write each line to the file (the file is still being opened and
the "header" row written to it), and it made zero difference at all
in the length of time it took to complete.
The query itself runs quite fast. Returns a lot of rows but isn't a complex
query.
Anyway, I put some cfoutput statements in my gateway (for testing I'm
calling it as a direct cfc call, not as a gateway) that output
now().gettime(), so I can see the ms as the method call progresses.
The query returns its results in less than 4 seconds. The process of
generating the csv (around line 330-340 of the sample code I posted earlier)
took 62 of the 68 seconds.
And that was without actually WRITING the file.
The GOOD news is that a TAB delimited file takes considerably less time (32
seconds vs. 68 seconds), cutting the "output time" from 62 seconds to 26
seconds. Which means the csvFormat() function accounts for a very large
part of the processing time.
This is my csvFormat() function:
<cffunction name="csvFormat" output="false" access="public" returnType="string">
    <cfargument name="str" type="string" required="yes">
    <cfif arguments.str neq "" and not isNumeric(arguments.str)>
        <cfreturn "#Chr(34)##replace(arguments.str,chr(34),"#chr(34)##chr(34)#","ALL")##Chr(34)#">
    <cfelse>
        <cfreturn arguments.str>
    </cfif>
</cffunction>
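For anyone who wants to try the "do the string work in Java" route suggested elsewhere in this thread, here is a minimal sketch of the same quoting rule as the csvFormat() function above. It is not code from the thread: the class name CsvEscapeSketch and the helper looksNumeric() are made up for illustration, and looksNumeric() is only a rough stand-in for CFML's isNumeric(), which accepts more number formats than this regex does.

```java
final class CsvEscapeSketch {
    private CsvEscapeSketch() {}

    // Same rule as csvFormat(): empty and numeric values pass through
    // unchanged; everything else is wrapped in double quotes, with each
    // embedded double quote doubled.
    static String csvFormat(String value) {
        if (value.isEmpty() || looksNumeric(value)) {
            return value;
        }
        StringBuilder sb = new StringBuilder(value.length() + 2);
        sb.append('"');
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (c == '"') {
                sb.append('"'); // double the embedded quote
            }
            sb.append(c);
        }
        sb.append('"');
        return sb.toString();
    }

    // Assumption: a simplified numeric check (optional sign, digits,
    // optional decimal part). CFML's isNumeric() is more permissive.
    static boolean looksNumeric(String value) {
        return value.matches("[+-]?\\d+(\\.\\d+)?");
    }
}
```

Called from CF, this would replace the replace()-plus-concatenation in each csvFormat() call with a single pass over the string. Whether that actually helps in this case would need to be measured.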
Author: Gaulin, Mark
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306570
We had a similar issue with an internal application and we made good
improvements using the standard java classes PrintStream,
FileOutputStream, and BufferedOutputStream to handle the writing to your
file.
Something like this (shown in Java, so you'd have to wrap it properly
with cfscript):
// pick a decent-sized buffer, maybe 100k to start
PrintStream out = new PrintStream(new BufferedOutputStream(
    new FileOutputStream("filename", true), bufferSize));
This will let you do "out.print(...)" and "out.println(...)" from CF,
which I think will be much more efficient than what CF can do. Be sure
to do "out.close()" within the page (so use try/catch to be sure that
the close happens).
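A fuller sketch of that idea, for reference. The class name BufferedCsvWriterSketch, the method writeRows() and the 100k BUFFER_SIZE are placeholders chosen for illustration, not anything from the application discussed here. It uses try-with-resources (Java 7 and later) to guarantee the close; on the 1.5 JVMs mentioned in this thread you would need an explicit try/finally instead.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.util.List;

final class BufferedCsvWriterSketch {
    // Assumption: 100k is only a starting point; tune it by measuring.
    static final int BUFFER_SIZE = 100 * 1024;

    private BufferedCsvWriterSketch() {}

    // Appends each row to the file through a buffered PrintStream.
    static void writeRows(String fileName, List<String> rows) throws IOException {
        try (PrintStream out = new PrintStream(
                new BufferedOutputStream(new FileOutputStream(fileName, true), BUFFER_SIZE))) {
            for (String row : rows) {
                out.print(row);
                out.print("\r\n"); // explicit CRLF instead of the platform default
            }
            // PrintStream swallows IOExceptions; checkError() flushes and
            // reports whether any write failed.
            if (out.checkError()) {
                throw new IOException("write failed: " + fileName);
            }
        } // the stream is closed here even if an exception was thrown
    }
}
```

Because rows go out through the buffer as they are produced, nothing in this sketch needs the whole file in memory at once, provided the caller also feeds it rows in batches rather than as one giant list.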
Also, test your page with the user's query but with the output part
(actually writing the file) commented out... If the page is still slow
and a huge memory hog then the file stuff above won't help much and
you'll have to look at running the query in java too, but I bet you'll
get something by handling the file better.
Thanks
Mark
----- Excess quoted text cut - see Original Post for more -----
Author: Phillip Duba
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306569
I know when I had to do this at a previous job I used ArrayAppend to build
each line in the CSV, but I see you are using the string buffer. I had no
performance diffs at the time, so I just stayed with the CF solution. The
one thing I would look at is not using list functions, but instead using
Array functions and then one ArrayToList at the end. Also, make sure the
queries being executed aren't intensive either. I found in our CSV
generation that for every second the query took, the output took another
second, so my resources were essentially 50/50 between query and output.
Phil
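The "append to an array, join once at the end" pattern looks roughly like this in Java. This is an illustrative sketch only: RowJoinSketch and buildCsv() are made-up names, and the fields are assumed to be already escaped.

```java
import java.util.ArrayList;
import java.util.List;

final class RowJoinSketch {
    private RowJoinSketch() {}

    // Joins each row's fields with commas, collects the lines in a list,
    // and joins the lines once at the end instead of growing one string
    // with repeated concatenation.
    static String buildCsv(List<List<String>> rows) {
        List<String> lines = new ArrayList<>(rows.size());
        for (List<String> row : rows) {
            lines.add(String.join(",", row));
        }
        return String.join("\r\n", lines);
    }
}
```

Note that this still holds the entire output in memory before anything is written, so it addresses concatenation cost rather than the memory footprint of a very large drop.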
----- Excess quoted text cut - see Original Post for more -----
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306567
SQL Server 2005.
I'm open to suggestions. This is part of an application that allows users to
generate CSV files of their own based on their own criteria, so though I'm
open to "non-CF" solutions, I'm not sure there really would be any, except
maybe a homegrown java class to handle the work and be more strict with
memory consumption....
Rick
----- Excess quoted text cut - see Original Post for more -----
Author: Mark Kruger
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306563
Rick,
What's your DB platform? Are you sure there is not a better "non-cf" way to
do it?
Mark A. Kruger, CFG, MCSE
(402) 408-3733 ext 105
www.cfwebtools.com
www.coldfusionmuse.com
www.necfug.com
Author: Rick Root
Short Link: http://www.houseoffusion.com/groups/cf-talk/thread.cfm/threadid:56583#306560
So I've got a problem with generating large csv files.. it's a memory suck.
I do this in an event gateway so that these file drops are generated "in the
background"... here's the gateway code:
http://cfm.pastebin.org/40043
The larger the file drop, the worse the memory suck is. A relatively small
drop of about 7200 rows and 138 columns (just under 1 million pieces of data)
took 68 seconds. In my production environment, I've estimated that I can
generate between 15,000 and 20,000 pieces of data per second using the code
above.
The problem is this drop (which only generates a 5MB file) causes a memory
suck of about 100MB...
Take a look at this output from the server monitor:
www.it.dev.duke.edu/public/temp.rtf
It shows the memory graph generated from two file drops, at 9:38 and 9:45
am... the first one spiked the memory from 70MB to 170MB...the second one
dropped the memory back to about 90MB and then spiked it to 140MB.
Of course, this size drop is not what causes my concern, it's when people
are dropping 10x that amount.. say 80,000 rows at 130 columns. That's over
10 million pieces of data, which would take nearly 9 minutes ASSUMING you
had no memory issues, which I would. Such a drop would basically crash the
instance.
--
Rick Root
New Brian Vander Ark Album, songs in the music player and cool behind the
scenes video at www.myspace.com/brianvanderark
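For reference, the figures in the post above can be checked with a few lines of arithmetic. The class and method names (DropEstimateSketch, cells, secondsAt) are made up for illustration; the inputs are the row counts, column counts and throughput range quoted in the post.

```java
final class DropEstimateSketch {
    private DropEstimateSketch() {}

    // Number of individual values ("pieces of data") in a drop.
    static long cells(long rows, long columns) {
        return rows * columns;
    }

    // Time to emit that many values at a given throughput, in seconds.
    static double secondsAt(long cells, double cellsPerSecond) {
        return cells / cellsPerSecond;
    }
}
```

With the numbers from the post: cells(7200, 138) is 993,600, and over 68 seconds that works out to roughly 14,600 values per second. cells(80000, 130) is 10,400,000, which takes 520 seconds (about 8.7 minutes) at 20,000 per second and about 693 seconds (about 11.6 minutes) at 15,000 per second. So "nearly 9 minutes" corresponds to the optimistic end of the estimated range.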
May 24, 2012