House of Fusion
Mailing list: ColdFusion Talk (CF-Talk)

Checking for a duplicate value question

Author:
Les Mizzell
05/03/2012 12:37 AM

Got an app that sends out email to various lists - CF8.

I'm checking to be sure there aren't any duplicates between the two lists:

req.thisLIST - usually pretty small - ten or so email addresses
req.groupLIST - this is the problem - it could be 500 or more addresses at times.

<cfloop list="#req.thisLIST#" index="i">
    <cfif NOT listFindNoCase(req.groupLIST, i)>
        DO MY STUFF HERE
    </cfif>
</cfloop>

Is there a more efficient way to do the above? If req.groupLIST ends up being a HUGE list - at what point will this choke down and time out?

Author:
Andrew Scott
05/03/2012 01:11 AM

When getting the information out of the database, use DISTINCT on the field.

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Les Mizzell
05/03/2012 12:10 PM

> When getting the information out of the database, use DISTINCT on the field.

Only one list is coming from the database; the other is from a completely different source.

> convert your list to an array and loop over that

The small list is the one that will have the operations done to it. The large list is the one that I need to search for duplicate values. If the server were running CF9 then I could use ArrayFind(), but that's not available in CF8.

Author:
Cameron Childress
05/03/2012 04:30 PM

On Thu, May 3, 2012 at 12:10 PM, Les Mizzell <lesmizz@bellsouth.net> wrote:

> > convert your list to an array and loop over that
>
> The small list is the one that will have the operations done to it. The
> large list is the one that I need to search for duplicate values. If the
> server was running CF9 then I could use ArrayFind but that's not
> available in CF8.

Soooo.... You don't want to convert to an array and loop over that rather than a list? ArrayFind() is a shortcut for looping over an array, but you can also do that manually.

Small lists aren't much of a problem, just big ones. So convert the long list to an array, or maybe even just use the query object from the DB for the long list. Leave the short list as is; it's unlikely to ever cause performance issues at 10 items.

Pseudocode:

loop array=biglist
{
  if (listFind(smalllist, biglistItem))
  {
      smalllist = listDeleteAt(smalllist, listFind(smalllist, biglistItem));
  }
}

Now you have a de-duped small list.

-Cameron
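
[Editor's sketch] Cameron's pseudocode could be written in CF8-compatible CFML roughly as follows. The variable names and sample addresses are hypothetical, and listFindNoCase() is swapped in for listFind() on the assumption that email addresses should match regardless of case.

```cfml
<!--- Hypothetical sample data standing in for req.thisLIST and req.groupLIST --->
<cfset smallList = "new@example.com,Dupe@Example.com">
<cfset groupList = "a@example.com,dupe@example.com,b@example.com">

<!--- Convert the long list to an array once, then loop over the array --->
<cfset bigArray = listToArray(groupList)>

<cfloop array="#bigArray#" index="bigItem">
    <!--- Search only the short list, which stays cheap at around ten items --->
    <cfset pos = listFindNoCase(smallList, bigItem)>
    <cfif pos GT 0>
        <cfset smallList = listDeleteAt(smallList, pos)>
    </cfif>
</cfloop>

<!--- smallList now holds only the addresses that were not found in groupList --->
<cfoutput>#smallList#</cfoutput>
```

The point of the arrangement is that the expensive structure (the long list) is walked exactly once, while every search runs against the short list.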

Author:
.jonah
05/03/2012 04:46 PM

You could also use a query-of-queries with WHERE biglistitem NOT IN (smalllistitems).
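
[Editor's sketch] A query-of-queries version might look like the following. It is a variation on the suggestion above, not the poster's exact query: it uses IN rather than NOT IN, so that it returns the addresses from the short list that already exist in the big result set. The query is built by hand here only to keep the example self-contained; the column name ml_email is borrowed from later in the thread, and everything else is hypothetical.

```cfml
<!--- Hypothetical stand-in for the big query result from the database --->
<cfset rmailLIST = queryNew("ml_email", "varchar")>
<cfset queryAddRow(rmailLIST, 2)>
<cfset querySetCell(rmailLIST, "ml_email", "a@example.com", 1)>
<cfset querySetCell(rmailLIST, "ml_email", "B@example.com", 2)>

<!--- Hypothetical short list of addresses to check --->
<cfset thisList = "b@example.com,c@example.com">

<!--- Query-of-queries comparisons are case sensitive, so lower-case both sides --->
<cfquery name="alreadyThere" dbtype="query">
    SELECT ml_email
    FROM rmailLIST
    WHERE LOWER(ml_email) IN (
        <cfqueryparam value="#lcase(thisList)#" cfsqltype="cf_sql_varchar" list="yes">
    )
</cfquery>

<!--- The match list is at most as long as the short list, so list searches on it are cheap --->
<cfset foundList = valueList(alreadyThere.ml_email)>
<cfoutput>
<cfloop list="#thisList#" index="addr">
    <cfif listFindNoCase(foundList, addr)>
        <p>#addr# was found</p>
    <cfelse>
        <p>#addr# wasn't found</p>
    </cfif>
</cfloop>
</cfoutput>
```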

Author:
Andrew Scott
05/03/2012 05:10 PM

Ever since ColdFusion 6, ColdFusion has always had the feature to do an ArrayFind()....

http://www.andyscott.id.au/2012/5/3/ColdFusion-and-using-ArrayFind-prior-to-ColdFusion-9

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Leigh
05/03/2012 11:25 PM

> Ever since ColdFusion 6, ColdFusion has always had the feature to do an ArrayFind()....

True. Though indexOf() has a few additional nuances compared with arrayFind(). It is both case sensitive and data type sensitive.

-Leigh

Author:
Andrew Scott
05/04/2012 04:49 AM

ArrayFind() is case sensitive; that is why there is an ArrayFindNoCase().

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Leigh
05/04/2012 05:13 AM

> ArrayFind() is case sensitive, that is why there is an ArrayFindNoCase().

Yes, but the indexOf(..) method is _always_ case sensitive. Also, unlike arrayFind/FindNoCase, it is data type sensitive as well, i.e. indexOf(15) does not produce the same results as indexOf("15"). So it is good to be aware of the nuances.

-Leigh

Author:
Andrew Scott
05/04/2012 05:44 AM

Not sure how you tested this, but if you have

<cfset myArray = ["1","2","3","4"] />
<cfoutput>
    #myArray.indexOf(1)+1#
</cfoutput>

it returns the index of the value. Secondly, if you wish to do a case-insensitive search, that is still possible. Let's take the following code, for example, which returns true or false depending on whether the array contains the string:

<cfset myArray = ["myTest","2","3","4"] />
<cfoutput>
    #myArray.indexOf(lcase('mytest'))+1#
    #containsIgnoreCase(myArray, 'myTest1')#
</cfoutput>

<cfscript>
public boolean function containsIgnoreCase(required array arrayList, required string s) {
    var it = arrayList.iterator();
    while(it.hasNext()) {
        if(it.next().equalsIgnoreCase(s))
            return true;
    }
    return false;
}
</cfscript>

So as you can see, it is entirely possible. But what would be better is if ColdFusion used Java as part of its language and was OOP; that would mean you could write a component that extended the ArrayList / Array and override equals() to do the case-insensitive compare.

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Leigh
05/04/2012 06:10 AM

> Not sure how you tested this

Define a variable representing a number as a double:

          <cfset y = val("15")>
          <cfset arr = [y]>

Now run indexOf("15"). The value is not found because, unlike CF functions, Java lists also consider data type when determining element equality.

        15 = #arr.indexOf("15")#  ==> -1 / not found
         y = #arr.indexOf(y)#  ==> 0 / found

> It returns the index of the value, secondly if you wish to do a case
> insenstive index then it is still possible.

Well, my point was not that it is _not_ possible ;-) It was that indexOf is not a straight equivalent of CF9's array functions. So before using it, be aware of these two differences.

> public boolean function containsIgnoreCase(required array arrayList,
> var it = arrayList.iterator();
> while(it.hasNext()) {

Honestly, I do not see the benefit of going down to the Java level for this. If you are going to end up looping anyway, may as well use a native cfloop array="...". It is

Author:
Leigh
05/04/2012 06:16 AM

Grr.. my responses keep getting cut off. Anyway, the very last sentence was:

"Honestly, I do not see the benefit of going down to the Java level for this. If you are going to end up looping anyway, may as well use a native cfloop array="...". It is simpler and is less code".

-Leigh </testWhereResponseIsCutOff
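
[Editor's sketch] Leigh's suggestion of a plain native loop could be wrapped in a small CF8-compatible helper like the one below. The function name arrayFindNoCaseCF8 is hypothetical (chosen so it does not clash with the built-in on CF9 and later), and the sketch assumes the array holds only simple values. It uses an index loop rather than cfloop array="..." because the position of the match has to be returned. Since compareNoCase() compares both sides as strings, the data type mismatch described earlier in the thread does not arise.

```cfml
<!--- Hypothetical helper: 1-based position of value in arr, or 0 when absent --->
<cffunction name="arrayFindNoCaseCF8" returntype="numeric" output="false">
    <cfargument name="arr" type="array" required="true">
    <cfargument name="value" type="string" required="true">
    <cfset var i = 0>
    <cfloop from="1" to="#arrayLen(arguments.arr)#" index="i">
        <!--- compareNoCase() returns 0 when the two strings match, ignoring case --->
        <cfif compareNoCase(arguments.arr[i], arguments.value) EQ 0>
            <cfreturn i>
        </cfif>
    </cfloop>
    <cfreturn 0>
</cffunction>

<!--- Usage with hypothetical sample data --->
<cfset emails = listToArray("a@example.com,Dupe@Example.com")>
<cfif arrayFindNoCaseCF8(emails, "dupe@example.com") GT 0>
    <p>dupe@example.com was found</p>
</cfif>
```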

Author:
Leigh
05/04/2012 06:47 AM

> The point is that it is still possible...

{tap, tap} Is this thing on?

-Leigh

Author:
Andrew Scott
05/04/2012 06:57 AM

Leigh, I am not going to get into any more of a debate over this. The point that I made is that you can use Java to do the ArrayFind() in ColdFusion 6 - 8. Although you have brought up a good point, any decent developer can run with this and write an ArrayFind() to do the same thing.

It took me a whole 2 minutes to write an ArrayFind(array, object, [case]) that can do the same thing as ColdFusion 9's ArrayFind() and ArrayFindNoCase(). I will stand by the fact that just because it is not in an older version doesn't mean you can't do it. Your point about the original indexOf() is very valid, but I maintain that it is doable if one thinks outside the box.

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Leigh
05/04/2012 09:02 AM

Andrew - Sorry if you misunderstood me. It is not a debate. It is a simple explanation of how the method actually works.

As the earlier example demonstrated, indexOf does not behave in a typeless manner, as most would expect from CF and as arrayFind/FindNoCase do. So it does quite nicely for case-sensitive string searches. But if you fail to account for its detection of data types, the results for numbers, dates, et cetera will frequently be wrong. So, like any other function, it is important to understand how it operates so you can use it properly and avoid common gotchas.

-Leigh

Author:
Andrew Scott
05/04/2012 09:07 AM

Yeah, I know, which is why I agree with the point you made. I have created a new blog post where I have taken the code for indexOf() from Java and modified it to work in this manner for ColdFusion.

-- 
Regards,
Andrew Scott
WebSite: http://www.andyscott.id.au/
Google+: http://plus.google.com/108193156965451149543

Author:
Les Mizzell
05/04/2012 06:44 PM

Thanks for this post. I've not had time to get back into this yet, but will run some experiments against the data I've got to see what actually works best.

Worst case would be three or four email addresses being searched for in a result list of 4000 or so returned from the database.

> I have created a new blog post where I have taken the code for indexOf() from java and
> modified it to work in this manner for ColdFusion.
>
> -- Regards, Andrew Scott WebSite: http://www.andyscott.id.au/

Author:
Les Mizzell
05/05/2012 02:41 PM

I did two tests using the basic code below.

First test: exactly like below, using a list.
Second test: modifying the below to build an array instead of a list and checking for a duplicate value in the array.

The query used returned a list of close to 14,000 email addresses....

Average total time in both cases was around 400 milliseconds (from running the code 20 times or so). On average, the list compare method edged out the array by 10 or 20 milliseconds, BUT sometimes the array method won out.

Not sure what I learned from that. The list method is slightly less code and I've got that working already. Think I'll leave it as is. At the most, this type of thing only gets run a few times a day, so it's not constantly chewing server resources.

----------------------------------------

<cfset req.thisLIST = "lesmizz@bellsouth.net,someoneelse@somewhere.com">
<cfset sTime = getTickCount()>

<cfquery name="rmailLIST">BIG QUERY HERE THAT RETURNS 14,000 email addresses</cfquery>
<p>Time after Initial Query: <cfoutput>#getTickCount() - sTime# milliseconds</cfoutput></p>

<cfset req.groupLIST = valueList(rmailLIST.ml_email) />
<p>Time after Build List: <cfoutput>#getTickCount() - sTime# milliseconds</cfoutput></p>

<cfoutput>
<cfloop list="#req.thisLIST#" index="i">
    <cfif IsDefined("i") AND isValid("email", i) AND NOT listFindNoCase(req.groupLIST, i)>
        <p>#i# wasn't found</p>
    <cfelse>
        <p>#i# was found</p>
    </cfif>
</cfloop>
</cfoutput>
<p>Time after CHECK LIST: <cfoutput>#getTickCount() - sTime# milliseconds</cfoutput></p>

> I have created a new blog post where I have taken the code for indexOf() from java and
> modified it to work in this manner for ColdFusion.
> -- Regards, Andrew Scott WebSite: http://www.andyscott.id.au/

Author:
priya23a24
05/06/2012 04:13 PM

Anyway, if you are looping, why don't you add each address as a key of a structure? In that case all the keys are your unique email IDs.
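
[Editor's sketch] The structure idea above might look like this in CF8-compatible CFML. The variable names and sample addresses are hypothetical; in the real application the big list would come from the query rather than a literal. Struct keys in ColdFusion are not case sensitive, so the lcase() calls are only there to make the intent explicit.

```cfml
<!--- Hypothetical sample data standing in for req.groupLIST and req.thisLIST --->
<cfset groupList = "a@example.com,Dupe@Example.com,b@example.com">
<cfset thisList = "dupe@example.com,new@example.com">

<!--- Build the lookup once: every address in the big list becomes a key --->
<cfset groupLookup = structNew()>
<cfloop list="#groupList#" index="addr">
    <cfif len(trim(addr))>
        <cfset groupLookup[lcase(trim(addr))] = true>
    </cfif>
</cfloop>

<!--- Each check is now a key lookup instead of a scan of the whole list --->
<cfoutput>
<cfloop list="#thisList#" index="addr">
    <cfif structKeyExists(groupLookup, lcase(trim(addr)))>
        <p>#addr# was found</p>
    <cfelse>
        <p>#addr# wasn't found</p>
    </cfif>
</cfloop>
</cfoutput>
```

Building the structure still costs one pass over the big list, so this is most likely to pay off when the same big list is checked many times.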

Author:
Cameron Childress
05/03/2012 07:13 AM

As Andrew says, it's best done in the DB before your data becomes a list. If you can't do that for some reason, convert your list to an array and loop over that. Arrays are much faster than long lists. The longer the list/array, the bigger the performance difference.

- Cameron

Author:
Christopher Watson
05/03/2012 12:39 PM

In reading through this thread, it appears as though one of your lists (not clear whether it's the short or long list) is sourced from a database, and the other is not. But if you've got one list of key column values in a CF list variable, then you should be able to query the database to pull out everything NOT IN that list. Something like:

<cfquery name="UniqueRecords" datasource="MyDatasource">
    SELECT DISTINCT SomeColumn
    FROM SomeTable
    WHERE SomeColumn NOT IN (<cfqueryparam cfsqltype="cf_sql_varchar" list="yes" value="#OtherList#">)
</cfquery>

Then ValueList the column from that result, append your other list, and you should have your comprehensive list, with no dupes:

<cfset ComprehensiveList = "#ValueList(UniqueRecords.SomeColumn)#,#OtherList#">

-Christopher

> I'm checking to be sure there are any duplicates between the two lists
> ...
> [other content and sample code clipped]

