|
Mailing Lists
|
Home /
Groups /
ColdFusion Talk (CF-Talk)
Odd database behavior: duplicate key error
Hey all,Josh Nathanson 09/10/07 04:11 P You probably have a race condition, where two people execute the codeBarney Boisvert 09/10/07 04:16 P If this is true - wrap the code in a named lock:Jake Pilgrim 09/10/07 05:01 P Thanks Jake and Barney, I'll lock it and see if that helps.Josh Nathanson 09/10/07 05:05 P Sorry for my ignorance on this,Vince Collins 09/11/07 10:29 A Yes. CFTRANSACTION only creates a transaction for the current threadBrian Kotek 09/11/07 10:46 A Thanks Brian!Vince Collins 09/11/07 11:12 A > Yes. CFTRANSACTION only creates a transaction for the currentDave Watts 09/11/07 02:30 P I can't test it myself to simulate the race condition, but I definitely knowBrian Kotek 09/11/07 03:37 P >>people had tried CFTRANSACTION alone and it didn't make a difference.Claude Schneegans 09/11/07 03:44 P Well that's just it, from what Dave (and Simon's article) are saying, thisBrian Kotek 09/11/07 04:07 P Jake Pilgrim wrote:Jochem van Dieten 09/12/07 08:16 A Brian Kotek wrote:Jochem van Dieten 09/12/07 08:14 A Dave,Jaime Metcher 09/11/07 06:14 P If he is using MS SQL server wouldn't with(tablock holdlock) also needBrad Wood 09/11/07 04:18 P Original poster here...for the record I'm using MySQL 5.Josh Nathanson 09/11/07 04:45 P > According to livedocs cftransaction by default uses theDave Watts 09/12/07 01:19 A No, it's definitely set by the DB. For our Oracle servers, it'sJames Holmes 09/12/07 01:42 A Dave Watts wrote:Jochem van Dieten 09/12/07 08:11 A > Yes, you could use CFtransaction for this type of isolation,Dave Watts 09/12/07 01:27 A > If he is using MS SQL server wouldn't with(tablock holdlock)Dave Watts 09/12/07 01:35 A > I can't test it myself to simulate the race condition, but IDave Watts 09/12/07 01:42 A > They recently changed it to the current wording because IDave Watts 09/12/07 08:51 A Just to follow up on this thread, is the final verdict that you must useBrian Kotek 09/13/07 10:18 A I don't want to cloud this conversation but I have to say that in all myVince Collins 09/13/07 11:18 A Brian Kotek wrote:Jochem van Dieten 09/13/07 12:00 P > Just to follow up on this thread, is the final verdict thatDave Watts 09/13/07 12:21 P Hey all, In an inherited application, we have the following pseudo-code - two queries, one right after the other: <cfquery> DELETE FROM table WHERE ID = '#myidstring#' </cfquery> <cfquery> INSERT INTO table (ID) VALUES ('#myidstring#') </cfquery> Given this scenario, it would seem that you could never have a duplicate key error, since you are deleting any rows with the string in question, before trying to do an insert. However, occasionally this throws a "duplicate key" error. Is it possible that somehow the insert occasionally runs before the delete? Do I need to use cftransaction to ensure the proper order of execution here? You probably have a race condition, where two people execute the code at the "same time", and the order goes like this: A - delete B - delete A - insert B - insert In this case, you'd expect the fourth statement (the second insert) to throw the dupe key exception. cheers, barneyb ----- Excess quoted text cut - see Original Post for more ----- -- Barney Boisvert bboisvert@gmail.com http://www.barneyb.com/ Got Gmail? I have 100 invites. If this is true - wrap the code in a named lock: <cflock name="aUniqueLockName" type="exclusive"> ... your SQL here </cflock> A named lock ensures that only one thread at a time will be allowed to run the code within the cflock tag. CFTransaction will not give you this same level of isolation. Jake Pilgrim ----- Excess quoted text cut - see Original Post for more ----- Thanks Jake and Barney, I'll lock it and see if that helps. -- Josh ----- Excess quoted text cut - see Original Post for more ----- Sorry for my ignorance on this, I would have suggested using cftransaction. However, now, I'm thinking that cftransaction may just ensure that that particular "thread" of commits are done in order but it doesn't prevent other "threads'" from colliding with the database which is why cflock is used? Is that correct? Vince Jake Pilgrim wrote: ----- Excess quoted text cut - see Original Post for more ----- Yes. CFTRANSACTION only creates a transaction for the current thread (actually the current database connection from the connection pool). If multiple threads could be running the same process at the same time, resulting in a race condition, then you need to lock that set of queries as well. Use a named lock so that it only applies to that block of code, or move the code into a stored procedure which should allow the database itself to manage any concurrency issues. ----- Excess quoted text cut - see Original Post for more ----- Thanks Brian! Brian Kotek wrote: ----- Excess quoted text cut - see Original Post for more ----- ----- Excess quoted text cut - see Original Post for more ----- This is not correct at all, according to my understanding. CFTRANSACTION creates a database transaction, which uses either locking or MVCC (depending on the specifics of the database used) to prevent concurrent access to database objects that would cause data integrity problems. It doesn't just affect the current thread; it affects any attempt to simultaneously manipulate the same database objects - not just from within other threads in CF, but even from other database clients. The default isolation level for CFTRANSACTION is SERIALIZABLE, if I recall correctly. http://en.wikipedia.org/wiki/Isolation_(computer_science) In short, CFTRANSACTION will not only prevent race conditions within a series of related queries across multiple requests, it should be used instead of CFLOCK for this purpose. And, moving the code into a stored procedure, by itself, would make absolutely no difference, unless you also placed transactional logic within it - which would have the same effect as using CFTRANSACTION. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! I can't test it myself to simulate the race condition, but I definitely know that people have been talking about this issue for a very long time. I can remember reading posts about it going back for years, and in those cases, people had tried CFTRANSACTION alone and it didn't make a difference. When they added a named lock, the problem was resolved. So, I'm not sure if the behavior is different between different RDBMSs, or if the isolation level of the transaction makes some difference, but there have definitely been cases in the past where people did not get the desired result with CFTRANSACTION alone. I've read of others having a similar issue where inserting and then trying to select the last inserted value within a CFTRANSACTION block would still return the incorrect id if another thread had also performed an insert at the same time. A quick Google search reveals this article by Simon, where he seems to indicate that setting the isolation level to "serializable" should work. Maybe that is the key. It would be worth it to try that out before adding the named lock. You know more about databases than I do Dave, does that sound like it would make a difference? ----- Excess quoted text cut - see Original Post for more ----- >>people had tried CFTRANSACTION alone and it didn't make a difference. CFTRANSACTION won't help for race condition problems, it only handles rollback if a problem happens during execution. But two updates can still compete one against another without causing any execution error. CFLOCK will handle race condition. -- _______________________________________ REUSE CODE! Use custom tags; See http://www.contentbox.com/claude/customtags/tagstore.cfm (Please send any spam to this address: piegeacon@internetique.com) Thanks. Well that's just it, from what Dave (and Simon's article) are saying, this is not the case. CFTRANSACTION (with the appropriate level of isolation) should handle concurrency across threads as well as handling rollbacks. I'd like to confirm that this is true, because if it is, my whole understanding of CFTRANSACTION was incorrect and things would be a lot easier without having to add locks manually. ----- Excess quoted text cut - see Original Post for more ----- Yes, you could use CFtransaction for this type of isolation, but this can also cause problems that you wouldn't see with cflock. Cftransaction isoloation='serializabile' for example will give you an exclusive lock to all tables within the transaction (most restrictive level of isloation). Depending on what you're doing, this can be desirable or problematic as this can easily cause database deadlocks which are almost always highly frustrating to the end user, and are usually equally frustrating to debug. Cflock will not (directly) cause database deadlocks, but can be problematic if you have a large number of requests that are going to be using the cflock'ed code as a lock will wait (until timeout) for the locked resource to be released. So potential database lock error vs potential cflock timeout error - the choice is yours. :) Jake Pilgrim wrote: > Yes, you could use CFtransaction for this type of isolation, but this can also cause problems that you wouldn't see with cflock. Cftransaction isoloation='serializabile' for example will give you an exclusive lock to all tables within the transaction (most restrictive level of isloation). No it doesn't. Serializability does not require some sort of locking, it requires that parallel transactions will have the same outcome as if they were executed one after another. Locking is an implementation detail, not some purpose. MVCC based databases exploit the difference between mathematical serializability and serializability according to the SQL standard to make sure that exclusive locking is not necessary: no objects will be exclusively locked just because they participate in a serializable transaction. And for instance MS SQL Server uses intend locks on tables (which are not exclusive) combined with exclusive locks on rows (only to be escalated when a transaction has taken 1250 locks). Jochem Brian Kotek wrote: > I can't test it myself to simulate the race condition, but I definitely know > that people have been talking about this issue for a very long time. I can > remember reading posts about it going back for years, and in those cases, > people had tried CFTRANSACTION alone and it didn't make a difference. When > they added a named lock, the problem was resolved. So, I'm not sure if the > behavior is different between different RDBMSs It is. > or if the isolation level of the transaction makes some difference It does. > but there have definitely been cases > in the past where people did not get the desired result with CFTRANSACTION > alone. I've read of others having a similar issue where inserting and then > trying to select the last inserted value within a CFTRANSACTION block would > still return the incorrect id if another thread had also performed an insert > at the same time. Semantics: the database always correctly returns the last inserted ID. The objective of preventing another thread from inserting another ID between the insertion and the select in the thread with cftransaction just wasn't met. > A quick Google search reveals this article by Simon, where he seems to > indicate that setting the isolation level to "serializable" should work. For the last inserted ID issue it should. Jochem Dave, According to livedocs cftransaction by default uses the default isolation level of the database - which in most installations will be either read committed or repeatable read. What you say is true if one sets the isolation level explicitly to serializable. However (I'm sure you know this but it's worth saying), it's hard to overstate the crippling effect on database throughput that serializable transactions can have. I would recommend that even for lightly loaded sites it's worth the extra effort to put a locking strategy in place using cflock, and save serializable transactions for "stop the world" database maintenance tasks. Jaime Metcher ----- Excess quoted text cut - see Original Post for more ----- If he is using MS SQL server wouldn't with(tablock holdlock) also need to be used on the delete AS WELL as a transaction to actually prevent another process from inserting into the table until the transaction was complete. It is my understanding that with(holdlock) would be necessary for the lock to be maintained until the transaction was committed. Furthermore wouldn't the DB attempt to only acquire an exclusive lock on the rows being modified? If course, depending of your lock escalation threshold, the row locks may have been escalated to a table lock anyway. Regardless, with(tablock) would be the only way to ensure an exclusive lock was placed on the entire table. A named cflock would help but ONLY if CF code on ONE server was the only thing accessing that table in the database. Code from another machine, or a SQL Agent job would not care about a named cflock. ~Brad ----- Excess quoted text cut - see Original Post for more ----- This is not correct at all, according to my understanding. CFTRANSACTION creates a database transaction, which uses either locking or MVCC (depending on the specifics of the database used) to prevent concurrent access to database objects that would cause data integrity problems. It doesn't just affect the current thread; it affects any attempt to simultaneously manipulate the same database objects - not just from within other threads in CF, but even from other database clients. The default isolation level for CFTRANSACTION is SERIALIZABLE, if I recall correctly. http://en.wikipedia.org/wiki/Isolation_(computer_science) In short, CFTRANSACTION will not only prevent race conditions within a series of related queries across multiple requests, it should be used instead of CFLOCK for this purpose. And, moving the code into a stored procedure, by itself, would make absolutely no difference, unless you also placed transactional logic within it - which would have the same effect as using CFTRANSACTION. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! Original poster here...for the record I'm using MySQL 5. Update: I added the named lock and that squashed the problem completely. In my case there is some additional code involved rather than just back-to-back database updates, PLUS the possibility of some shared scope issues, AND we are on a single server, so I think the named lock is the best option for me in this case. I will definitely keep cftransaction in mind, if I have a situation which seems to merit it more definitively. -- Josh > According to livedocs cftransaction by default uses the > default isolation level of the database - which in most > installations will be either read committed or repeatable > read. This has come up before; I believe that this may be a documentation error. I recall receiving an email about this from Macromedia tech support saying that the default isolation level for CFTRANSACTION was SERIALIZABLE, but since I'm away from my email archive I can't get it now. I'll try to follow up on this when I return. But in any case, in the majority of instances where people use CFTRANSACTION, READ_COMMITTED would be sufficient to prevent data integrity problems/logical errors/race conditions, and it would still be preferable to using CFLOCK. ----- Excess quoted text cut - see Original Post for more ----- When you want to prevent any concurrent access to database objects, you want to use serializable transactions. Of course, like any sort of locking, the more restrictive the lock the greater the potential bottleneck, but serializable transactions do have an important place in many applications - and not just for maintenance purposes. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! No, it's definitely set by the DB. For our Oracle servers, it's read-committed. I've checked this with code to make sure (I can do so again if no-one believes me :-) Regardless, Dave's points still hold - cftransaction works across all DB sessions, across all connections. What it you have two CF servers hitting the DB at the same time? CFLOCK won't help. What if someone is using an entirely different server system against your DB at the same time? etc. Unless the app is under extreme load, a read-committed transaction will generally do the job. ----- Excess quoted text cut - see Original Post for more ----- -- mxAjax / CFAjax docs and other useful articles: http://www.bifrost.com.au/blog/ Dave Watts wrote: ----- Excess quoted text cut - see Original Post for more ----- They recently changed it to the current wording because I filed a bug against the previous wording that it used serializable. Jochem ----- Excess quoted text cut - see Original Post for more ----- Databases provide concurrency control for exactly this purpose. If I want to control concurrent access within the database, I want to use the functionality that the database provides. It's there for a reason. Recommending the synchronization of application code to solve database concurrency problems goes against best practices for database applications as well as common sense. If you can use a less restrictive isolation level, of course, you should. As for deadlocks, these don't just happen - they require a certain level of complexity within your transactional logic. You need to have conflicting transactions which access the same resources in different order. For example, transaction 1 locks table A and needs to then lock table B, while transaction 2 locks table B and needs to then lock table A. This isn't as common as you imply. In any case, most DBMSs simply kill one of the two transactions to resolve the deadlock, and I don't see how that's any worse than a lock timeout from the end user's perspective - they both result in "no workie". Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! > If he is using MS SQL server wouldn't with(tablock holdlock) > also need to be used on the delete AS WELL as a transaction > to actually prevent another process from inserting into the > table until the transaction was complete. Not if you specify a sufficiently restrictive isolation level. > Furthermore wouldn't the DB attempt to only acquire an > exclusive lock on the rows being modified? Again, this depends on the isolation level, among other things. > A named cflock would help but ONLY if CF code on ONE server > was the only thing accessing that table in the database. > Code from another machine, or a SQL Agent job would not care > about a named cflock. Exactly! This is why databases provide concurrency control. This is why we rely on that concurrency control instead of writing our own. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! ----- Excess quoted text cut - see Original Post for more ----- This will depend on the isolation level of the transaction and the default concurrency control behavior of your database. > A quick Google search reveals this article by Simon, where he > seems to indicate that setting the isolation level to > "serializable" should work. Maybe that is the key. It would be > worth it to try that out before adding the named lock. You know > more about databases than I do Dave, does that sound like it > would make a difference? Yes, using SERIALIZABLE should prevent the problem you're describing. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! > They recently changed it to the current wording because I > filed a bug against the previous wording that it used > serializable. Ah! So I remembered a minor detail, but was completely wrong about the important part. Thanks for setting me straight! Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information! Just to follow up on this thread, is the final verdict that you must use cflock to prevent these kinds of concurrency issues in the database, UNLESS you use CFTRANSACTION with the serializable isolation level? ----- Excess quoted text cut - see Original Post for more ----- I don't want to cloud this conversation but I have to say that in all my years of development, I've never had a database collision using ColdFusion. I put my multiple database calls into cftransaction and I use cflock when I'm trying to access a client's session or cookie. I've never put a single cflock around my cftransactions and I've been fine. Maybe, according to some theories on this thread, a better way to put it is that I've been lucky but I'm not so sure. I allow the database to control this which is what my MS SQL Servers are designed for. Maybe other databases are problematic? Maybe my applications aren't as busy? I'm really trying to understand the logic behind the need to use cflock in place of or in addition to cftransaction. Vince Collins http://www.vincentcollins.com Brian Kotek wrote: > Just to follow up on this thread, is the final verdict that you must use > cflock to prevent these kinds of concurrency issues in the database, UNLESS > you use CFTRANSACTION with the serializable isolation level? There may be specific cases where the above holds true, but there is no general answer to that. Databases are just too different. Configuration parameters are just too different. Queries are just too different. Jochem > Just to follow up on this thread, is the final verdict that > you must use cflock to prevent these kinds of concurrency > issues in the database, UNLESS you use CFTRANSACTION with the > serializable isolation level? That's not my final verdict, for what that's worth; depending on the specific queries and the granularity of locks (in databases that use locking) you may well be able to use REPEATABLE_READ or even READ_COMMITTED. But if you need a serializable transaction, you shouldn't be afraid to use one. Dave Watts, CTO, Fig Leaf Software http://www.figleaf.com/ Fig Leaf Software provides the highest caliber vendor-authorized instruction at our training centers in Washington DC, Atlanta, Chicago, Baltimore, Northern Virginia, or on-site at your location. Visit http://training.figleaf.com/ for more information!
|
March 21, 2010
|
Latest Fusion Authority Articles
|
|||||||||||||||||||||||||||||||||||||||||||||||||||||||