You are viewing a plain text version of this content. The canonical link for it is here.

Posted to derby-user@db.apache.org by David Van Couvering <Da...@Sun.COM> on 2005/07/19 18:23:10 UTC

[Fwd: Re: locking issues]

Hi, all.  We have an internal user who is having deadlock problems.  Can 
anybody on the group help Eric out here?

Eric, please do subscribe to derby-user@db.apache.org by sending an 
email to derby-user-subscribe@db.apache.org, this is a very helpful list.

Thanks,

David

-------- Original Message --------
Subject: Re: locking issues
Date: Tue, 19 Jul 2005 12:02:01 +0200
From: Eric Lemoine <Er...@Sun.COM>

Hi Olav,


On Tue, Jul 19, 2005 at 09:36:20AM +0200, Olav Sandstaa wrote:
> See comments inline.
> 
> Eric Lemoine <Er...@Sun.COM> wrote:
> > 2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID = 15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID = AC10C102.OE9F-448669991195791140{7}), Cleanup action starting
> > 2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID = 15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID = AC10C102.OE9F-448669991195791140{7}), Failed Statement is: DELETE FROM hourlymeasures WHERE time < 1121681048 AND mo IN (SELECT id FROM managedobjects WHERE node=1)
> > ERROR 40001: A lock could not be obtained due to a deadlock, cycle of locks & waiters is:
> > Lock : ROW, HOURLYMEASURES, (3,229)
> >   Waiting XID : {15355871, U} , GANYMEDE, DELETE FROM hourlymeasures WHERE time < 1121681048 AND mo IN (SELECT id FROM managedobjects WHERE node=1)
> >   Granted XID : {15345560, X}
> > Lock : ROW, HOURLYMEASURES, (78,164)
> >   Waiting XID : {15345560, U} , GANYMEDE, DELETE FROM hourlymeasures WHERE time < 1121681045 AND mo IN (SELECT id FROM managedobjects WHERE node=3)
> >   Granted XID : {15355871, U}
> > The selected victim is XID : 15355871
> > ---
> > 
> > I understand from this trace that that XID 15355871 is trying to grab
> > a lock on entry (3,229), which is already grabbed by XID 15345560.
> > And, at the same time, XID 15345560 is trying to grab a lock on entry
> > (78,164), which is already grabbed by XID 15355871. Is my
> > understanding correct?
> 
> Yes.
> 
> > Why does the Derby code let a transaction try to grab a lock while it
> > already holds on another one? Reading the Developper's Guide I
> > understand that the transaction should release (or downgrade it to
> > "shared" [*]) an update lock on a row when going to the next row. To
> > me this is in contradiction with the above trace. Can anyone shed
> > so light here?
> 
> A transaction will keep all its lock until it commits, at least all
> its exclusive locks. Since you are using Uncommitted Read isolation
> level the transaction should release the update locks when it has
> decided that it will not do an update to the record as you said. 
> 
> In the example above you see one transaction (15355871) that holds one
> Update lock while it tries to grab one more Update lock. So why didn't
> this transaction not release the first Update lock before trying to
> grab a new Update lock?
> I am not sure of the exact answer, but I think
> this might have to do with that Derby will keep the first Update lock
> until it actually have acquired the next lock (and then release it).
> (I am sending a CC to two other members in the team who might have a
> better answer to this.)

That's precisely my concern. If another transaction grabs the same
two Update locks in the reverse order, this is deadlock-prone.

To me this deadlock looks like a bug in the Derby server. I, as
a JDBC user, have no control on how row locking is done.

> I hope this helped. I will try to get back to you with a more detailed
> answer when I have it.

Thanks a lot Olav,

-- 
Eric

Re: [Fwd: Re: locking issues]

Posted by David Van Couvering <Da...@Sun.COM>.

Thanks, Mike, forwarding your request to Eric.

Mike Matrigali wrote:

> Is there any chance to get more info on reproducing the problem.
> 
> Best case would be a copy of the database, with the tables populated as
> they appeared when the problem happened.
> 
> If the data is sensitive or too big, then at least the ddl involved in
> creating the tables involved and what indexes exist is necessary to
> understand what locking is going on.
> 
> Getting a query plan may be necessary also.
> 
> Is it definite that both sessions are using read uncommitted isolation
> level?  Is autocommit being used?
> 
> 
> 
> David Van Couvering wrote:
> 
>>Hi, all.  We have an internal user who is having deadlock problems.  Can
>>anybody on the group help Eric out here?
>>
>>Eric, please do subscribe to derby-user@db.apache.org by sending an
>>email to derby-user-subscribe@db.apache.org, this is a very helpful list.
>>
>>Thanks,
>>
>>David
>>
>>-------- Original Message --------
>>Subject: Re: locking issues
>>Date: Tue, 19 Jul 2005 12:02:01 +0200
>>From: Eric Lemoine <Er...@Sun.COM>
>>
>>Hi Olav,
>>
>>
>>On Tue, Jul 19, 2005 at 09:36:20AM +0200, Olav Sandstaa wrote:
>>
>>
>>>See comments inline.
>>>
>>>Eric Lemoine <Er...@Sun.COM> wrote:
>>>
>>>>2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID =
>>>
>>>15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID =
>>>AC10C102.OE9F-448669991195791140{7}), Cleanup action starting
>>>
>>>>2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID =
>>>
>>>15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID =
>>>AC10C102.OE9F-448669991195791140{7}), Failed Statement is: DELETE FROM
>>>hourlymeasures WHERE time < 1121681048 AND mo IN (SELECT id FROM
>>>managedobjects WHERE node=1)
>>>
>>>>ERROR 40001: A lock could not be obtained due to a deadlock, cycle
>>>
>>>of locks & waiters is:
>>>
>>>>Lock : ROW, HOURLYMEASURES, (3,229)
>>>>  Waiting XID : {15355871, U} , GANYMEDE, DELETE FROM hourlymeasures
>>>
>>>WHERE time < 1121681048 AND mo IN (SELECT id FROM managedobjects WHERE
>>>node=1)
>>>
>>>>  Granted XID : {15345560, X}
>>>>Lock : ROW, HOURLYMEASURES, (78,164)
>>>>  Waiting XID : {15345560, U} , GANYMEDE, DELETE FROM hourlymeasures
>>>
>>>WHERE time < 1121681045 AND mo IN (SELECT id FROM managedobjects WHERE
>>>node=3)
>>>
>>>>  Granted XID : {15355871, U}
>>>>The selected victim is XID : 15355871
>>>>---
>>>>
>>>>>I understand from this trace that that XID 15355871 is trying to grab
>>>>
>>>>a lock on entry (3,229), which is already grabbed by XID 15345560.
>>>>And, at the same time, XID 15345560 is trying to grab a lock on entry
>>>>(78,164), which is already grabbed by XID 15355871. Is my
>>>>understanding correct?
>>>
>>>Yes.
>>>
>>>
>>>>Why does the Derby code let a transaction try to grab a lock while it
>>>>already holds on another one? Reading the Developper's Guide I
>>>>understand that the transaction should release (or downgrade it to
>>>>"shared" [*]) an update lock on a row when going to the next row. To
>>>>me this is in contradiction with the above trace. Can anyone shed
>>>>so light here?
>>>
>>>A transaction will keep all its lock until it commits, at least all
>>>its exclusive locks. Since you are using Uncommitted Read isolation
>>>level the transaction should release the update locks when it has
>>>decided that it will not do an update to the record as you said.
>>>In the example above you see one transaction (15355871) that holds one
>>>Update lock while it tries to grab one more Update lock. So why didn't
>>>this transaction not release the first Update lock before trying to
>>>grab a new Update lock?
>>>I am not sure of the exact answer, but I think
>>>this might have to do with that Derby will keep the first Update lock
>>>until it actually have acquired the next lock (and then release it).
>>>(I am sending a CC to two other members in the team who might have a
>>>better answer to this.)
>>
>>
>>That's precisely my concern. If another transaction grabs the same
>>two Update locks in the reverse order, this is deadlock-prone.
>>
>>To me this deadlock looks like a bug in the Derby server. I, as
>>a JDBC user, have no control on how row locking is done.
>>
>>
>>>I hope this helped. I will try to get back to you with a more detailed
>>>answer when I have it.
>>
>>
>>Thanks a lot Olav,
>>

Re: [Fwd: Re: locking issues]

Posted by Mike Matrigali <mi...@sbcglobal.net>.

Is there any chance to get more info on reproducing the problem.

Best case would be a copy of the database, with the tables populated as
they appeared when the problem happened.

If the data is sensitive or too big, then at least the ddl involved in
creating the tables involved and what indexes exist is necessary to
understand what locking is going on.

Getting a query plan may be necessary also.

Is it definite that both sessions are using read uncommitted isolation
level?  Is autocommit being used?



David Van Couvering wrote:
> Hi, all.  We have an internal user who is having deadlock problems.  Can
> anybody on the group help Eric out here?
> 
> Eric, please do subscribe to derby-user@db.apache.org by sending an
> email to derby-user-subscribe@db.apache.org, this is a very helpful list.
> 
> Thanks,
> 
> David
> 
> -------- Original Message --------
> Subject: Re: locking issues
> Date: Tue, 19 Jul 2005 12:02:01 +0200
> From: Eric Lemoine <Er...@Sun.COM>
> 
> Hi Olav,
> 
> 
> On Tue, Jul 19, 2005 at 09:36:20AM +0200, Olav Sandstaa wrote:
> 
>> See comments inline.
>>
>> Eric Lemoine <Er...@Sun.COM> wrote:
>> > 2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID =
>> 15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID =
>> AC10C102.OE9F-448669991195791140{7}), Cleanup action starting
>> > 2005-07-18 10:04:42.945 GMT Thread[DRDAConnThread_3,5,main] (XID =
>> 15355871), (SESSIONID = 6), (DATABASE = Ganymede), (DRDAID =
>> AC10C102.OE9F-448669991195791140{7}), Failed Statement is: DELETE FROM
>> hourlymeasures WHERE time < 1121681048 AND mo IN (SELECT id FROM
>> managedobjects WHERE node=1)
>> > ERROR 40001: A lock could not be obtained due to a deadlock, cycle
>> of locks & waiters is:
>> > Lock : ROW, HOURLYMEASURES, (3,229)
>> >   Waiting XID : {15355871, U} , GANYMEDE, DELETE FROM hourlymeasures
>> WHERE time < 1121681048 AND mo IN (SELECT id FROM managedobjects WHERE
>> node=1)
>> >   Granted XID : {15345560, X}
>> > Lock : ROW, HOURLYMEASURES, (78,164)
>> >   Waiting XID : {15345560, U} , GANYMEDE, DELETE FROM hourlymeasures
>> WHERE time < 1121681045 AND mo IN (SELECT id FROM managedobjects WHERE
>> node=3)
>> >   Granted XID : {15355871, U}
>> > The selected victim is XID : 15355871
>> > ---
>> > > I understand from this trace that that XID 15355871 is trying to grab
>> > a lock on entry (3,229), which is already grabbed by XID 15345560.
>> > And, at the same time, XID 15345560 is trying to grab a lock on entry
>> > (78,164), which is already grabbed by XID 15355871. Is my
>> > understanding correct?
>>
>> Yes.
>>
>> > Why does the Derby code let a transaction try to grab a lock while it
>> > already holds on another one? Reading the Developper's Guide I
>> > understand that the transaction should release (or downgrade it to
>> > "shared" [*]) an update lock on a row when going to the next row. To
>> > me this is in contradiction with the above trace. Can anyone shed
>> > so light here?
>>
>> A transaction will keep all its lock until it commits, at least all
>> its exclusive locks. Since you are using Uncommitted Read isolation
>> level the transaction should release the update locks when it has
>> decided that it will not do an update to the record as you said.
>> In the example above you see one transaction (15355871) that holds one
>> Update lock while it tries to grab one more Update lock. So why didn't
>> this transaction not release the first Update lock before trying to
>> grab a new Update lock?
>> I am not sure of the exact answer, but I think
>> this might have to do with that Derby will keep the first Update lock
>> until it actually have acquired the next lock (and then release it).
>> (I am sending a CC to two other members in the team who might have a
>> better answer to this.)
> 
> 
> That's precisely my concern. If another transaction grabs the same
> two Update locks in the reverse order, this is deadlock-prone.
> 
> To me this deadlock looks like a bug in the Derby server. I, as
> a JDBC user, have no control on how row locking is done.
> 
>> I hope this helped. I will try to get back to you with a more detailed
>> answer when I have it.
> 
> 
> Thanks a lot Olav,
>