You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@oozie.apache.org by Brandon Vanhoy <br...@yahoo.com> on 2012/02/15 18:26:09 UTC

Re: Some lock issue is occuring.

Was there any resolution to this issue?  

It looks like the same thing happened to me this morning.  
We have a job that runs once a day, and this morning workflow never left the 'PREP' state, and the coordinator show's the workflow as failed.
I ended up killing the failed coordinator and restarting it (so things look like they are back to normal).  

However, I'm wondering if this will happen again, and if so, how often?
We're using cdh3u2 (so, Oozie 2.3.2-u2).
Any help would be appreciated.

Below is error that was reported in oozie.log:

2012-02-15 01:02:41,753 ERROR StartCommand:522 - USER[oozie] GROUP[users] TOKEN[] APP[comps_primary_workflow] JOB[0001668-111221102141548-oozie-oozi-W] ACTION[0000639-111221102141548-oozie-oozi-C@20] XException, org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, p
roto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
        at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:823)
        at org.apache.oozie.store.WorkflowStore.updateWorkflow(WorkflowStore.java:197)
        at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:199)
        at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
        at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
        at org.apache.oozie.command.Command.call(Command.java:202)
        at org.apache.oozie.DagEngine.start(DagEngine.java:184)
        at org.apache.oozie.DagEngine.submitJob(DagEngine.java:98)
        at org.apache.oozie.command.coord.CoordActionStartCommand.call(CoordActionStartCommand.java:157)
        at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:211)
        at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:45)
        at org.apache.oozie.command.Command.call(Command.java:202)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:132)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: <openjpa-1.2.1-r752877:753278 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException: Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
FailedObject: UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [java.lang.String]
        at org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4222)
        at org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4197)
        at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:102)
        at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:72)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:515)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:440)
        at org.apache.openjpa.kernel.ExpressionStoreQuery$DataStoreExecutor.executeUpdate(ExpressionStoreQuery.java:694)
        at org.apache.openjpa.kernel.QueryImpl.update(QueryImpl.java:1039)
        at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:852)
        at org.apache.openjpa.kernel.QueryImpl.updateAll(QueryImpl.java:888)
        at org.apache.openjpa.kernel.DelegatingQuery.updateAll(DelegatingQuery.java:581)
        at org.apache.openjpa.persistence.QueryImpl.executeUpdate(QueryImpl.java:338)
        at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:202)
        at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:197)
        at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:808)
        ... 15 more
Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001]
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.wrap(LoggingConnectionDecorator.java:192)
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.access$700(LoggingConnectionDecorator.java:57)
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.executeUpdate(LoggingConnectionDecorator.java:866)
        at org.apache.openjpa.lib.jdbc.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:269)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreManager$CancelPreparedStatement.executeUpdate(JDBCStoreManager.java:1586)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:671)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:513)
        ... 25 more



________________________________
 From: Mohammad Islam <mi...@yahoo.com>
To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org> 
Sent: Friday, January 20, 2012 12:13 AM
Subject: Re: Some lock issue is occuring.
 
Did you recently upgrade oozie?
Can you please submit a new job and try to kill it

In which log file did you get it? looks like it is not in oozie.log. May be catalina.out ot localhsot*
Regards,
Mohammad



----- Original Message -----
From: Harshal <ha...@komli.com>
To: oozie-users@incubator.apache.org
Cc: 
Sent: Thursday, January 19, 2012 11:40 PM
Subject: Re: Some lock issue is occuring.

Also,

Error: E0607 : E0607: Other error in operation
[getActionsForCoordinatorJob], An optimistic lock violation was detected
when flushing object instance "A lock could not be obtained within the
time requested [java.lang.String]" to the data store.  This indicates
that the object was concurrently modified in another transaction.



On Fri, 2012-01-20 at 12:53 +0530, Harshal wrote:
> What do you mean by this? I am not able to kill any of my jobs.
> 
> Error: E0607 : E0607: Other error in operation
> [updateCoordinatorAction], Unable to obtain an object lock on "A lock
> could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, COORD_ACTIONS, (336,8)   Waiting XID : {312103, U} , SA,
> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312125, X}
> Lock : ROW, COORD_ACTIONS, (67,8)   Waiting XID : {312125, S} , SA,
> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312103,
> X}  . The selected victim is XID : 312103. {prepstmnt 34907896 UPDATE
> COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url = NULL,
> created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(int) 68, (String)
> <coordinator-app xmlns="uri:oozie:coordinator:0.1" name="plp-coord" ...,
> (String) <configuration>    <property>
> <name>oozie.coord.application.pat..., (String) , (String)
> <configuration>    <property>      <name>oozie.coord.application.pat...,
> (int) -1, (Timestamp) 2012-01-18 00:04:27.148, (String)
> 0000003-120112014121356-oozie-oozi-C, (Timestamp) 2012-01-20 12:49:01.8,
> (Timestamp) 2012-01-17 15:00:00.0, (String) KILLED, (String)
> 0000003-120112014121356-oozie-oozi-C@68, (String)
> CoordinatorActionBean]} [code=30000, state=40001] [java.lang.String]".
> 
>

Re: Some lock issue is occuring.

Posted by Harshal <ha...@komli.com>.
An easy solution that I found to this was to restart oozie. It will give
up all the locks on the database and you can execute your commands as
soon as oozie is restarted.

Regards,

On Wed, 2012-02-15 at 12:15 -0800, Brandon Vanhoy wrote:
> In that case I update Oozie to use mysql.
> 
> 
> Thank you Mohammad and Alejandro for the fast feedback,
> 
>   - Brandon
> 
> 
> ________________________________
>  From: Alejandro Abdelnur <tu...@cloudera.com>
> To: oozie-users@incubator.apache.org; Mohammad Islam <mi...@yahoo.com> 
> Sent: Wednesday, February 15, 2012 11:27 AM
> Subject: Re: Some lock issue is occuring.
>  
> Brandon,
> 
> As Mohammad mentions this has been addressed in Oozie 3.
> 
> In Oozie 2 (bundled with CDH) this problem occurs commonly with Derby
> DB, and much less frequent with mysql/oracle/postgres DBs. And Most of
> the time, Oozie recovers automatically from this failures.
> 
> Upcoming releases of CDH will include Oozie 3.
> 
> Thanks.
> 
> Alejandro
> 
> On Wed, Feb 15, 2012 at 10:38 AM, Mohammad Islam <mi...@yahoo.com> wrote:
> > Hi Brandon,
> > Sorry for that.
> > After Oozie 2.3, we did big design change to reduce this type of issues.
> >
> > We used to see this issue a lot in our system. In the later releases, we don't see the issue lately.
> >
> > Alejandro, can talk further on CDH release plan.
> >
> > Regards,
> > Mohammad
> >
> >
> >
> > ________________________________
> >  From: Brandon Vanhoy <br...@yahoo.com>
> > To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> > Sent: Wednesday, February 15, 2012 9:26 AM
> > Subject: Re: Some lock issue is occuring.
> >
> > Was there any resolution to this issue?
> >
> > It looks like the same thing happened to me this morning.
> > We have a job that runs once a day, and this morning workflow never left the 'PREP' state, and the coordinator show's the workflow as failed.
> > I ended up killing the failed coordinator and restarting it (so things look like they are back to normal).
> >
> > However, I'm wondering if this will happen again, and if so, how often?
> > We're using cdh3u2 (so, Oozie 2.3.2-u2).
> > Any help would be appreciated.
> >
> > Below is error that was reported in oozie.log:
> >
> > 2012-02-15 01:02:41,753 ERROR StartCommand:522 - USER[oozie] GROUP[users] TOKEN[] APP[comps_primary_workflow] JOB[0001668-111221102141548-oozie-oozi-W] ACTION[0000639-111221102141548-oozie-oozi-C@20] XException, org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> > Lock : ROW, WF_JOBS, (2,7)  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> > ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971338, U}
> > Lock : ROW, WF_JOBS, (952,43)  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> > ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971343, X}
> > . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, p
> > roto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
> >   <property>
> >     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> > org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:Lock : ROW, WF_JOBS, (2,7)
> >   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971338, U}
> > Lock : ROW, WF_JOBS, (952,43)
> >   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971343, X}
> > . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
> >   <property>
> >     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> >         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:823)
> >         at org.apache.oozie.store.WorkflowStore.updateWorkflow(WorkflowStore.java:197)
> >         at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:199)
> >         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
> >         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
> >         at org.apache.oozie.command.Command.call(Command.java:202)
> >         at org.apache.oozie.DagEngine.start(DagEngine.java:184)
> >         at org.apache.oozie.DagEngine.submitJob(DagEngine.java:98)
> >         at org.apache.oozie.command.coord.CoordActionStartCommand.call(CoordActionStartCommand.java:157)
> >         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:211)
> >         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:45)
> >         at org.apache.oozie.command.Command.call(Command.java:202)
> >         at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:132)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >         at java.lang.Thread.run(Thread.java:662)
> > Caused by: <openjpa-1.2.1-r752877:753278 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException: Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> > Lock : ROW, WF_JOBS, (2,7)
> >   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971338, U}
> > Lock : ROW, WF_JOBS, (952,43)
> >   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971343, X}
> > . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
> >   <property>
> >     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> > FailedObject: UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [java.lang.String]
> >         at org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4222)
> >         at org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4197)
> >         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:102)
> >         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:72)
> >         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:515)
> >         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:440)
> >         at org.apache.openjpa.kernel.ExpressionStoreQuery$DataStoreExecutor.executeUpdate(ExpressionStoreQuery.java:694)
> >         at org.apache.openjpa.kernel.QueryImpl.update(QueryImpl.java:1039)
> >         at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:852)
> >         at org.apache.openjpa.kernel.QueryImpl.updateAll(QueryImpl.java:888)
> >         at org.apache.openjpa.kernel.DelegatingQuery.updateAll(DelegatingQuery.java:581)
> >         at org.apache.openjpa.persistence.QueryImpl.executeUpdate(QueryImpl.java:338)
> >         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:202)
> >         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:197)
> >         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:808)
> >         ... 15 more
> > Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> > Lock : ROW, WF_JOBS, (2,7)
> >   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971338, U}
> > Lock : ROW, WF_JOBS, (952,43)
> >   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
> >   Granted XID : {1971343, X}
> > . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
> >   <property>
> >     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001]
> >         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.wrap(LoggingConnectionDecorator.java:192)
> >         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.access$700(LoggingConnectionDecorator.java:57)
> >         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.executeUpdate(LoggingConnectionDecorator.java:866)
> >         at org.apache.openjpa.lib.jdbc.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:269)
> >         at org.apache.openjpa.jdbc.kernel.JDBCStoreManager$CancelPreparedStatement.executeUpdate(JDBCStoreManager.java:1586)
> >         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:671)
> >         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:513)
> >         ... 25 more
> >
> >
> >
> > ________________________________
> > From: Mohammad Islam <mi...@yahoo.com>
> > To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> > Sent: Friday, January 20, 2012 12:13 AM
> > Subject: Re: Some lock issue is occuring.
> >
> > Did you recently upgrade oozie?
> > Can you please submit a new job and try to kill it
> >
> > In which log file did you get it? looks like it is not in oozie.log. May be catalina.out ot localhsot*
> > Regards,
> > Mohammad
> >
> >
> >
> > ----- Original Message -----
> > From: Harshal <ha...@komli.com>
> > To: oozie-users@incubator.apache.org
> > Cc:
> > Sent: Thursday, January 19, 2012 11:40 PM
> > Subject: Re: Some lock issue is occuring.
> >
> > Also,
> >
> > Error: E0607 : E0607: Other error in operation
> > [getActionsForCoordinatorJob], An optimistic lock violation was detected
> > when flushing object instance "A lock could not be obtained within the
> > time requested [java.lang.String]" to the data store.  This indicates
> > that the object was concurrently modified in another transaction.
> >
> >
> >
> > On Fri, 2012-01-20 at 12:53 +0530, Harshal wrote:
> >> What do you mean by this? I am not able to kill any of my jobs.
> >>
> >> Error: E0607 : E0607: Other error in operation
> >> [updateCoordinatorAction], Unable to obtain an object lock on "A lock
> >> could not be obtained due to a deadlock, cycle of locks and waiters is:
> >> Lock : ROW, COORD_ACTIONS, (336,8)   Waiting XID : {312103, U} , SA,
> >> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> >> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> >> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> >> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> >> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> >> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> >> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312125, X}
> >> Lock : ROW, COORD_ACTIONS, (67,8)   Waiting XID : {312125, S} , SA,
> >> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> >> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> >> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> >> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> >> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> >> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> >> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312103,
> >> X}  . The selected victim is XID : 312103. {prepstmnt 34907896 UPDATE
> >> COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url = NULL,
> >> created_conf = ?, error_code = NULL, error_message = NULL,
> >> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> >> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> >> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> >> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> >> t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(int) 68, (String)
> >> <coordinator-app xmlns="uri:oozie:coordinator:0.1" name="plp-coord" ...,
> >> (String) <configuration>    <property>
> >> <name>oozie.coord.application.pat..., (String) , (String)
> >> <configuration>    <property>      <name>oozie.coord.application.pat...,
> >> (int) -1, (Timestamp) 2012-01-18 00:04:27.148, (String)
> >> 0000003-120112014121356-oozie-oozi-C, (Timestamp) 2012-01-20 12:49:01.8,
> >> (Timestamp) 2012-01-17 15:00:00.0, (String) KILLED, (String)
> >> 0000003-120112014121356-oozie-oozi-C@68, (String)
> >> CoordinatorActionBean]} [code=30000, state=40001] [java.lang.String]".
> >>
> >>



Re: Some lock issue is occuring.

Posted by Brandon Vanhoy <br...@yahoo.com>.
In that case I update Oozie to use mysql.


Thank you Mohammad and Alejandro for the fast feedback,

  - Brandon


________________________________
 From: Alejandro Abdelnur <tu...@cloudera.com>
To: oozie-users@incubator.apache.org; Mohammad Islam <mi...@yahoo.com> 
Sent: Wednesday, February 15, 2012 11:27 AM
Subject: Re: Some lock issue is occuring.
 
Brandon,

As Mohammad mentions this has been addressed in Oozie 3.

In Oozie 2 (bundled with CDH) this problem occurs commonly with Derby
DB, and much less frequent with mysql/oracle/postgres DBs. And Most of
the time, Oozie recovers automatically from this failures.

Upcoming releases of CDH will include Oozie 3.

Thanks.

Alejandro

On Wed, Feb 15, 2012 at 10:38 AM, Mohammad Islam <mi...@yahoo.com> wrote:
> Hi Brandon,
> Sorry for that.
> After Oozie 2.3, we did big design change to reduce this type of issues.
>
> We used to see this issue a lot in our system. In the later releases, we don't see the issue lately.
>
> Alejandro, can talk further on CDH release plan.
>
> Regards,
> Mohammad
>
>
>
> ________________________________
>  From: Brandon Vanhoy <br...@yahoo.com>
> To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> Sent: Wednesday, February 15, 2012 9:26 AM
> Subject: Re: Some lock issue is occuring.
>
> Was there any resolution to this issue?
>
> It looks like the same thing happened to me this morning.
> We have a job that runs once a day, and this morning workflow never left the 'PREP' state, and the coordinator show's the workflow as failed.
> I ended up killing the failed coordinator and restarting it (so things look like they are back to normal).
>
> However, I'm wondering if this will happen again, and if so, how often?
> We're using cdh3u2 (so, Oozie 2.3.2-u2).
> Any help would be appreciated.
>
> Below is error that was reported in oozie.log:
>
> 2012-02-15 01:02:41,753 ERROR StartCommand:522 - USER[oozie] GROUP[users] TOKEN[] APP[comps_primary_workflow] JOB[0001668-111221102141548-oozie-oozi-W] ACTION[0000639-111221102141548-oozie-oozi-C@20] XException, org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, p
> roto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
>         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:823)
>         at org.apache.oozie.store.WorkflowStore.updateWorkflow(WorkflowStore.java:197)
>         at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:199)
>         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
>         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
>         at org.apache.oozie.command.Command.call(Command.java:202)
>         at org.apache.oozie.DagEngine.start(DagEngine.java:184)
>         at org.apache.oozie.DagEngine.submitJob(DagEngine.java:98)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.call(CoordActionStartCommand.java:157)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:211)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:45)
>         at org.apache.oozie.command.Command.call(Command.java:202)
>         at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:132)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: <openjpa-1.2.1-r752877:753278 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException: Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> FailedObject: UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [java.lang.String]
>         at org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4222)
>         at org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4197)
>         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:102)
>         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:72)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:515)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:440)
>         at org.apache.openjpa.kernel.ExpressionStoreQuery$DataStoreExecutor.executeUpdate(ExpressionStoreQuery.java:694)
>         at org.apache.openjpa.kernel.QueryImpl.update(QueryImpl.java:1039)
>         at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:852)
>         at org.apache.openjpa.kernel.QueryImpl.updateAll(QueryImpl.java:888)
>         at org.apache.openjpa.kernel.DelegatingQuery.updateAll(DelegatingQuery.java:581)
>         at org.apache.openjpa.persistence.QueryImpl.executeUpdate(QueryImpl.java:338)
>         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:202)
>         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:197)
>         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:808)
>         ... 15 more
> Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001]
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.wrap(LoggingConnectionDecorator.java:192)
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.access$700(LoggingConnectionDecorator.java:57)
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.executeUpdate(LoggingConnectionDecorator.java:866)
>         at org.apache.openjpa.lib.jdbc.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:269)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreManager$CancelPreparedStatement.executeUpdate(JDBCStoreManager.java:1586)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:671)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:513)
>         ... 25 more
>
>
>
> ________________________________
> From: Mohammad Islam <mi...@yahoo.com>
> To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> Sent: Friday, January 20, 2012 12:13 AM
> Subject: Re: Some lock issue is occuring.
>
> Did you recently upgrade oozie?
> Can you please submit a new job and try to kill it
>
> In which log file did you get it? looks like it is not in oozie.log. May be catalina.out ot localhsot*
> Regards,
> Mohammad
>
>
>
> ----- Original Message -----
> From: Harshal <ha...@komli.com>
> To: oozie-users@incubator.apache.org
> Cc:
> Sent: Thursday, January 19, 2012 11:40 PM
> Subject: Re: Some lock issue is occuring.
>
> Also,
>
> Error: E0607 : E0607: Other error in operation
> [getActionsForCoordinatorJob], An optimistic lock violation was detected
> when flushing object instance "A lock could not be obtained within the
> time requested [java.lang.String]" to the data store.  This indicates
> that the object was concurrently modified in another transaction.
>
>
>
> On Fri, 2012-01-20 at 12:53 +0530, Harshal wrote:
>> What do you mean by this? I am not able to kill any of my jobs.
>>
>> Error: E0607 : E0607: Other error in operation
>> [updateCoordinatorAction], Unable to obtain an object lock on "A lock
>> could not be obtained due to a deadlock, cycle of locks and waiters is:
>> Lock : ROW, COORD_ACTIONS, (336,8)   Waiting XID : {312103, U} , SA,
>> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
>> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312125, X}
>> Lock : ROW, COORD_ACTIONS, (67,8)   Waiting XID : {312125, S} , SA,
>> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
>> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312103,
>> X}  . The selected victim is XID : 312103. {prepstmnt 34907896 UPDATE
>> COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url = NULL,
>> created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(int) 68, (String)
>> <coordinator-app xmlns="uri:oozie:coordinator:0.1" name="plp-coord" ...,
>> (String) <configuration>    <property>
>> <name>oozie.coord.application.pat..., (String) , (String)
>> <configuration>    <property>      <name>oozie.coord.application.pat...,
>> (int) -1, (Timestamp) 2012-01-18 00:04:27.148, (String)
>> 0000003-120112014121356-oozie-oozi-C, (Timestamp) 2012-01-20 12:49:01.8,
>> (Timestamp) 2012-01-17 15:00:00.0, (String) KILLED, (String)
>> 0000003-120112014121356-oozie-oozi-C@68, (String)
>> CoordinatorActionBean]} [code=30000, state=40001] [java.lang.String]".
>>
>>

Re: Some lock issue is occuring.

Posted by Alejandro Abdelnur <tu...@cloudera.com>.
Brandon,

As Mohammad mentions this has been addressed in Oozie 3.

In Oozie 2 (bundled with CDH) this problem occurs commonly with Derby
DB, and much less frequent with mysql/oracle/postgres DBs. And Most of
the time, Oozie recovers automatically from this failures.

Upcoming releases of CDH will include Oozie 3.

Thanks.

Alejandro

On Wed, Feb 15, 2012 at 10:38 AM, Mohammad Islam <mi...@yahoo.com> wrote:
> Hi Brandon,
> Sorry for that.
> After Oozie 2.3, we did big design change to reduce this type of issues.
>
> We used to see this issue a lot in our system. In the later releases, we don't see the issue lately.
>
> Alejandro, can talk further on CDH release plan.
>
> Regards,
> Mohammad
>
>
>
> ________________________________
>  From: Brandon Vanhoy <br...@yahoo.com>
> To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> Sent: Wednesday, February 15, 2012 9:26 AM
> Subject: Re: Some lock issue is occuring.
>
> Was there any resolution to this issue?
>
> It looks like the same thing happened to me this morning.
> We have a job that runs once a day, and this morning workflow never left the 'PREP' state, and the coordinator show's the workflow as failed.
> I ended up killing the failed coordinator and restarting it (so things look like they are back to normal).
>
> However, I'm wondering if this will happen again, and if so, how often?
> We're using cdh3u2 (so, Oozie 2.3.2-u2).
> Any help would be appreciated.
>
> Below is error that was reported in oozie.log:
>
> 2012-02-15 01:02:41,753 ERROR StartCommand:522 - USER[oozie] GROUP[users] TOKEN[] APP[comps_primary_workflow] JOB[0001668-111221102141548-oozie-oozi-W] ACTION[0000639-111221102141548-oozie-oozi-C@20] XException, org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
> ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, p
> roto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
>         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:823)
>         at org.apache.oozie.store.WorkflowStore.updateWorkflow(WorkflowStore.java:197)
>         at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:199)
>         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
>         at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
>         at org.apache.oozie.command.Command.call(Command.java:202)
>         at org.apache.oozie.DagEngine.start(DagEngine.java:184)
>         at org.apache.oozie.DagEngine.submitJob(DagEngine.java:98)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.call(CoordActionStartCommand.java:157)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:211)
>         at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:45)
>         at org.apache.oozie.command.Command.call(Command.java:202)
>         at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:132)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:662)
> Caused by: <openjpa-1.2.1-r752877:753278 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException: Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
> FailedObject: UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [java.lang.String]
>         at org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4222)
>         at org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4197)
>         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:102)
>         at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:72)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:515)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:440)
>         at org.apache.openjpa.kernel.ExpressionStoreQuery$DataStoreExecutor.executeUpdate(ExpressionStoreQuery.java:694)
>         at org.apache.openjpa.kernel.QueryImpl.update(QueryImpl.java:1039)
>         at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:852)
>         at org.apache.openjpa.kernel.QueryImpl.updateAll(QueryImpl.java:888)
>         at org.apache.openjpa.kernel.DelegatingQuery.updateAll(DelegatingQuery.java:581)
>         at org.apache.openjpa.persistence.QueryImpl.executeUpdate(QueryImpl.java:338)
>         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:202)
>         at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:197)
>         at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:808)
>         ... 15 more
> Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, WF_JOBS, (2,7)
>   Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971338, U}
> Lock : ROW, WF_JOBS, (952,43)
>   Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
>   Granted XID : {1971343, X}
> . The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
>   <property>
>     <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001]
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.wrap(LoggingConnectionDecorator.java:192)
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.access$700(LoggingConnectionDecorator.java:57)
>         at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.executeUpdate(LoggingConnectionDecorator.java:866)
>         at org.apache.openjpa.lib.jdbc.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:269)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreManager$CancelPreparedStatement.executeUpdate(JDBCStoreManager.java:1586)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:671)
>         at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:513)
>         ... 25 more
>
>
>
> ________________________________
> From: Mohammad Islam <mi...@yahoo.com>
> To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org>
> Sent: Friday, January 20, 2012 12:13 AM
> Subject: Re: Some lock issue is occuring.
>
> Did you recently upgrade oozie?
> Can you please submit a new job and try to kill it
>
> In which log file did you get it? looks like it is not in oozie.log. May be catalina.out ot localhsot*
> Regards,
> Mohammad
>
>
>
> ----- Original Message -----
> From: Harshal <ha...@komli.com>
> To: oozie-users@incubator.apache.org
> Cc:
> Sent: Thursday, January 19, 2012 11:40 PM
> Subject: Re: Some lock issue is occuring.
>
> Also,
>
> Error: E0607 : E0607: Other error in operation
> [getActionsForCoordinatorJob], An optimistic lock violation was detected
> when flushing object instance "A lock could not be obtained within the
> time requested [java.lang.String]" to the data store.  This indicates
> that the object was concurrently modified in another transaction.
>
>
>
> On Fri, 2012-01-20 at 12:53 +0530, Harshal wrote:
>> What do you mean by this? I am not able to kill any of my jobs.
>>
>> Error: E0607 : E0607: Other error in operation
>> [updateCoordinatorAction], Unable to obtain an object lock on "A lock
>> could not be obtained due to a deadlock, cycle of locks and waiters is:
>> Lock : ROW, COORD_ACTIONS, (336,8)   Waiting XID : {312103, U} , SA,
>> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
>> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312125, X}
>> Lock : ROW, COORD_ACTIONS, (67,8)   Waiting XID : {312125, S} , SA,
>> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
>> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312103,
>> X}  . The selected victim is XID : 312103. {prepstmnt 34907896 UPDATE
>> COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url = NULL,
>> created_conf = ?, error_code = NULL, error_message = NULL,
>> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
>> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
>> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
>> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
>> t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(int) 68, (String)
>> <coordinator-app xmlns="uri:oozie:coordinator:0.1" name="plp-coord" ...,
>> (String) <configuration>    <property>
>> <name>oozie.coord.application.pat..., (String) , (String)
>> <configuration>    <property>      <name>oozie.coord.application.pat...,
>> (int) -1, (Timestamp) 2012-01-18 00:04:27.148, (String)
>> 0000003-120112014121356-oozie-oozi-C, (Timestamp) 2012-01-20 12:49:01.8,
>> (Timestamp) 2012-01-17 15:00:00.0, (String) KILLED, (String)
>> 0000003-120112014121356-oozie-oozi-C@68, (String)
>> CoordinatorActionBean]} [code=30000, state=40001] [java.lang.String]".
>>
>>

Re: Some lock issue is occuring.

Posted by Mohammad Islam <mi...@yahoo.com>.
Hi Brandon,
Sorry for that.
After Oozie 2.3, we did big design change to reduce this type of issues.

We used to see this issue a lot in our system. In the later releases, we don't see the issue lately.

Alejandro, can talk further on CDH release plan.

Regards,
Mohammad



________________________________
 From: Brandon Vanhoy <br...@yahoo.com>
To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org> 
Sent: Wednesday, February 15, 2012 9:26 AM
Subject: Re: Some lock issue is occuring.
 
Was there any resolution to this issue?  

It looks like the same thing happened to me this morning.  
We have a job that runs once a day, and this morning workflow never left the 'PREP' state, and the coordinator show's the workflow as failed.
I ended up killing the failed coordinator and restarting it (so things look like they are back to normal).  

However, I'm wondering if this will happen again, and if so, how often?
We're using cdh3u2 (so, Oozie 2.3.2-u2).
Any help would be appreciated.

Below is error that was reported in oozie.log:

2012-02-15 01:02:41,753 ERROR StartCommand:522 - USER[oozie] GROUP[users] TOKEN[] APP[comps_primary_workflow] JOB[0001668-111221102141548-oozie-oozi-W] ACTION[0000639-111221102141548-oozie-oozi-C@20] XException, org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_x
ml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, p
roto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
org.apache.oozie.store.StoreException: E0607: Other error in operation [updateWorkflow], Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
        at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:823)
        at org.apache.oozie.store.WorkflowStore.updateWorkflow(WorkflowStore.java:197)
        at org.apache.oozie.command.wf.SignalCommand.call(SignalCommand.java:199)
        at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:305)
        at org.apache.oozie.command.wf.SignalCommand.execute(SignalCommand.java:59)
        at org.apache.oozie.command.Command.call(Command.java:202)
        at org.apache.oozie.DagEngine.start(DagEngine.java:184)
        at org.apache.oozie.DagEngine.submitJob(DagEngine.java:98)
        at org.apache.oozie.command.coord.CoordActionStartCommand.call(CoordActionStartCommand.java:157)
        at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:211)
        at org.apache.oozie.command.coord.CoordActionStartCommand.execute(CoordActionStartCommand.java:45)
        at org.apache.oozie.command.Command.call(Command.java:202)
        at org.apache.oozie.service.CallableQueueService$CallableWrapper.run(CallableQueueService.java:132)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: <openjpa-1.2.1-r752877:753278 nonfatal store error> org.apache.openjpa.persistence.OptimisticLockException: Unable to obtain an object lock on "A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001] [java.lang.String]".
FailedObject: UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [java.lang.String]
        at org.apache.openjpa.jdbc.sql.DBDictionary.narrow(DBDictionary.java:4222)
        at org.apache.openjpa.jdbc.sql.DBDictionary.newStoreException(DBDictionary.java:4197)
        at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:102)
        at org.apache.openjpa.jdbc.sql.SQLExceptions.getStore(SQLExceptions.java:72)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:515)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:440)
        at org.apache.openjpa.kernel.ExpressionStoreQuery$DataStoreExecutor.executeUpdate(ExpressionStoreQuery.java:694)
        at org.apache.openjpa.kernel.QueryImpl.update(QueryImpl.java:1039)
        at org.apache.openjpa.kernel.QueryImpl.execute(QueryImpl.java:852)
        at org.apache.openjpa.kernel.QueryImpl.updateAll(QueryImpl.java:888)
        at org.apache.openjpa.kernel.DelegatingQuery.updateAll(DelegatingQuery.java:581)
        at org.apache.openjpa.persistence.QueryImpl.executeUpdate(QueryImpl.java:338)
        at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:202)
        at org.apache.oozie.store.WorkflowStore$5.call(WorkflowStore.java:197)
        at org.apache.oozie.store.WorkflowStore.doOperation(WorkflowStore.java:808)
        ... 15 more
Caused by: org.apache.openjpa.lib.jdbc.ReportingSQLException: A lock could not be obtained due to a deadlock, cycle of locks and waiters is:
Lock : ROW, WF_JOBS, (2,7)
  Waiting XID : {1971343, U} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971338, U} 
Lock : ROW, WF_JOBS, (952,43)
  Waiting XID : {1971338, S} , SA, UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?)
  Granted XID : {1971343, X} 
. The selected victim is XID : 1971343. {prepstmnt 1601033644 UPDATE WF_JOBS SET app_name = ?, app_path = ?, conf = ?, group_name = ?, run = ?, user_name = ?, auth_token = ?, created_time = ?, end_time = NULL, external_id = NULL, last_modified_time = ?, log_token = ?, proto_action_conf = ?, sla_xml = ?, start_time = ?, status = ?, wf_instance = ? WHERE id IN (SELECT DISTINCT t0.id FROM WF_JOBS t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(String) comps_primary_workflow, (String) hdfs://analyticsnn1.sv2.trulia.com:8020/oozie/workflow/coordinators/..., (String) <configuration>
  <property>
    <name>downloadedForSaleDir</name>..., (String) users, (int) 0, (String) oozie, (String) ?, (Timestamp) 2012-02-15 01:02:21.683, (Timestamp) 2012-02-15 01:02:21.705, (String) , (String) <?xml version="1.0" encoding="UTF-8" standalone="no"?><configuration..., (String) , (Timestamp) 2012-02-15 01:02:21.7, (String) RUNNING, (byte[]) [B@7b46f3aa, (String) 0001668-111221102141548-oozie-oozi-W, (String) WorkflowJobBean]} [code=30000, state=40001]
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.wrap(LoggingConnectionDecorator.java:192)
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator.access$700(LoggingConnectionDecorator.java:57)
        at org.apache.openjpa.lib.jdbc.LoggingConnectionDecorator$LoggingConnection$LoggingPreparedStatement.executeUpdate(LoggingConnectionDecorator.java:866)
        at org.apache.openjpa.lib.jdbc.DelegatingPreparedStatement.executeUpdate(DelegatingPreparedStatement.java:269)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreManager$CancelPreparedStatement.executeUpdate(JDBCStoreManager.java:1586)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeUpdate(JDBCStoreQuery.java:671)
        at org.apache.openjpa.jdbc.kernel.JDBCStoreQuery.executeBulkOperation(JDBCStoreQuery.java:513)
        ... 25 more



________________________________
From: Mohammad Islam <mi...@yahoo.com>
To: "oozie-users@incubator.apache.org" <oo...@incubator.apache.org> 
Sent: Friday, January 20, 2012 12:13 AM
Subject: Re: Some lock issue is occuring.

Did you recently upgrade oozie?
Can you please submit a new job and try to kill it

In which log file did you get it? looks like it is not in oozie.log. May be catalina.out ot localhsot*
Regards,
Mohammad



----- Original Message -----
From: Harshal <ha...@komli.com>
To: oozie-users@incubator.apache.org
Cc: 
Sent: Thursday, January 19, 2012 11:40 PM
Subject: Re: Some lock issue is occuring.

Also,

Error: E0607 : E0607: Other error in operation
[getActionsForCoordinatorJob], An optimistic lock violation was detected
when flushing object instance "A lock could not be obtained within the
time requested [java.lang.String]" to the data store.  This indicates
that the object was concurrently modified in another transaction.



On Fri, 2012-01-20 at 12:53 +0530, Harshal wrote:
> What do you mean by this? I am not able to kill any of my jobs.
> 
> Error: E0607 : E0607: Other error in operation
> [updateCoordinatorAction], Unable to obtain an object lock on "A lock
> could not be obtained due to a deadlock, cycle of locks and waiters is:
> Lock : ROW, COORD_ACTIONS, (336,8)   Waiting XID : {312103, U} , SA,
> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312125, X}
> Lock : ROW, COORD_ACTIONS, (67,8)   Waiting XID : {312125, S} , SA,
> UPDATE COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url
> = NULL, created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?)   Granted XID : {312103,
> X}  . The selected victim is XID : 312103. {prepstmnt 34907896 UPDATE
> COORD_ACTIONS SET action_number = ?, action_xml = ?, console_url = NULL,
> created_conf = ?, error_code = NULL, error_message = NULL,
> external_status = NULL, missing_dependencies = ?, run_conf = ?, time_out
> = ?, tracker_uri = NULL, job_type = NULL, created_time = ?, external_id
> = NULL, job_id = ?, last_modified_time = ?, nominal_time = ?, sla_xml =
> NULL, status = ? WHERE id IN (SELECT DISTINCT t0.id FROM COORD_ACTIONS
> t0 WHERE (t0.id = ?) AND t0.bean_type = ?) [params=(int) 68, (String)
> <coordinator-app xmlns="uri:oozie:coordinator:0.1" name="plp-coord" ...,
> (String) <configuration>    <property>
> <name>oozie.coord.application.pat..., (String) , (String)
> <configuration>    <property>      <name>oozie.coord.application.pat...,
> (int) -1, (Timestamp) 2012-01-18 00:04:27.148, (String)
> 0000003-120112014121356-oozie-oozi-C, (Timestamp) 2012-01-20 12:49:01.8,
> (Timestamp) 2012-01-17 15:00:00.0, (String) KILLED, (String)
> 0000003-120112014121356-oozie-oozi-C@68, (String)
> CoordinatorActionBean]} [code=30000, state=40001] [java.lang.String]".
> 
>