Posted to user@uima.apache.org by "reshu.agarwal" <re...@orkash.com> on 2014/06/03 14:59:26 UTC

CanceledByDriver status in DUCC

Hi,

I sometimes see the "CanceledByDriver" status on a job in DUCC when the 
error count reaches 14. All of the errors are due to the Cas Timed Out 
exception, and after 5 consecutive errors the host reported for the 
timed-out CAS is null. The whole job is cancelled and documents get 
skipped as a result.

Can I set some configuration in DUCC to reprocess this particular job 
again as a new job if the job is cancelled by the driver?


-- 
Thanks,
Reshu Agarwal


Re: CanceledByDriver status in DUCC

Posted by Lou DeGenaro <lo...@gmail.com>.
Great!  Glad you figured it out.

I think the post-install script is supposed to customize the
ducc.properties file properly.  If it does not, perhaps a Jira is in order.

Lou.



Re: CanceledByDriver status in DUCC

Posted by "reshu.agarwal" <re...@orkash.com>.
Hi Lou,

This problem has been resolved. I set ducc.broker.name=localhost; it was 
previously set to S1.


-- 
Thanks,
Reshu Agarwal
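
The exception in or.log names the JMX object that was looked up (BrokerName=S1), and the fix was to change ducc.broker.name. The following is a minimal sketch, not part of DUCC, that pulls both values out of plain text so they can be compared with the broker name shown in jconsole. The function names are hypothetical, and the properties parsing is simplified (it only handles key=value lines).

```python
import re


def configured_broker_name(properties_text):
    """Return the value of ducc.broker.name from properties text, or None."""
    for raw in properties_text.splitlines():
        line = raw.strip()
        # Skip blank lines and the two comment styles used in properties files.
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() == "ducc.broker.name":
            return value.strip()
    return None


def broker_name_in_error(log_text):
    """Return the BrokerName found in an InstanceNotFoundException line, or None."""
    match = re.search(r"BrokerName=([^,\s]+)", log_text)
    return match.group(1) if match else None
```

If both functions return the same name and the lookup still fails, the broker is presumably registered in JMX under a different name, which is what happened here (S1 configured, localhost actually in use).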


Re: CanceledByDriver status in DUCC

Posted by "reshu.agarwal" <re...@orkash.com>.
On 06/06/2014 06:03 PM, Lou DeGenaro wrote:
> files for ERROR or WARN messages especially
> relative to the MqReaper.
Hi Lou,

I found an error in the DUCC log file or.log:

java.lang.reflect.UndeclaredThrowableException
         at com.sun.proxy.$Proxy19.getQueues(Unknown Source)
         at org.apache.uima.ducc.common.mq.MqHelper.getQueueList(MqHelper.java:152)
         at org.apache.uima.ducc.orchestrator.maintenance.MqReaper.getJdQueues(MqReaper.java:121)
         at org.apache.uima.ducc.orchestrator.maintenance.MqReaper.removeUnusedJdQueues(MqReaper.java:159)
         at org.apache.uima.ducc.orchestrator.maintenance.MaintenanceThread.run(MaintenanceThread.java:106)
Caused by: javax.management.InstanceNotFoundException: org.apache.activemq:BrokerName=S1,Type=Broker
         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
         at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:643)
         at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:669)
         at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1463)
         at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)
         at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)
         at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)
         at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:656)
         at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
         at java.lang.reflect.Method.invoke(Method.java:601)
         at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
         at sun.rmi.transport.Transport$1.run(Transport.java:177)
         at sun.rmi.transport.Transport$1.run(Transport.java:174)
         at java.security.AccessController.doPrivileged(Native Method)
         at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
         at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
         at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
         at java.lang.Thread.run(Thread.java:722)
         at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:273)
         at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:251)
         at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:160)
         at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
         at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
         at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:901)
         at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:280)
         ... 5 more

-- 
Thanks,
Reshu Agarwal


Re: CanceledByDriver status in DUCC

Posted by Lou DeGenaro <lo...@gmail.com>.
There should be no job queue "leak" on the broker.  The Orchestrator should
be cleaning these up.  It's a bug otherwise.

Search <ducc_home>/logs/or.log* files for ERROR or WARN messages especially
relative to the MqReaper.

Lou.
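
The log search suggested above could be scripted roughly as follows. This is a minimal sketch, assuming Python is available on the machine that holds <ducc_home>/logs; the function name scan_or_logs is hypothetical and not part of DUCC. It only matches lines that carry the ERROR or WARN level themselves, so the stack-trace lines that follow such a line still have to be read in the log file.

```python
import glob
import re
from pathlib import Path

# Match the log level as a whole word so that e.g. "WARNING" is not counted.
LEVEL_RE = re.compile(r"\b(ERROR|WARN)\b")


def scan_or_logs(log_dir):
    """Collect ERROR/WARN lines from or.log* files, MqReaper lines first.

    Returns a list of (file name, line number, line text) tuples.
    """
    hits = []
    for path in sorted(glob.glob(str(Path(log_dir) / "or.log*"))):
        with open(path, errors="replace") as handle:
            for lineno, line in enumerate(handle, start=1):
                if LEVEL_RE.search(line):
                    hits.append((Path(path).name, lineno, line.rstrip("\n")))
    # Stable sort: lines mentioning MqReaper move to the front, order otherwise kept.
    hits.sort(key=lambda hit: "MqReaper" not in hit[2])
    return hits
```

Calling scan_or_logs with the path of the logs directory returns the matching lines, which can then be printed or inspected one by one.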




Re: CanceledByDriver status in DUCC

Posted by "reshu.agarwal" <re...@orkash.com>.
On 06/05/2014 06:30 PM, Lou DeGenaro wrote:
> How long does it take to process your longest work item through your
Hi Lou,

It was 3 minutes at that time. We have now increased this to 10 minutes 
for testing, as you suggested, and I have also restarted DUCC. So far I 
have not seen the problem again. If it recurs, I will update this same 
mail thread. If you see any other possible cause, please let me know.

I have one more question. According to the DUCC documentation, the 
Orchestrator automatically deletes a job's queue from the broker when 
the job is over. But when I monitor the broker service using jconsole, 
I see that all of the job queues are still there in the broker.

Can this cause a problem in DUCC over time?

Is this cleanup something the Orchestrator is supposed to ensure?

Thanks in advance.

-- 
Reshu Agarwal


Re: CanceledByDriver status in DUCC

Posted by Lou DeGenaro <lo...@gmail.com>.
How long does it take to process your longest work item through your
pipeline?  And what do you specify in your job submission for:

--process_per_item_time_max <integer>    Maximum elapsed time (in minutes) for processing one CAS.

Lou.
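
The point of the question is that the per-item timeout has to be longer than the slowest work item. As a rough illustration only, the sketch below derives a value for --process_per_item_time_max from observed per-item durations. The function name and the safety factor of 2 are assumptions made for this example, not DUCC recommendations.

```python
import math


def suggest_time_max_minutes(item_seconds, safety_factor=2.0):
    """Suggest a per-item timeout in whole minutes.

    item_seconds: observed processing times of individual work items, in seconds.
    safety_factor: illustrative margin applied to the slowest item (an assumption).
    """
    if not item_seconds:
        raise ValueError("need at least one observed duration")
    longest = max(item_seconds)
    # Round up to whole minutes, since the option takes an integer; never below 1.
    return max(1, math.ceil(longest * safety_factor / 60))


def time_max_args(minutes):
    """Build the command-line fragment for the option named in this thread."""
    return ["--process_per_item_time_max", str(minutes)]
```

For example, if the slowest item seen so far took 170 seconds, the sketch suggests 6 minutes; a 3 minute limit would be too tight for that item.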



Re: CanceledByDriver status in DUCC

Posted by "reshu.agarwal" <re...@orkash.com>.
Hi Lou,

We have debugged our pipeline. If the problem were in the pipeline or 
the code, the same errors would appear in the error log when we run the 
batch again. But the same batch processes successfully without any 
errors, so this is not a code-level error.


The error log is as follows:

org.apache.uima.resource.ResourceProcessException: Request To Process Cas Has Timed-out. Service Queue:ducc.jd.queue.57. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Cas Timed-out on host: 192.168.xx.xxx
at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.sendAndReceiveCAS(BaseUIMAAsynchronousEngineCommon_impl.java:2207)
at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.sendAndReceiveCAS(BaseUIMAAsynchronousEngineCommon_impl.java:2042)
at org.apache.uima.ducc.jd.client.WorkItem.run(WorkItem.java:142)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.uima.aae.error.UimaASProcessCasTimeout: UIMA AS Client Timed Out Waiting for Reply From Service:ducc.jd.queue.57 Broker:tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
... 9 more

After 5 similar errors, the log shows node:null, PID:null, and 
"Cas Timed-out on host: null":

node:null PID:null directive:ProcessContinue_CasNoRetry
04 Jun 2014 15:36:25,592 83 ERROR user.err workItemError 57 N/A
org.apache.uima.resource.ResourceProcessException: Request To Process Cas Has Timed-out. Service Queue:ducc.jd.queue.57. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Cas Timed-out on host: null


Please suggest a solution for this problem. Thanks in advance.

-- 

Reshu Agarwal


Re: CanceledByDriver status in DUCC

Posted by Lou DeGenaro <lo...@gmail.com>.
DUCC does not have an automatic resubmit capability.

Usually a high error count is indicative of a flawed job, and the default
DUCC plug-in error handler cancels the job after 15 errors have occurred so
as to not waste resources.  Normally, one debugs the pipeline before
scaling-out and the number of errors should be zero.  This error limit can
be raised.  Information on the plug-in error handler is available in the
DUCC documentation.

You can also change the timeout value at job submit time by using:

--process_per_item_time_max <integer>    Maximum elapsed time (in minutes) for processing one CAS.

Lou.
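
Since DUCC itself does not resubmit, any retry has to live in a script outside DUCC. The following is a minimal sketch of such a wrapper, under stated assumptions: submit_and_wait is a hypothetical callable that the user must write (it submits the job, waits for it to end, and returns the final state as a string), and only the "CanceledByDriver" state name comes from this thread.

```python
def run_with_resubmit(submit_and_wait, max_attempts=3):
    """Resubmit a job while it ends in CanceledByDriver, up to max_attempts.

    submit_and_wait: hypothetical user-supplied callable returning the final
    job state as a string. Returns the list of states seen, in order.
    """
    states = []
    for _ in range(max_attempts):
        state = submit_and_wait()
        states.append(state)
        # Stop as soon as the job ends in any state other than a driver cancel.
        if state != "CanceledByDriver":
            break
    return states
```

Capping the number of attempts matters here: a job that keeps hitting the error limit is more likely flawed than unlucky, and resubmitting it without a limit would waste the resources the error handler is trying to protect.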

