Posted to user@uima.apache.org by "reshu.agarwal" <re...@orkash.com> on 2014/06/03 14:59:26 UTC
CanceledByDriver status in DUCC
Hi,
I sometimes get the "CanceledByDriver" status for a job in DUCC when the
error count reaches 14. All of the errors are due to a Cas Timed Out
exception, and after 5 consecutive errors the host reported for the
timed-out Cas is null. The whole job is cancelled and its documents are
skipped as a result.
Can I configure DUCC to reprocess this particular job as a new job if it
is cancelled by the driver?
--
Thanks,
Reshu Agarwal
Re: CanceledByDriver status in DUCC
Posted by Lou DeGenaro <lo...@gmail.com>.
Great! Glad you figured it out.
I think the post-install script is supposed to customize the
ducc.properties file properly. If it does not, perhaps a Jira is in order.
Lou.
Re: CanceledByDriver status in DUCC
Posted by "reshu.agarwal" <re...@orkash.com>.
Hi Lou,
This problem (the MqReaper error in or.log) has been resolved. I set
ducc.broker.name=localhost; it was previously set to S1.
--
Thanks,
Reshu Agarwal
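For readers hitting the same InstanceNotFoundException: the trace in this
thread shows the Orchestrator looking up a JMX object named
org.apache.activemq:BrokerName=S1,Type=Broker, so ducc.broker.name apparently
has to match the name under which the ActiveMQ broker registers itself. Below
is a minimal Python sketch of that consistency check. It assumes the
ObjectName pattern seen in the trace; the helper names are hypothetical and
not part of DUCC.

```python
import re


def read_property(properties_text, key):
    """Return the value of `key` from Java-style .properties text, or None.

    If the key appears more than once, the last occurrence wins.
    """
    value = None
    for raw in properties_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        name, sep, rest = line.partition("=")
        if sep and name.strip() == key:
            value = rest.strip()
    return value


def broker_name_from_object_name(object_name):
    """Extract the BrokerName part of a JMX name like the one in the trace."""
    match = re.search(r"BrokerName=([^,]+)", object_name)
    return match.group(1) if match else None


def broker_name_matches(properties_text, registered_object_name):
    """True if ducc.broker.name equals the broker name in the JMX name."""
    configured = read_property(properties_text, "ducc.broker.name")
    registered = broker_name_from_object_name(registered_object_name)
    return configured is not None and configured == registered
```

The registered name would have to be read from the broker itself, for
example in jconsole. With ducc.broker.name=S1 and a broker registered as
localhost (as the fix above suggests), broker_name_matches returns False.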
Re: CanceledByDriver status in DUCC
Posted by "reshu.agarwal" <re...@orkash.com>.
On 06/06/2014 06:03 PM, Lou DeGenaro wrote:
> files for ERROR or WARN messages especially
> relative to the MqReaper.
Hi Lou,
I found an error in the DUCC log or.log:
java.lang.reflect.UndeclaredThrowableException
    at com.sun.proxy.$Proxy19.getQueues(Unknown Source)
    at org.apache.uima.ducc.common.mq.MqHelper.getQueueList(MqHelper.java:152)
    at org.apache.uima.ducc.orchestrator.maintenance.MqReaper.getJdQueues(MqReaper.java:121)
    at org.apache.uima.ducc.orchestrator.maintenance.MqReaper.removeUnusedJdQueues(MqReaper.java:159)
    at org.apache.uima.ducc.orchestrator.maintenance.MaintenanceThread.run(MaintenanceThread.java:106)
Caused by: javax.management.InstanceNotFoundException: org.apache.activemq:BrokerName=S1,Type=Broker
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getMBean(DefaultMBeanServerInterceptor.java:1095)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttribute(DefaultMBeanServerInterceptor.java:643)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttribute(JmxMBeanServer.java:669)
    at javax.management.remote.rmi.RMIConnectionImpl.doOperation(RMIConnectionImpl.java:1463)
    at javax.management.remote.rmi.RMIConnectionImpl.access$300(RMIConnectionImpl.java:96)
    at javax.management.remote.rmi.RMIConnectionImpl$PrivilegedOperation.run(RMIConnectionImpl.java:1327)
    at javax.management.remote.rmi.RMIConnectionImpl.doPrivilegedOperation(RMIConnectionImpl.java:1419)
    at javax.management.remote.rmi.RMIConnectionImpl.getAttribute(RMIConnectionImpl.java:656)
    at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at sun.rmi.server.UnicastServerRef.dispatch(UnicastServerRef.java:322)
    at sun.rmi.transport.Transport$1.run(Transport.java:177)
    at sun.rmi.transport.Transport$1.run(Transport.java:174)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.rmi.transport.Transport.serviceCall(Transport.java:173)
    at sun.rmi.transport.tcp.TCPTransport.handleMessages(TCPTransport.java:553)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run0(TCPTransport.java:808)
    at sun.rmi.transport.tcp.TCPTransport$ConnectionHandler.run(TCPTransport.java:667)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
    at sun.rmi.transport.StreamRemoteCall.exceptionReceivedFromServer(StreamRemoteCall.java:273)
    at sun.rmi.transport.StreamRemoteCall.executeCall(StreamRemoteCall.java:251)
    at sun.rmi.server.UnicastRef.invoke(UnicastRef.java:160)
    at com.sun.jmx.remote.internal.PRef.invoke(Unknown Source)
    at javax.management.remote.rmi.RMIConnectionImpl_Stub.getAttribute(Unknown Source)
    at javax.management.remote.rmi.RMIConnector$RemoteMBeanServerConnection.getAttribute(RMIConnector.java:901)
    at javax.management.MBeanServerInvocationHandler.invoke(MBeanServerInvocationHandler.java:280)
    ... 5 more
--
Thanks,
Reshu Agarwal
Re: CanceledByDriver status in DUCC
Posted by Lou DeGenaro <lo...@gmail.com>.
There should be no job queue "leak" on the broker. The Orchestrator should
be cleaning these up. It's a bug otherwise.
Search <ducc_home>/logs/or.log* files for ERROR or WARN messages especially
relative to the MqReaper.
Lou.
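The log search suggested above could be sketched in Python as follows. This
is an illustration only: it assumes each log record carries its level as a
space-delimited token (as in the "ERROR user.err" line quoted elsewhere in
this thread), and the function names are hypothetical. Stack-trace lines that
follow a record are not matched, so the output is a pointer into the log
rather than a full report.

```python
import glob
import os


def find_problem_lines(lines, component="MqReaper"):
    """Return (all ERROR/WARN lines, the subset that mentions `component`)."""
    flagged = [ln.rstrip("\n") for ln in lines
               if " ERROR " in ln or " WARN " in ln]
    related = [ln for ln in flagged if component in ln]
    return flagged, related


def scan_or_logs(ducc_home):
    """Scan <ducc_home>/logs/or.log* and print a short summary per file."""
    pattern = os.path.join(ducc_home, "logs", "or.log*")
    for path in sorted(glob.glob(pattern)):
        with open(path, errors="replace") as handle:
            flagged, related = find_problem_lines(handle)
        print(f"{path}: {len(flagged)} ERROR/WARN lines, "
              f"{len(related)} mention MqReaper")
        for line in related:
            print("  " + line)
```

Calling scan_or_logs with the DUCC installation directory prints one summary
line per or.log file, followed by the MqReaper-related records, if any.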
Re: CanceledByDriver status in DUCC
Posted by "reshu.agarwal" <re...@orkash.com>.
On 06/05/2014 06:30 PM, Lou DeGenaro wrote:
> How long does it take to process your longest work item through your
> pipeline?
Hi Lou,
It was 3 minutes at that time. We have now increased it to 10 minutes for
testing, as you suggested. I have also restarted DUCC. So far I have not
seen the problem again. If it recurs, I will update this thread. If you
see any other possible cause, please let me know.
I have one more question. According to the DUCC documentation, the
Orchestrator automatically deletes a job's queue from the broker when the
job is over. But when I monitor the broker with jconsole, all of the job
queues are still there in the broker.
Can this cause a problem in DUCC over time?
Is this cleanup something the Orchestrator is supposed to ensure?
Thanks in advance.
--
Reshu Agarwal
Re: CanceledByDriver status in DUCC
Posted by Lou DeGenaro <lo...@gmail.com>.
How long does it take to process your longest work item through your
pipeline? And what do you specify in your job submission for:
--process_per_item_time_max <integer> Maximum elapsed
time (in minutes) for processing one CAS.
Lou.
Re: CanceledByDriver status in DUCC
Posted by "reshu.agarwal" <re...@orkash.com>.
Hi Lou,
We have debugged our pipeline. If the problem were in the pipeline or the
code, the same errors would show up in the error log when we run the batch
again. But the same batch processes successfully without any error, so
this is not a code-level error.
The error log is as follows:
org.apache.uima.resource.ResourceProcessException: Request To Process Cas Has Timed-out. Service Queue:ducc.jd.queue.57. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Cas Timed-out on host: 192.168.xx.xxx
    at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.sendAndReceiveCAS(BaseUIMAAsynchronousEngineCommon_impl.java:2207)
    at org.apache.uima.adapter.jms.client.BaseUIMAAsynchronousEngineCommon_impl.sendAndReceiveCAS(BaseUIMAAsynchronousEngineCommon_impl.java:2042)
    at org.apache.uima.ducc.jd.client.WorkItem.run(WorkItem.java:142)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: org.apache.uima.aae.error.UimaASProcessCasTimeout: UIMA AS Client Timed Out Waiting for Reply From Service:ducc.jd.queue.57 Broker:tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false
    ... 9 more
After 5 similar errors, the node, the PID, and the "Cas Timed-out on host"
value are all reported as null:
node:null PID:null directive:ProcessContinue_CasNoRetry
04 Jun 2014 15:36:25,592 83 ERROR user.err workItemError 57 N/A
org.apache.uima.resource.ResourceProcessException: Request To Process Cas Has Timed-out. Service Queue:ducc.jd.queue.57. Broker: tcp://S1:61616?wireFormat.maxInactivityDuration=0&jms.useCompression=true&closeAsync=false Cas Timed-out on host: null
Please suggest a solution for this problem. Thanks in advance.
--
Reshu Agarwal
Re: CanceledByDriver status in DUCC
Posted by Lou DeGenaro <lo...@gmail.com>.
DUCC does not have an automatic resubmit capability.
Usually a high error count is indicative of a flawed job, and the default
DUCC plug-in error handler cancels the job after 15 errors have occurred so
as to not waste resources. Normally, one debugs the pipeline before
scaling-out and the number of errors should be zero. This error limit can
be raised. Information on the plug-in error handler is available in the
DUCC documentation.
You can also change the timeout value at job submit time by using:
--process_per_item_time_max <integer> Maximum elapsed
time (in minutes) for processing one CAS.
Lou.
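Since DUCC has no automatic resubmit, any retry has to live outside DUCC. A
minimal Python sketch of such a wrapper follows. It is an assumption-laden
illustration, not a DUCC feature: submit_and_wait is a hypothetical
caller-supplied function that submits the job, blocks until it ends, and
returns the final state as a string. How that function talks to DUCC is left
open here.

```python
def run_with_resubmit(submit_and_wait, max_attempts=3,
                      retry_states=("CanceledByDriver",)):
    """Resubmit a job while it ends in one of `retry_states`.

    submit_and_wait: hypothetical callable; submits the job, waits for it
    to finish, and returns its final state as a string.
    Returns the list of final states seen, one entry per attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    states = []
    for _ in range(max_attempts):
        state = submit_and_wait()
        states.append(state)
        if state not in retry_states:
            break  # the job ended some other way, so stop resubmitting
    return states
```

The attempt cap matters: a job that keeps timing out would otherwise be
resubmitted forever. Whether a resubmitted job redoes work that already
succeeded depends on the job's own collection reader.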