Posted to user@uima.apache.org by "reshu.agarwal" <re...@orkash.com> on 2014/03/21 07:12:17 UTC

Ducc Problems

Hi all,

DUCC does not call the collectionProcessComplete method of the CAS Consumer. 
Hence we cannot do batch processing in the CAS consumer, which 
increases our processing time. Is there any other option for this, or is 
it a bug in DUCC?

Thanks in advance

-- 
Reshu Agarwal


Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
A short discussion about Cas Consumers is at
http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1300008.3

Individual Work Items may be treated as collections. The Work Item CAS can
be used to trigger collection process complete for these collections. See
http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1340008.5.2
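
A minimal sketch of that pattern in a CAS consumer follows. It assumes the
work item is marked with a type named org.apache.uima.ducc.Workitem; that
name, and the batching/flush details, are placeholders rather than code taken
from the DUCC samples:

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.analysis_component.CasAnnotator_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.Type;

public class BatchingCasConsumer extends CasAnnotator_ImplBase {

    // Assumed work-item type name; check the DUCC sample code for the real one.
    private static final String WORKITEM_TYPE = "org.apache.uima.ducc.Workitem";
    private static final int BATCH_SIZE = 50;

    private final List<String> batch = new ArrayList<String>();

    @Override
    public void process(CAS cas) throws AnalysisEngineProcessException {
        if (isWorkItemCas(cas)) {
            // The Work Item CAS is routed here after every document CAS of the
            // work item has been processed: flush as if this were
            // collectionProcessComplete for the mini-collection.
            flush();
            return;
        }
        batch.add(cas.getDocumentText()); // placeholder: collect whatever you store
        if (batch.size() >= BATCH_SIZE) {
            flush();                      // incremental commit, as the
        }                                 // PersonTitleDBWriterCasConsumer does
    }

    @Override
    public void destroy() {
        flush(); // extra safety; not guaranteed to run if the process is preempted
    }

    private boolean isWorkItemCas(CAS cas) {
        Type wiType = cas.getTypeSystem().getType(WORKITEM_TYPE);
        return wiType != null
                && cas.getIndexRepository().getAllIndexedFS(wiType).hasNext();
    }

    private void flush() {
        // placeholder: commit the batch to the database / Solr, then clear it
        batch.clear();
    }
}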

Eddie


On Fri, Mar 21, 2014 at 2:12 AM, reshu.agarwal <re...@orkash.com>wrote:

>
> Hi all,
>
> DUCC does not call collectionProcessComplete method of Cas Consumer. Hence
> we can not attempt batch processing in cas consumer and it increases our
> process timing. Is there any other option for that or is it a bug in DUCC?
>
> Thanks in advance
>
> --
> Reshu Agarwal
>
>

Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
Reshu,

With preemption active there is no way to guarantee that the destroy method
will be called in a Job Process. A job submitted to a non-preemptable
class, like "fixed", will not have any preemption.

Eddie


On Sun, Feb 22, 2015 at 11:11 PM, reshu.agarwal <re...@orkash.com>
wrote:

> I am running Uima-AS 2.6.0. Please have a look on logs:
>
> Feb 23, 2015 9:37:44 AM org.apache.uima.adapter.jms.service.UIMA_Service
> initialize(67)
> INFO: UIMA-AS version 2.6.0
>
>
>
> On 02/20/2015 08:20 PM, Jaroslaw Cwiklik wrote:
>
>> Reshu, can you confirm if you are running UIMA-AS 2.6.0 or earlier. You
>> can
>> look at the log for something similar to this:
>>
>> +------------------------------------------------------------------
>> + Service Name:Person Title Annotator
>> + Service Queue Name:PersonTitleAnnotatorQueue
>> + Service Start Time:20 Feb 2015 09:25:43
>> *+ UIMA AS Version:2.6.0
>>   <-----------------------------------------------------------
>> ------------*
>> + UIMA Core Version:2.6.0
>> + OS Name:Linux
>> + OS Version:2.6.32-279.el6.x86_64
>> + OS Architecture:amd64
>> + OS CPU Count:4
>> + JVM Vendor:IBM Corporation
>> + JVM Name:IBM J9 VM
>> + JVM Version:2.6
>>
>> Jerry
>>
>> On Thu, Feb 19, 2015 at 11:26 PM, reshu.agarwal <reshu.agarwal@orkash.com
>> >
>> wrote:
>>
>>  Dear Cwiklik,
>>>
>>> There is only 2 seconds delay between the last log message and
>>> org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl
>>> quiesceAndStop.
>>>
>>> Please have a look on the logs:
>>>
>>>   Process Received a Message. Is Process target for message:true. Target
>>>
>>>> PID:22640
>>>>>>>>>>>>
>>>>>>>>>>>>  configFactory.stop() - stopped route:mina:tcp://localhost:
>>>>>>>>>>>
>>>>>>>>>> 52449?transferExchange=true&sync=false
>>>>>>
>>>>>>  Feb 19, 2015 5:39:54 PM org.apache.uima.aae.controller.
>>>>>
>>>> PrimitiveAnalysisEngineController_impl quiesceAndStop
>>> INFO: Stopping Controller: ducc.jd.queue.13202
>>> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow false
>>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>>> Command=2002.
>>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow false
>>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>>> Command=2002.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow false
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>>> Command=2002.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow false
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>>> Command=2002.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>>> UIMA-AS Service is Stopping, All CASes Have Been Processed
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.aae.controller.
>>> PrimitiveAnalysisEngineController_impl stop
>>> INFO: Stopping Controller: ducc.jd.queue.13202
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>>> Command=2002.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>>> ShutdownNow true
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsInputChannel
>>> stopChannel
>>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.
>>> activemq.JmsOutputChannel
>>> stop
>>> INFO: Controller: ducc.jd.queue.13202 Output Channel Shutdown Completed
>>>
>>> Thanks Reshu.
>>>
>>>
>>>
>>> On 02/20/2015 12:40 AM, Jaroslaw Cwiklik wrote:
>>>
>>>  One possible explanation for destroy() not getting called is that a
>>>> process
>>>> (JP) may be still working on a CAS when Ducc deallocates the process.
>>>> Ducc
>>>> first asks the process to quiesce and stop and allows it 1 minute to
>>>> terminate on its own. If this does not happen, Ducc kills the process
>>>> via
>>>> kill -9. In such case the process will be clobbered and destroy()
>>>> methods
>>>> in UIMA-AS are not called.
>>>> There should be some evidence in JP logs at the very end. Look for
>>>> something like this:
>>>>
>>>>   Process Received a Message. Is Process target for message:true.
>>>>
>>>>> Target PID:27520
>>>>>>>>>>>>
>>>>>>>>>>> configFactory.stop() - stopped
>>>>>
>>>>>> route:mina:tcp://localhost:49338?transferExchange=true&sync=false
>>>>>>
>>>>> 01:56:22.735 - 94:
>>>> org.apache.uima.aae.controller.PrimitiveAnalysisEngineControl
>>>> ler_impl.quiesceAndStop:
>>>> INFO: Stopping Controller: ducc.jd.queue.226091
>>>> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>>>>
>>>> Look at the timestamp of >>>>>>>>> Process Received a Message. Is
>>>> Process
>>>> target for message:true.
>>>> and compare it to a timestamp of the last log message. Does it look like
>>>> there is a long delay?
>>>>
>>>>
>>>> Jerry
>>>>
>>>> On Wed, Feb 18, 2015 at 2:03 AM, reshu.agarwal <
>>>> reshu.agarwal@orkash.com>
>>>> wrote:
>>>>
>>>>   Dear Eddie,
>>>>
>>>>> This problem has been resolved by using destroy method in ducc version
>>>>> 1.0.0 but when I upgrade my ducc version from 1.0.0 to 1.1.0 DUCC
>>>>> didn't
>>>>> call the destroy method.
>>>>>
>>>>> It also do not call the stop method of CollectionReader as well as
>>>>> finalize method of any java class as well as destroy/
>>>>> collectionProcessComplete
>>>>> method of cas consumer.
>>>>>
>>>>> I want to close my connection to Database after completion of job as
>>>>> well
>>>>> as want to use batch processing at cas consumer level like
>>>>> PersonTitleDBWriterCasConsumer.
>>>>>
>>>>> Thanks in advanced.
>>>>>
>>>>> Reshu.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 03/31/2014 04:14 PM, reshu.agarwal wrote:
>>>>>
>>>>>   On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>>>>>
>>>>>>   Another alternative would be to do the final flush in the Cas
>>>>>>
>>>>>>> consumer's
>>>>>>> destroy method.
>>>>>>>
>>>>>>> Another issue to be aware of, in order to balance resources between
>>>>>>> jobs,
>>>>>>> DUCC uses preemption of job processes scheduled in a "fair-share"
>>>>>>> class.
>>>>>>> This may not be acceptable for jobs which are doing incremental
>>>>>>> commits.
>>>>>>> The solution is to schedule the job in a non-preemptable class.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <
>>>>>>> reshu.agarwal@orkash.com
>>>>>>>
>>>>>>>  wrote:
>>>>>>>>
>>>>>>>>     On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>>>>>
>>>>>>>     Hi Reshu,
>>>>>>>>
>>>>>>>>  The Job model in DUCC is for the Collection Reader to send "work
>>>>>>>>> item
>>>>>>>>> CASes", where a work item represents a collection of work to be
>>>>>>>>> done
>>>>>>>>> by a
>>>>>>>>> Job Process. For example, a work item could be a file or a subset
>>>>>>>>> of
>>>>>>>>> a
>>>>>>>>> file
>>>>>>>>> that contains many documents, where each document would be
>>>>>>>>> individually
>>>>>>>>> put
>>>>>>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>>>>>>
>>>>>>>>> DUCC is designed so that after processing the "mini-collection"
>>>>>>>>> represented
>>>>>>>>> by the work item,  the Cas Consumer should flush any data. This is
>>>>>>>>> done by
>>>>>>>>> routing the "work item CAS" to the Cas Consumer, after all work
>>>>>>>>> item
>>>>>>>>> documents are completed, at which point the CC does the flush.
>>>>>>>>>
>>>>>>>>> The sample code described in
>>>>>>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009
>>>>>>>>> uses
>>>>>>>>> the
>>>>>>>>> work item CAS to flush data in exactly this way.
>>>>>>>>>
>>>>>>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a
>>>>>>>>> commit)
>>>>>>>>> in
>>>>>>>>> the process method after every 50 documents.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Eddie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <
>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>>>>>
>>>>>>>>>      Hi Reshu,
>>>>>>>>>
>>>>>>>>>>   The collectionProcessingComplete() method in UIMA-AS has a
>>>>>>>>>>
>>>>>>>>>>> limitation: a
>>>>>>>>>>> Collection Processing Complete request sent to the UIMA-AS
>>>>>>>>>>> Analysis
>>>>>>>>>>> Service
>>>>>>>>>>> is cascaded down to all delegates; however, if a particular
>>>>>>>>>>> delegate
>>>>>>>>>>> is
>>>>>>>>>>> scaled-out, only one of the instances of the delegate will get
>>>>>>>>>>> this
>>>>>>>>>>> call.
>>>>>>>>>>>
>>>>>>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it
>>>>>>>>>>> has
>>>>>>>>>>> no
>>>>>>>>>>> way
>>>>>>>>>>> to deliver a CPC to all instances.
>>>>>>>>>>>
>>>>>>>>>>> The applications we have been running on DUCC have used the Work
>>>>>>>>>>> Item
>>>>>>>>>>> CAS
>>>>>>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>>>>>>> discussed
>>>>>>>>>>> in the first reference above, in the paragraph "Flushing Cached
>>>>>>>>>>> Data".
>>>>>>>>>>>
>>>>>>>>>>> Eddie
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>      On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>>>>>>
>>>>>>>>>>>       Are you using standard UIMA interface code to Solr? If so,
>>>>>>>>>>> which
>>>>>>>>>>>
>>>>>>>>>>>  Cas
>>>>>>>>>>>>
>>>>>>>>>>>>    Consumer?
>>>>>>>>>>>>
>>>>>>>>>>>>  Taking at quick look at the source code for SolrCASConsumer,
>>>>>>>>>>>>> the
>>>>>>>>>>>>> batch
>>>>>>>>>>>>> and
>>>>>>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Eddie
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>       On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>        Hence we can not attempt batch processing in cas
>>>>>>>>>>>>> consumer
>>>>>>>>>>>>> and
>>>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>>>      increases our process timing. Is there any other option
>>>>>>>>>>>>> for
>>>>>>>>>>>>>
>>>>>>>>>>>>>> that or
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   is
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> it a
>>>>>>>>>>>>>>> bug in DUCC?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>       Please reply on this problem as if I am sending
>>>>>>>>>>>>>>> document
>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>> solr
>>>>>>>>>>>>>>> one by
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>     one by cas consumer without using batch process and
>>>>>>>>>>>>>>> committing
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>   solr. It
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> not optimum way to use this. Why ducc is not calling
>>>>>>>>>>>>>> collection
>>>>>>>>>>>>>> Process
>>>>>>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then
>>>>>>>>>>>>>> What
>>>>>>>>>>>>>> is
>>>>>>>>>>>>>> the
>>>>>>>>>>>>>> way to do this?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks in Advanced.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       Hi Eddie,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     I am not using standard UIMA interface code to Solr. I
>>>>>>>>>>>>>> create my
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   own Cas
>>>>>>>>>>>>>>
>>>>>>>>>>>>>    Consumer. I will take a look on that too. But the problem is
>>>>>>>>>>>>> not
>>>>>>>>>>>>>
>>>>>>>>>>>>>  for
>>>>>>>>>>>> particularly to use solr, I can use any source to store my
>>>>>>>>>>>> output. I
>>>>>>>>>>>> want
>>>>>>>>>>>> to do batch processing and want to use
>>>>>>>>>>>> collectionProcessComplete.
>>>>>>>>>>>> Why
>>>>>>>>>>>> DUCC
>>>>>>>>>>>> is not calling it? I check it with UIMA AS also and my cas
>>>>>>>>>>>> consumer
>>>>>>>>>>>> is
>>>>>>>>>>>> working fine with it and also performing batch processing.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>      Hi Eddie,
>>>>>>>>>>>>
>>>>>>>>>>>>    I am using cas consumer similar to apache uima example:
>>>>>>>>>>>>
>>>>>>>>>>>>       "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>>>>>>>>
>>>>>>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks,
>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Hi Eddie,
>>>>>>>>>>
>>>>>>>>>>   You are right I know this fact. PersonTitleDBWriterCasConsumer
>>>>>>>>>> is
>>>>>>>>>>
>>>>>>>>> doing a
>>>>>>>> flush (a commit) in the process method after every 50 documents and
>>>>>>>> if
>>>>>>>> less
>>>>>>>> then 50 documents in cas it will do commit or flush by
>>>>>>>> collectionProcessComplete method. So, If it is not called then those
>>>>>>>> documents can not be committed. That is why I want ducc calls this
>>>>>>>> method.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>> Reshu Agarwal
>>>>>>>>
>>>>>>>>
>>>>>>>>    Hi,
>>>>>>>>
>>>>>>>>  Destroy method worked for me. It did the same what I wanted from
>>>>>>>
>>>>>> CollectionProcessComplete method.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
I am running UIMA-AS 2.6.0. Please have a look at the logs:

Feb 23, 2015 9:37:44 AM org.apache.uima.adapter.jms.service.UIMA_Service initialize(67)
INFO: UIMA-AS version 2.6.0


On 02/20/2015 08:20 PM, Jaroslaw Cwiklik wrote:
> Reshu, can you confirm if you are running UIMA-AS 2.6.0 or earlier. You can
> look at the log for something similar to this:
>
> +------------------------------------------------------------------
> + Service Name:Person Title Annotator
> + Service Queue Name:PersonTitleAnnotatorQueue
> + Service Start Time:20 Feb 2015 09:25:43
> *+ UIMA AS Version:2.6.0
>   <-----------------------------------------------------------------------*
> + UIMA Core Version:2.6.0
> + OS Name:Linux
> + OS Version:2.6.32-279.el6.x86_64
> + OS Architecture:amd64
> + OS CPU Count:4
> + JVM Vendor:IBM Corporation
> + JVM Name:IBM J9 VM
> + JVM Version:2.6
>
> Jerry
>
> On Thu, Feb 19, 2015 at 11:26 PM, reshu.agarwal <re...@orkash.com>
> wrote:
>
>> Dear Cwiklik,
>>
>> There is only 2 seconds delay between the last log message and
>> org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl
>> quiesceAndStop.
>>
>> Please have a look on the logs:
>>
>>   Process Received a Message. Is Process target for message:true. Target
>>>>>>>>>>> PID:22640
>>>>>>>>>>>
>>>>>>>>>> configFactory.stop() - stopped route:mina:tcp://localhost:
>>>>> 52449?transferExchange=true&sync=false
>>>>>
>>>> Feb 19, 2015 5:39:54 PM org.apache.uima.aae.controller.
>> PrimitiveAnalysisEngineController_impl quiesceAndStop
>> INFO: Stopping Controller: ducc.jd.queue.13202
>> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow false
>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>> Command=2002.
>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow false
>> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>> Command=2002.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow false
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>> Command=2002.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow false
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>> Command=2002.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>> UIMA-AS Service is Stopping, All CASes Have Been Processed
>> Feb 19, 2015 5:39:56 PM org.apache.uima.aae.controller.
>> PrimitiveAnalysisEngineController_impl stop
>> INFO: Stopping Controller: ducc.jd.queue.13202
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
>> Command=2002.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
>> ShutdownNow true
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
>> stopChannel
>> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
>> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
>> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsOutputChannel
>> stop
>> INFO: Controller: ducc.jd.queue.13202 Output Channel Shutdown Completed
>>
>> Thanks Reshu.
>>
>>
>>
>> On 02/20/2015 12:40 AM, Jaroslaw Cwiklik wrote:
>>
>>> One possible explanation for destroy() not getting called is that a
>>> process
>>> (JP) may be still working on a CAS when Ducc deallocates the process. Ducc
>>> first asks the process to quiesce and stop and allows it 1 minute to
>>> terminate on its own. If this does not happen, Ducc kills the process via
>>> kill -9. In such case the process will be clobbered and destroy() methods
>>> in UIMA-AS are not called.
>>> There should be some evidence in JP logs at the very end. Look for
>>> something like this:
>>>
>>>   Process Received a Message. Is Process target for message:true.
>>>>>>>>>>> Target PID:27520
>>>> configFactory.stop() - stopped
>>>>> route:mina:tcp://localhost:49338?transferExchange=true&sync=false
>>> 01:56:22.735 - 94:
>>> org.apache.uima.aae.controller.PrimitiveAnalysisEngineControl
>>> ler_impl.quiesceAndStop:
>>> INFO: Stopping Controller: ducc.jd.queue.226091
>>> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>>>
>>> Look at the timestamp of >>>>>>>>> Process Received a Message. Is Process
>>> target for message:true.
>>> and compare it to a timestamp of the last log message. Does it look like
>>> there is a long delay?
>>>
>>>
>>> Jerry
>>>
>>> On Wed, Feb 18, 2015 at 2:03 AM, reshu.agarwal <re...@orkash.com>
>>> wrote:
>>>
>>>   Dear Eddie,
>>>> This problem has been resolved by using destroy method in ducc version
>>>> 1.0.0 but when I upgrade my ducc version from 1.0.0 to 1.1.0 DUCC didn't
>>>> call the destroy method.
>>>>
>>>> It also do not call the stop method of CollectionReader as well as
>>>> finalize method of any java class as well as destroy/
>>>> collectionProcessComplete
>>>> method of cas consumer.
>>>>
>>>> I want to close my connection to Database after completion of job as well
>>>> as want to use batch processing at cas consumer level like
>>>> PersonTitleDBWriterCasConsumer.
>>>>
>>>> Thanks in advanced.
>>>>
>>>> Reshu.
>>>>
>>>>
>>>>
>>>>
>>>> On 03/31/2014 04:14 PM, reshu.agarwal wrote:
>>>>
>>>>   On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>>>>>   Another alternative would be to do the final flush in the Cas
>>>>>> consumer's
>>>>>> destroy method.
>>>>>>
>>>>>> Another issue to be aware of, in order to balance resources between
>>>>>> jobs,
>>>>>> DUCC uses preemption of job processes scheduled in a "fair-share"
>>>>>> class.
>>>>>> This may not be acceptable for jobs which are doing incremental
>>>>>> commits.
>>>>>> The solution is to schedule the job in a non-preemptable class.
>>>>>>
>>>>>>
>>>>>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <
>>>>>> reshu.agarwal@orkash.com
>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>    On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>>>>
>>>>>>>    Hi Reshu,
>>>>>>>
>>>>>>>> The Job model in DUCC is for the Collection Reader to send "work item
>>>>>>>> CASes", where a work item represents a collection of work to be done
>>>>>>>> by a
>>>>>>>> Job Process. For example, a work item could be a file or a subset of
>>>>>>>> a
>>>>>>>> file
>>>>>>>> that contains many documents, where each document would be
>>>>>>>> individually
>>>>>>>> put
>>>>>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>>>>>
>>>>>>>> DUCC is designed so that after processing the "mini-collection"
>>>>>>>> represented
>>>>>>>> by the work item,  the Cas Consumer should flush any data. This is
>>>>>>>> done by
>>>>>>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>>>>>>> documents are completed, at which point the CC does the flush.
>>>>>>>>
>>>>>>>> The sample code described in
>>>>>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009
>>>>>>>> uses
>>>>>>>> the
>>>>>>>> work item CAS to flush data in exactly this way.
>>>>>>>>
>>>>>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a
>>>>>>>> commit)
>>>>>>>> in
>>>>>>>> the process method after every 50 documents.
>>>>>>>>
>>>>>>>> Regards
>>>>>>>> Eddie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <
>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>>>>
>>>>>>>>      Hi Reshu,
>>>>>>>>>   The collectionProcessingComplete() method in UIMA-AS has a
>>>>>>>>>> limitation: a
>>>>>>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>>>>>>> Service
>>>>>>>>>> is cascaded down to all delegates; however, if a particular
>>>>>>>>>> delegate
>>>>>>>>>> is
>>>>>>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>>>>>>> call.
>>>>>>>>>>
>>>>>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has
>>>>>>>>>> no
>>>>>>>>>> way
>>>>>>>>>> to deliver a CPC to all instances.
>>>>>>>>>>
>>>>>>>>>> The applications we have been running on DUCC have used the Work
>>>>>>>>>> Item
>>>>>>>>>> CAS
>>>>>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>>>>>> discussed
>>>>>>>>>> in the first reference above, in the paragraph "Flushing Cached
>>>>>>>>>> Data".
>>>>>>>>>>
>>>>>>>>>> Eddie
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>      On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>>>>>
>>>>>>>>>>       Are you using standard UIMA interface code to Solr? If so,
>>>>>>>>>> which
>>>>>>>>>>
>>>>>>>>>>> Cas
>>>>>>>>>>>
>>>>>>>>>>>    Consumer?
>>>>>>>>>>>
>>>>>>>>>>>> Taking at quick look at the source code for SolrCASConsumer, the
>>>>>>>>>>>> batch
>>>>>>>>>>>> and
>>>>>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Eddie
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>       On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>        Hence we can not attempt batch processing in cas consumer
>>>>>>>>>>>> and
>>>>>>>>>>>> it
>>>>>>>>>>>>
>>>>>>>>>>>>      increases our process timing. Is there any other option for
>>>>>>>>>>>>> that or
>>>>>>>>>>>>>
>>>>>>>>>>>>>   is
>>>>>>>>>>>>>> it a
>>>>>>>>>>>>>> bug in DUCC?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       Please reply on this problem as if I am sending document
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>> solr
>>>>>>>>>>>>>> one by
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     one by cas consumer without using batch process and
>>>>>>>>>>>>>> committing
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   solr. It
>>>>>>>>>>>>> is
>>>>>>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>>>>>>> Process
>>>>>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then
>>>>>>>>>>>>> What
>>>>>>>>>>>>> is
>>>>>>>>>>>>> the
>>>>>>>>>>>>> way to do this?
>>>>>>>>>>>>>
>>>>>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks in Advanced.
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Hi Eddie,
>>>>>>>>>>>>>
>>>>>>>>>>>>>     I am not using standard UIMA interface code to Solr. I
>>>>>>>>>>>>> create my
>>>>>>>>>>>>>
>>>>>>>>>>>>>   own Cas
>>>>>>>>>>>>    Consumer. I will take a look on that too. But the problem is
>>>>>>>>>>>> not
>>>>>>>>>>>>
>>>>>>>>>>> for
>>>>>>>>>>> particularly to use solr, I can use any source to store my
>>>>>>>>>>> output. I
>>>>>>>>>>> want
>>>>>>>>>>> to do batch processing and want to use collectionProcessComplete.
>>>>>>>>>>> Why
>>>>>>>>>>> DUCC
>>>>>>>>>>> is not calling it? I check it with UIMA AS also and my cas
>>>>>>>>>>> consumer
>>>>>>>>>>> is
>>>>>>>>>>> working fine with it and also performing batch processing.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>      Hi Eddie,
>>>>>>>>>>>
>>>>>>>>>>>    I am using cas consumer similar to apache uima example:
>>>>>>>>>>>
>>>>>>>>>>      "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> Reshu Agarwal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     Hi Eddie,
>>>>>>>>>
>>>>>>>>>   You are right I know this fact. PersonTitleDBWriterCasConsumer is
>>>>>>> doing a
>>>>>>> flush (a commit) in the process method after every 50 documents and if
>>>>>>> less
>>>>>>> then 50 documents in cas it will do commit or flush by
>>>>>>> collectionProcessComplete method. So, If it is not called then those
>>>>>>> documents can not be committed. That is why I want ducc calls this
>>>>>>> method.
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>> Reshu Agarwal
>>>>>>>
>>>>>>>
>>>>>>>    Hi,
>>>>>>>
>>>>>> Destroy method worked for me. It did the same what I wanted from
>>>>> CollectionProcessComplete method.
>>>>>
>>>>>
>>>>>


Re: Ducc Problems

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
Reshu, can you confirm whether you are running UIMA-AS 2.6.0 or earlier? You can
look at the log for something similar to this:

+------------------------------------------------------------------
+ Service Name:Person Title Annotator
+ Service Queue Name:PersonTitleAnnotatorQueue
+ Service Start Time:20 Feb 2015 09:25:43
*+ UIMA AS Version:2.6.0
 <-----------------------------------------------------------------------*
+ UIMA Core Version:2.6.0
+ OS Name:Linux
+ OS Version:2.6.32-279.el6.x86_64
+ OS Architecture:amd64
+ OS CPU Count:4
+ JVM Vendor:IBM Corporation
+ JVM Name:IBM J9 VM
+ JVM Version:2.6

Jerry

On Thu, Feb 19, 2015 at 11:26 PM, reshu.agarwal <re...@orkash.com>
wrote:

>
> Dear Cwiklik,
>
> There is only 2 seconds delay between the last log message and
> org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl
> quiesceAndStop.
>
> Please have a look on the logs:
>
>  Process Received a Message. Is Process target for message:true. Target
>>>>>>>>>> PID:22640
>>>>>>>>>>
>>>>>>>>> configFactory.stop() - stopped route:mina:tcp://localhost:
>>>> 52449?transferExchange=true&sync=false
>>>>
>>> Feb 19, 2015 5:39:54 PM org.apache.uima.aae.controller.
> PrimitiveAnalysisEngineController_impl quiesceAndStop
> INFO: Stopping Controller: ducc.jd.queue.13202
> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow false
> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
> Command=2002.
> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow false
> Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
> Command=2002.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow false
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
> Command=2002.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow false
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
> Command=2002.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
> UIMA-AS Service is Stopping, All CASes Have Been Processed
> Feb 19, 2015 5:39:56 PM org.apache.uima.aae.controller.
> PrimitiveAnalysisEngineController_impl stop
> INFO: Stopping Controller: ducc.jd.queue.13202
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR
> Command=2002.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202
> ShutdownNow true
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel
> stopChannel
> INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint:
> queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
> Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsOutputChannel
> stop
> INFO: Controller: ducc.jd.queue.13202 Output Channel Shutdown Completed
>
> Thanks Reshu.
>
>
>
> On 02/20/2015 12:40 AM, Jaroslaw Cwiklik wrote:
>
>> One possible explanation for destroy() not getting called is that a
>> process
>> (JP) may be still working on a CAS when Ducc deallocates the process. Ducc
>> first asks the process to quiesce and stop and allows it 1 minute to
>> terminate on its own. If this does not happen, Ducc kills the process via
>> kill -9. In such case the process will be clobbered and destroy() methods
>> in UIMA-AS are not called.
>> There should be some evidence in JP logs at the very end. Look for
>> something like this:
>>
>>  Process Received a Message. Is Process target for message:true.
>>>>>>>>>>>
>>>>>>>>>> Target PID:27520
>>
>>> configFactory.stop() - stopped
>>>>>
>>>> route:mina:tcp://localhost:49338?transferExchange=true&sync=false
>> 01:56:22.735 - 94:
>> org.apache.uima.aae.controller.PrimitiveAnalysisEngineControl
>> ler_impl.quiesceAndStop:
>> INFO: Stopping Controller: ducc.jd.queue.226091
>> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>>
>> Look at the timestamp of >>>>>>>>> Process Received a Message. Is Process
>> target for message:true.
>> and compare it to a timestamp of the last log message. Does it look like
>> there is a long delay?
>>
>>
>> Jerry
>>
>> On Wed, Feb 18, 2015 at 2:03 AM, reshu.agarwal <re...@orkash.com>
>> wrote:
>>
>>  Dear Eddie,
>>>
>>> This problem has been resolved by using destroy method in ducc version
>>> 1.0.0 but when I upgrade my ducc version from 1.0.0 to 1.1.0 DUCC didn't
>>> call the destroy method.
>>>
>>> It also do not call the stop method of CollectionReader as well as
>>> finalize method of any java class as well as destroy/
>>> collectionProcessComplete
>>> method of cas consumer.
>>>
>>> I want to close my connection to Database after completion of job as well
>>> as want to use batch processing at cas consumer level like
>>> PersonTitleDBWriterCasConsumer.
>>>
>>> Thanks in advanced.
>>>
>>> Reshu.
>>>
>>>
>>>
>>>
>>> On 03/31/2014 04:14 PM, reshu.agarwal wrote:
>>>
>>>  On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>>>>
>>>>  Another alternative would be to do the final flush in the Cas
>>>>> consumer's
>>>>> destroy method.
>>>>>
>>>>> Another issue to be aware of, in order to balance resources between
>>>>> jobs,
>>>>> DUCC uses preemption of job processes scheduled in a "fair-share"
>>>>> class.
>>>>> This may not be acceptable for jobs which are doing incremental
>>>>> commits.
>>>>> The solution is to schedule the job in a non-preemptable class.
>>>>>
>>>>>
>>>>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <
>>>>> reshu.agarwal@orkash.com
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>   On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>>>
>>>>>>   Hi Reshu,
>>>>>>
>>>>>>> The Job model in DUCC is for the Collection Reader to send "work item
>>>>>>> CASes", where a work item represents a collection of work to be done
>>>>>>> by a
>>>>>>> Job Process. For example, a work item could be a file or a subset of
>>>>>>> a
>>>>>>> file
>>>>>>> that contains many documents, where each document would be
>>>>>>> individually
>>>>>>> put
>>>>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>>>>
>>>>>>> DUCC is designed so that after processing the "mini-collection"
>>>>>>> represented
>>>>>>> by the work item,  the Cas Consumer should flush any data. This is
>>>>>>> done by
>>>>>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>>>>>> documents are completed, at which point the CC does the flush.
>>>>>>>
>>>>>>> The sample code described in
>>>>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009
>>>>>>> uses
>>>>>>> the
>>>>>>> work item CAS to flush data in exactly this way.
>>>>>>>
>>>>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a
>>>>>>> commit)
>>>>>>> in
>>>>>>> the process method after every 50 documents.
>>>>>>>
>>>>>>> Regards
>>>>>>> Eddie
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <
>>>>>>> reshu.agarwal@orkash.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>    On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>>>
>>>>>>>     Hi Reshu,
>>>>>>>>
>>>>>>>>  The collectionProcessingComplete() method in UIMA-AS has a
>>>>>>>>> limitation: a
>>>>>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>>>>>> Service
>>>>>>>>> is cascaded down to all delegates; however, if a particular
>>>>>>>>> delegate
>>>>>>>>> is
>>>>>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>>>>>> call.
>>>>>>>>>
>>>>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has
>>>>>>>>> no
>>>>>>>>> way
>>>>>>>>> to deliver a CPC to all instances.
>>>>>>>>>
>>>>>>>>> The applications we have been running on DUCC have used the Work
>>>>>>>>> Item
>>>>>>>>> CAS
>>>>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>>>>> discussed
>>>>>>>>> in the first reference above, in the paragraph "Flushing Cached
>>>>>>>>> Data".
>>>>>>>>>
>>>>>>>>> Eddie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>>>>
>>>>>>>>>      Are you using standard UIMA interface code to Solr? If so,
>>>>>>>>> which
>>>>>>>>>
>>>>>>>>>> Cas
>>>>>>>>>>
>>>>>>>>>>   Consumer?
>>>>>>>>>>
>>>>>>>>>>> Taking at quick look at the source code for SolrCASConsumer, the
>>>>>>>>>>> batch
>>>>>>>>>>> and
>>>>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Eddie
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>      On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>>>>
>>>>>>>>>>>       Hence we can not attempt batch processing in cas consumer
>>>>>>>>>>> and
>>>>>>>>>>> it
>>>>>>>>>>>
>>>>>>>>>>>     increases our process timing. Is there any other option for
>>>>>>>>>>>> that or
>>>>>>>>>>>>
>>>>>>>>>>>>  is
>>>>>>>>>>>>> it a
>>>>>>>>>>>>> bug in DUCC?
>>>>>>>>>>>>>
>>>>>>>>>>>>>      Please reply on this problem as if I am sending document
>>>>>>>>>>>>> in
>>>>>>>>>>>>> solr
>>>>>>>>>>>>> one by
>>>>>>>>>>>>>
>>>>>>>>>>>>>    one by cas consumer without using batch process and
>>>>>>>>>>>>> committing
>>>>>>>>>>>>>
>>>>>>>>>>>>>  solr. It
>>>>>>>>>>>> is
>>>>>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>>>>>> Process
>>>>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then
>>>>>>>>>>>> What
>>>>>>>>>>>> is
>>>>>>>>>>>> the
>>>>>>>>>>>> way to do this?
>>>>>>>>>>>>
>>>>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in Advanced.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>      Hi Eddie,
>>>>>>>>>>>>
>>>>>>>>>>>>    I am not using standard UIMA interface code to Solr. I
>>>>>>>>>>>> create my
>>>>>>>>>>>>
>>>>>>>>>>>>  own Cas
>>>>>>>>>>>
>>>>>>>>>>>   Consumer. I will take a look on that too. But the problem is
>>>>>>>>>>> not
>>>>>>>>>>>
>>>>>>>>>> for
>>>>>>>>>> particularly to use solr, I can use any source to store my
>>>>>>>>>> output. I
>>>>>>>>>> want
>>>>>>>>>> to do batch processing and want to use collectionProcessComplete.
>>>>>>>>>> Why
>>>>>>>>>> DUCC
>>>>>>>>>> is not calling it? I check it with UIMA AS also and my cas
>>>>>>>>>> consumer
>>>>>>>>>> is
>>>>>>>>>> working fine with it and also performing batch processing.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks,
>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Hi Eddie,
>>>>>>>>>>
>>>>>>>>>>   I am using cas consumer similar to apache uima example:
>>>>>>>>>>
>>>>>>>>>     "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>> Reshu Agarwal
>>>>>>>>
>>>>>>>>
>>>>>>>>    Hi Eddie,
>>>>>>>>
>>>>>>>>  You are right I know this fact. PersonTitleDBWriterCasConsumer is
>>>>>>>
>>>>>> doing a
>>>>>> flush (a commit) in the process method after every 50 documents and if
>>>>>> less
>>>>>> then 50 documents in cas it will do commit or flush by
>>>>>> collectionProcessComplete method. So, If it is not called then those
>>>>>> documents can not be committed. That is why I want ducc calls this
>>>>>> method.
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Reshu Agarwal
>>>>>>
>>>>>>
>>>>>>   Hi,
>>>>>>
>>>>> Destroy method worked for me. It did the same what I wanted from
>>>> CollectionProcessComplete method.
>>>>
>>>>
>>>>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
Dear Cwiklik,

There is only a 2-second delay between the last log message and 
org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl 
quiesceAndStop.

Please have a look at the logs:

>>>>>>>>> Process Received a Message. Is Process target for message:true. Target PID:22640
>>> configFactory.stop() - stopped route:mina:tcp://localhost:52449?transferExchange=true&sync=false
Feb 19, 2015 5:39:54 PM org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl quiesceAndStop
INFO: Stopping Controller: ducc.jd.queue.13202
Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow false
Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR Command=2002.
Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow false
Feb 19, 2015 5:39:54 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR Command=2002.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow false
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR Command=2002.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow false
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR Command=2002.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
UIMA-AS Service is Stopping, All CASes Have Been Processed
Feb 19, 2015 5:39:56 PM org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl stop
INFO: Stopping Controller: ducc.jd.queue.13202
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2000 OR Command=2002.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Stopping Service JMS Transport. Service: ducc.jd.queue.13202 ShutdownNow true
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsInputChannel stopChannel
INFO: Controller: ducc.jd.queue.13202 Stopped Listener on Endpoint: queue://ducc.jd.queue.13202 Selector:  Selector:Command=2001.
Feb 19, 2015 5:39:56 PM org.apache.uima.adapter.jms.activemq.JmsOutputChannel stop
INFO: Controller: ducc.jd.queue.13202 Output Channel Shutdown Completed

Thanks, Reshu.


On 02/20/2015 12:40 AM, Jaroslaw Cwiklik wrote:
> One possible explanation for destroy() not getting called is that a process
> (JP) may be still working on a CAS when Ducc deallocates the process. Ducc
> first asks the process to quiesce and stop and allows it 1 minute to
> terminate on its own. If this does not happen, Ducc kills the process via
> kill -9. In such case the process will be clobbered and destroy() methods
> in UIMA-AS are not called.
> There should be some evidence in JP logs at the very end. Look for
> something like this:
>
>>>>>>>>>> Process Received a Message. Is Process target for message:true.
> Target PID:27520
>>>> configFactory.stop() - stopped
> route:mina:tcp://localhost:49338?transferExchange=true&sync=false
> 01:56:22.735 - 94:
> org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl.quiesceAndStop:
> INFO: Stopping Controller: ducc.jd.queue.226091
> Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0
>
> Look at the timestamp of >>>>>>>>> Process Received a Message. Is Process
> target for message:true.
> and compare it to a timestamp of the last log message. Does it look like
> there is a long delay?
>
>
> Jerry
>
> On Wed, Feb 18, 2015 at 2:03 AM, reshu.agarwal <re...@orkash.com>
> wrote:
>
>> Dear Eddie,
>>
>> This problem has been resolved by using destroy method in ducc version
>> 1.0.0 but when I upgrade my ducc version from 1.0.0 to 1.1.0 DUCC didn't
>> call the destroy method.
>>
>> It also do not call the stop method of CollectionReader as well as
>> finalize method of any java class as well as destroy/collectionProcessComplete
>> method of cas consumer.
>>
>> I want to close my connection to Database after completion of job as well
>> as want to use batch processing at cas consumer level like
>> PersonTitleDBWriterCasConsumer.
>>
>> Thanks in advanced.
>>
>> Reshu.
>>
>>
>>
>>
>> On 03/31/2014 04:14 PM, reshu.agarwal wrote:
>>
>>> On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>>>
>>>> Another alternative would be to do the final flush in the Cas consumer's
>>>> destroy method.
>>>>
>>>> Another issue to be aware of, in order to balance resources between jobs,
>>>> DUCC uses preemption of job processes scheduled in a "fair-share" class.
>>>> This may not be acceptable for jobs which are doing incremental commits.
>>>> The solution is to schedule the job in a non-preemptable class.
>>>>
>>>>
>>>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <reshu.agarwal@orkash.com
>>>>> wrote:
>>>>   On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>>>   Hi Reshu,
>>>>>> The Job model in DUCC is for the Collection Reader to send "work item
>>>>>> CASes", where a work item represents a collection of work to be done
>>>>>> by a
>>>>>> Job Process. For example, a work item could be a file or a subset of a
>>>>>> file
>>>>>> that contains many documents, where each document would be individually
>>>>>> put
>>>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>>>
>>>>>> DUCC is designed so that after processing the "mini-collection"
>>>>>> represented
>>>>>> by the work item,  the Cas Consumer should flush any data. This is
>>>>>> done by
>>>>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>>>>> documents are completed, at which point the CC does the flush.
>>>>>>
>>>>>> The sample code described in
>>>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses
>>>>>> the
>>>>>> work item CAS to flush data in exactly this way.
>>>>>>
>>>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a
>>>>>> commit)
>>>>>> in
>>>>>> the process method after every 50 documents.
>>>>>>
>>>>>> Regards
>>>>>> Eddie
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <
>>>>>> reshu.agarwal@orkash.com>
>>>>>> wrote:
>>>>>>
>>>>>>    On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>>
>>>>>>>    Hi Reshu,
>>>>>>>
>>>>>>>> The collectionProcessingComplete() method in UIMA-AS has a
>>>>>>>> limitation: a
>>>>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>>>>> Service
>>>>>>>> is cascaded down to all delegates; however, if a particular delegate
>>>>>>>> is
>>>>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>>>>> call.
>>>>>>>>
>>>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no
>>>>>>>> way
>>>>>>>> to deliver a CPC to all instances.
>>>>>>>>
>>>>>>>> The applications we have been running on DUCC have used the Work Item
>>>>>>>> CAS
>>>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>>>> discussed
>>>>>>>> in the first reference above, in the paragraph "Flushing Cached
>>>>>>>> Data".
>>>>>>>>
>>>>>>>> Eddie
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>>>
>>>>>>>>      Are you using standard UIMA interface code to Solr? If so, which
>>>>>>>>> Cas
>>>>>>>>>
>>>>>>>>>   Consumer?
>>>>>>>>>> Taking at quick look at the source code for SolrCASConsumer, the
>>>>>>>>>> batch
>>>>>>>>>> and
>>>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Eddie
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>      On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>>>
>>>>>>>>>>       Hence we can not attempt batch processing in cas consumer and
>>>>>>>>>> it
>>>>>>>>>>
>>>>>>>>>>>    increases our process timing. Is there any other option for
>>>>>>>>>>> that or
>>>>>>>>>>>
>>>>>>>>>>>> is
>>>>>>>>>>>> it a
>>>>>>>>>>>> bug in DUCC?
>>>>>>>>>>>>
>>>>>>>>>>>>      Please reply on this problem as if I am sending document in
>>>>>>>>>>>> solr
>>>>>>>>>>>> one by
>>>>>>>>>>>>
>>>>>>>>>>>>    one by cas consumer without using batch process and committing
>>>>>>>>>>>>
>>>>>>>>>>> solr. It
>>>>>>>>>>> is
>>>>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>>>>> Process
>>>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then
>>>>>>>>>>> What
>>>>>>>>>>> is
>>>>>>>>>>> the
>>>>>>>>>>> way to do this?
>>>>>>>>>>>
>>>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>>>
>>>>>>>>>>> Thanks in Advanced.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>      Hi Eddie,
>>>>>>>>>>>
>>>>>>>>>>>    I am not using standard UIMA interface code to Solr. I create my
>>>>>>>>>>>
>>>>>>>>>> own Cas
>>>>>>>>>>
>>>>>>>>>>   Consumer. I will take a look on that too. But the problem is not
>>>>>>>>> for
>>>>>>>>> particularly to use solr, I can use any source to store my output. I
>>>>>>>>> want
>>>>>>>>> to do batch processing and want to use collectionProcessComplete.
>>>>>>>>> Why
>>>>>>>>> DUCC
>>>>>>>>> is not calling it? I check it with UIMA AS also and my cas consumer
>>>>>>>>> is
>>>>>>>>> working fine with it and also performing batch processing.
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Thanks,
>>>>>>>>> Reshu Agarwal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     Hi Eddie,
>>>>>>>>>
>>>>>>>>>   I am using cas consumer similar to apache uima example:
>>>>>>>     "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>> Reshu Agarwal
>>>>>>>
>>>>>>>
>>>>>>>    Hi Eddie,
>>>>>>>
>>>>>> You are right I know this fact. PersonTitleDBWriterCasConsumer is
>>>>> doing a
>>>>> flush (a commit) in the process method after every 50 documents and if
>>>>> less
>>>>> then 50 documents in cas it will do commit or flush by
>>>>> collectionProcessComplete method. So, If it is not called then those
>>>>> documents can not be committed. That is why I want ducc calls this
>>>>> method.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Reshu Agarwal
>>>>>
>>>>>
>>>>>   Hi,
>>> Destroy method worked for me. It did the same what I wanted from
>>> CollectionProcessComplete method.
>>>
>>>


Re: Ducc Problems

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
One possible explanation for destroy() not getting called is that a process
(JP) may still be working on a CAS when Ducc deallocates the process. Ducc
first asks the process to quiesce and stop, and allows it 1 minute to
terminate on its own. If this does not happen, Ducc kills the process via
kill -9. In that case the process is clobbered and the destroy() methods
in UIMA-AS are not called.
There should be some evidence in JP logs at the very end. Look for
something like this:

>>>>>>>>> Process Received a Message. Is Process target for message:true.
Target PID:27520
>>> configFactory.stop() - stopped
route:mina:tcp://localhost:49338?transferExchange=true&sync=false
01:56:22.735 - 94:
org.apache.uima.aae.controller.PrimitiveAnalysisEngineController_impl.quiesceAndStop:
INFO: Stopping Controller: ducc.jd.queue.226091
Quiescing UIMA-AS Service. Remaining Number of CASes to Process:0

Look at the timestamp of ">>>>>>>>> Process Received a Message. Is Process
target for message:true."
and compare it to the timestamp of the last log message. Does it look like
there is a long delay?


Jerry

On Wed, Feb 18, 2015 at 2:03 AM, reshu.agarwal <re...@orkash.com>
wrote:

> Dear Eddie,
>
> This problem has been resolved by using destroy method in ducc version
> 1.0.0 but when I upgrade my ducc version from 1.0.0 to 1.1.0 DUCC didn't
> call the destroy method.
>
> It also do not call the stop method of CollectionReader as well as
> finalize method of any java class as well as destroy/collectionProcessComplete
> method of cas consumer.
>
> I want to close my connection to Database after completion of job as well
> as want to use batch processing at cas consumer level like
> PersonTitleDBWriterCasConsumer.
>
> Thanks in advanced.
>
> Reshu.
>
>
>
>
> On 03/31/2014 04:14 PM, reshu.agarwal wrote:
>
>> On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>>
>>> Another alternative would be to do the final flush in the Cas consumer's
>>> destroy method.
>>>
>>> Another issue to be aware of, in order to balance resources between jobs,
>>> DUCC uses preemption of job processes scheduled in a "fair-share" class.
>>> This may not be acceptable for jobs which are doing incremental commits.
>>> The solution is to schedule the job in a non-preemptable class.
>>>
>>>
>>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <reshu.agarwal@orkash.com
>>> >wrote:
>>>
>>>  On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>>
>>>>  Hi Reshu,
>>>>>
>>>>> The Job model in DUCC is for the Collection Reader to send "work item
>>>>> CASes", where a work item represents a collection of work to be done
>>>>> by a
>>>>> Job Process. For example, a work item could be a file or a subset of a
>>>>> file
>>>>> that contains many documents, where each document would be individually
>>>>> put
>>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>>
>>>>> DUCC is designed so that after processing the "mini-collection"
>>>>> represented
>>>>> by the work item,  the Cas Consumer should flush any data. This is
>>>>> done by
>>>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>>>> documents are completed, at which point the CC does the flush.
>>>>>
>>>>> The sample code described in
>>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses
>>>>> the
>>>>> work item CAS to flush data in exactly this way.
>>>>>
>>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a
>>>>> commit)
>>>>> in
>>>>> the process method after every 50 documents.
>>>>>
>>>>> Regards
>>>>> Eddie
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <
>>>>> reshu.agarwal@orkash.com>
>>>>> wrote:
>>>>>
>>>>>   On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>
>>>>>>   Hi Reshu,
>>>>>>
>>>>>>> The collectionProcessingComplete() method in UIMA-AS has a
>>>>>>> limitation: a
>>>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>>>> Service
>>>>>>> is cascaded down to all delegates; however, if a particular delegate
>>>>>>> is
>>>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>>>> call.
>>>>>>>
>>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no
>>>>>>> way
>>>>>>> to deliver a CPC to all instances.
>>>>>>>
>>>>>>> The applications we have been running on DUCC have used the Work Item
>>>>>>> CAS
>>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>>> discussed
>>>>>>> in the first reference above, in the paragraph "Flushing Cached
>>>>>>> Data".
>>>>>>>
>>>>>>> Eddie
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>>> reshu.agarwal@orkash.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>    On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>>
>>>>>>>     Are you using standard UIMA interface code to Solr? If so, which
>>>>>>>> Cas
>>>>>>>>
>>>>>>>>  Consumer?
>>>>>>>>>
>>>>>>>>> Taking at quick look at the source code for SolrCASConsumer, the
>>>>>>>>> batch
>>>>>>>>> and
>>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Eddie
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>     On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>>
>>>>>>>>>      Hence we can not attempt batch processing in cas consumer and
>>>>>>>>> it
>>>>>>>>>
>>>>>>>>>>   increases our process timing. Is there any other option for
>>>>>>>>>> that or
>>>>>>>>>>
>>>>>>>>>>> is
>>>>>>>>>>> it a
>>>>>>>>>>> bug in DUCC?
>>>>>>>>>>>
>>>>>>>>>>>     Please reply on this problem as if I am sending document in
>>>>>>>>>>> solr
>>>>>>>>>>> one by
>>>>>>>>>>>
>>>>>>>>>>>   one by cas consumer without using batch process and committing
>>>>>>>>>>>
>>>>>>>>>> solr. It
>>>>>>>>>> is
>>>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>>>> Process
>>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then
>>>>>>>>>> What
>>>>>>>>>> is
>>>>>>>>>> the
>>>>>>>>>> way to do this?
>>>>>>>>>>
>>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>>
>>>>>>>>>> Thanks in Advanced.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Thanks,
>>>>>>>>>> Reshu Agarwal
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>     Hi Eddie,
>>>>>>>>>>
>>>>>>>>>>   I am not using standard UIMA interface code to Solr. I create my
>>>>>>>>>>
>>>>>>>>> own Cas
>>>>>>>>>
>>>>>>>>>  Consumer. I will take a look on that too. But the problem is not
>>>>>>>> for
>>>>>>>> particularly to use solr, I can use any source to store my output. I
>>>>>>>> want
>>>>>>>> to do batch processing and want to use collectionProcessComplete.
>>>>>>>> Why
>>>>>>>> DUCC
>>>>>>>> is not calling it? I check it with UIMA AS also and my cas consumer
>>>>>>>> is
>>>>>>>> working fine with it and also performing batch processing.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>> Reshu Agarwal
>>>>>>>>
>>>>>>>>
>>>>>>>>    Hi Eddie,
>>>>>>>>
>>>>>>>>  I am using cas consumer similar to apache uima example:
>>>>>>>
>>>>>>    "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Reshu Agarwal
>>>>>>
>>>>>>
>>>>>>   Hi Eddie,
>>>>>>
>>>>> You are right I know this fact. PersonTitleDBWriterCasConsumer is
>>>> doing a
>>>> flush (a commit) in the process method after every 50 documents and if
>>>> less
>>>> then 50 documents in cas it will do commit or flush by
>>>> collectionProcessComplete method. So, If it is not called then those
>>>> documents can not be committed. That is why I want ducc calls this
>>>> method.
>>>>
>>>> --
>>>> Thanks,
>>>> Reshu Agarwal
>>>>
>>>>
>>>>  Hi,
>>
>> Destroy method worked for me. It did the same what I wanted from
>> CollectionProcessComplete method.
>>
>>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
Dear Eddie,

This problem was resolved by using the destroy method in DUCC version 1.0.0,
but after upgrading from 1.0.0 to 1.1.0, DUCC no longer calls the destroy
method.

It also does not call the stop method of the CollectionReader, the finalize
method of any Java class, or the destroy/collectionProcessComplete method of
the Cas Consumer.

I want to close my database connection after the job completes, and I want to
use batch processing at the Cas Consumer level, as in
PersonTitleDBWriterCasConsumer.

Thanks in advance.

Reshu.



On 03/31/2014 04:14 PM, reshu.agarwal wrote:
> On 03/28/2014 05:28 PM, Eddie Epstein wrote:
>> Another alternative would be to do the final flush in the Cas consumer's
>> destroy method.
>>
>> Another issue to be aware of, in order to balance resources between 
>> jobs,
>> DUCC uses preemption of job processes scheduled in a "fair-share" class.
>> This may not be acceptable for jobs which are doing incremental commits.
>> The solution is to schedule the job in a non-preemptable class.
>>
>>
>> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal 
>> <re...@orkash.com>wrote:
>>
>>> On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>>
>>>> Hi Reshu,
>>>>
>>>> The Job model in DUCC is for the Collection Reader to send "work item
>>>> CASes", where a work item represents a collection of work to be 
>>>> done by a
>>>> Job Process. For example, a work item could be a file or a subset of a
>>>> file
>>>> that contains many documents, where each document would be 
>>>> individually
>>>> put
>>>> into a CAS by the Cas Multiplier in the Job Process.
>>>>
>>>> DUCC is designed so that after processing the "mini-collection"
>>>> represented
>>>> by the work item,  the Cas Consumer should flush any data. This is 
>>>> done by
>>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>>> documents are completed, at which point the CC does the flush.
>>>>
>>>> The sample code described in
>>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses
>>>> the
>>>> work item CAS to flush data in exactly this way.
>>>>
>>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a 
>>>> commit)
>>>> in
>>>> the process method after every 50 documents.
>>>>
>>>> Regards
>>>> Eddie
>>>>
>>>>
>>>>
>>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal 
>>>> <re...@orkash.com>
>>>> wrote:
>>>>
>>>>   On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>>   Hi Reshu,
>>>>>> The collectionProcessingComplete() method in UIMA-AS has a 
>>>>>> limitation: a
>>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>>> Service
>>>>>> is cascaded down to all delegates; however, if a particular 
>>>>>> delegate is
>>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>>> call.
>>>>>>
>>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it 
>>>>>> has no
>>>>>> way
>>>>>> to deliver a CPC to all instances.
>>>>>>
>>>>>> The applications we have been running on DUCC have used the Work 
>>>>>> Item
>>>>>> CAS
>>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>>> discussed
>>>>>> in the first reference above, in the paragraph "Flushing Cached 
>>>>>> Data".
>>>>>>
>>>>>> Eddie
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>>> reshu.agarwal@orkash.com>
>>>>>> wrote:
>>>>>>
>>>>>>    On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>>
>>>>>>>    Are you using standard UIMA interface code to Solr? If so, 
>>>>>>> which Cas
>>>>>>>
>>>>>>>> Consumer?
>>>>>>>>
>>>>>>>> Taking at quick look at the source code for SolrCASConsumer, 
>>>>>>>> the batch
>>>>>>>> and
>>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Eddie
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>>> reshu.agarwal@orkash.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>     On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>>
>>>>>>>>      Hence we can not attempt batch processing in cas consumer 
>>>>>>>> and it
>>>>>>>>>   increases our process timing. Is there any other option for 
>>>>>>>>> that or
>>>>>>>>>> is
>>>>>>>>>> it a
>>>>>>>>>> bug in DUCC?
>>>>>>>>>>
>>>>>>>>>>     Please reply on this problem as if I am sending document 
>>>>>>>>>> in solr
>>>>>>>>>> one by
>>>>>>>>>>
>>>>>>>>>>   one by cas consumer without using batch process and committing
>>>>>>>>> solr. It
>>>>>>>>> is
>>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>>> Process
>>>>>>>>> Complete method of Cas Consumer? And If I want to do that then 
>>>>>>>>> What
>>>>>>>>> is
>>>>>>>>> the
>>>>>>>>> way to do this?
>>>>>>>>>
>>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>>
>>>>>>>>> Thanks in Advanced.
>>>>>>>>>
>>>>>>>>> -- 
>>>>>>>>> Thanks,
>>>>>>>>> Reshu Agarwal
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>     Hi Eddie,
>>>>>>>>>
>>>>>>>>>   I am not using standard UIMA interface code to Solr. I 
>>>>>>>>> create my
>>>>>>>> own Cas
>>>>>>>>
>>>>>>> Consumer. I will take a look on that too. But the problem is not 
>>>>>>> for
>>>>>>> particularly to use solr, I can use any source to store my 
>>>>>>> output. I
>>>>>>> want
>>>>>>> to do batch processing and want to use 
>>>>>>> collectionProcessComplete. Why
>>>>>>> DUCC
>>>>>>> is not calling it? I check it with UIMA AS also and my cas 
>>>>>>> consumer is
>>>>>>> working fine with it and also performing batch processing.
>>>>>>>
>>>>>>> -- 
>>>>>>> Thanks,
>>>>>>> Reshu Agarwal
>>>>>>>
>>>>>>>
>>>>>>>    Hi Eddie,
>>>>>>>
>>>>>> I am using cas consumer similar to apache uima example:
>>>>>    "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>>> PersonTitleDBWriterCasConsumer.java"
>>>>>
>>>>> -- 
>>>>> Thanks,
>>>>> Reshu Agarwal
>>>>>
>>>>>
>>>>>   Hi Eddie,
>>> You are right I know this fact. PersonTitleDBWriterCasConsumer is 
>>> doing a
>>> flush (a commit) in the process method after every 50 documents and 
>>> if less
>>> then 50 documents in cas it will do commit or flush by
>>> collectionProcessComplete method. So, If it is not called then those
>>> documents can not be committed. That is why I want ducc calls this 
>>> method.
>>>
>>> -- 
>>> Thanks,
>>> Reshu Agarwal
>>>
>>>
> Hi,
>
> Destroy method worked for me. It did the same what I wanted from 
> CollectionProcessComplete method.
>


Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
On 03/28/2014 05:28 PM, Eddie Epstein wrote:
> Another alternative would be to do the final flush in the Cas consumer's
> destroy method.
>
> Another issue to be aware of, in order to balance resources between jobs,
> DUCC uses preemption of job processes scheduled in a "fair-share" class.
> This may not be acceptable for jobs which are doing incremental commits.
> The solution is to schedule the job in a non-preemptable class.
>
>
> On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <re...@orkash.com>wrote:
>
>> On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>>
>>> Hi Reshu,
>>>
>>> The Job model in DUCC is for the Collection Reader to send "work item
>>> CASes", where a work item represents a collection of work to be done by a
>>> Job Process. For example, a work item could be a file or a subset of a
>>> file
>>> that contains many documents, where each document would be individually
>>> put
>>> into a CAS by the Cas Multiplier in the Job Process.
>>>
>>> DUCC is designed so that after processing the "mini-collection"
>>> represented
>>> by the work item,  the Cas Consumer should flush any data. This is done by
>>> routing the "work item CAS" to the Cas Consumer, after all work item
>>> documents are completed, at which point the CC does the flush.
>>>
>>> The sample code described in
>>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses
>>> the
>>> work item CAS to flush data in exactly this way.
>>>
>>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a commit)
>>> in
>>> the process method after every 50 documents.
>>>
>>> Regards
>>> Eddie
>>>
>>>
>>>
>>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <re...@orkash.com>
>>> wrote:
>>>
>>>   On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>>   Hi Reshu,
>>>>> The collectionProcessingComplete() method in UIMA-AS has a limitation: a
>>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>>> Service
>>>>> is cascaded down to all delegates; however, if a particular delegate is
>>>>> scaled-out, only one of the instances of the delegate will get this
>>>>> call.
>>>>>
>>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no
>>>>> way
>>>>> to deliver a CPC to all instances.
>>>>>
>>>>> The applications we have been running on DUCC have used the Work Item
>>>>> CAS
>>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>>> discussed
>>>>> in the first reference above, in the paragraph "Flushing Cached Data".
>>>>>
>>>>> Eddie
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>>> reshu.agarwal@orkash.com>
>>>>> wrote:
>>>>>
>>>>>    On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>>
>>>>>>    Are you using standard UIMA interface code to Solr? If so, which Cas
>>>>>>
>>>>>>> Consumer?
>>>>>>>
>>>>>>> Taking at quick look at the source code for SolrCASConsumer, the batch
>>>>>>> and
>>>>>>> collection process complete methods appear to do nothing.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Eddie
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>>> reshu.agarwal@orkash.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>     On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>>
>>>>>>>      Hence we can not attempt batch processing in cas consumer and it
>>>>>>>>   increases our process timing. Is there any other option for that or
>>>>>>>>> is
>>>>>>>>> it a
>>>>>>>>> bug in DUCC?
>>>>>>>>>
>>>>>>>>>     Please reply on this problem as if I am sending document in solr
>>>>>>>>> one by
>>>>>>>>>
>>>>>>>>>   one by cas consumer without using batch process and committing
>>>>>>>> solr. It
>>>>>>>> is
>>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>>> Process
>>>>>>>> Complete method of Cas Consumer? And If I want to do that then What
>>>>>>>> is
>>>>>>>> the
>>>>>>>> way to do this?
>>>>>>>>
>>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>>
>>>>>>>> Thanks in Advanced.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Thanks,
>>>>>>>> Reshu Agarwal
>>>>>>>>
>>>>>>>>
>>>>>>>>     Hi Eddie,
>>>>>>>>
>>>>>>>>   I am not using standard UIMA interface code to Solr. I create my
>>>>>>> own Cas
>>>>>>>
>>>>>> Consumer. I will take a look on that too. But the problem is not for
>>>>>> particularly to use solr, I can use any source to store my output. I
>>>>>> want
>>>>>> to do batch processing and want to use collectionProcessComplete. Why
>>>>>> DUCC
>>>>>> is not calling it? I check it with UIMA AS also and my cas consumer is
>>>>>> working fine with it and also performing batch processing.
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Reshu Agarwal
>>>>>>
>>>>>>
>>>>>>    Hi Eddie,
>>>>>>
>>>>> I am using cas consumer similar to apache uima example:
>>>>    "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>>> PersonTitleDBWriterCasConsumer.java"
>>>>
>>>> --
>>>> Thanks,
>>>> Reshu Agarwal
>>>>
>>>>
>>>>   Hi Eddie,
>> You are right I know this fact. PersonTitleDBWriterCasConsumer is doing a
>> flush (a commit) in the process method after every 50 documents and if less
>> then 50 documents in cas it will do commit or flush by
>> collectionProcessComplete method. So, If it is not called then those
>> documents can not be committed. That is why I want ducc calls this method.
>>
>> --
>> Thanks,
>> Reshu Agarwal
>>
>>
Hi,

The destroy method worked for me. It did what I wanted from the
collectionProcessComplete method.

-- 
Thanks,
Reshu Agarwal


Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
Another alternative would be to do the final flush in the Cas consumer's
destroy method.
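
A rough sketch of that approach (this is not the DUCC sample code; the JDBC
connection handling and the staging details are placeholders):

    import java.sql.Connection;

    import org.apache.uima.cas.CAS;
    import org.apache.uima.collection.CasConsumer_ImplBase;
    import org.apache.uima.resource.ResourceProcessException;

    public class DbWriterCasConsumer extends CasConsumer_ImplBase {
      private Connection connection;   // opened in initialize(), omitted here

      public void processCas(CAS cas) throws ResourceProcessException {
        // stage/insert rows for this document; commits happen in batches
      }

      // Final flush: runs when the process shuts down cleanly
      // (it will not run if the process is killed with kill -9).
      public void destroy() {
        try {
          connection.commit();   // flush whatever is still pending
          connection.close();
        } catch (Exception e) {
          // log and ignore; the process is going away
        }
        super.destroy();
      }
    }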

Another issue to be aware of: in order to balance resources between jobs,
DUCC uses preemption of job processes scheduled in a "fair-share" class.
This may not be acceptable for jobs which are doing incremental commits.
The solution is to schedule the job in a non-preemptable class.


On Fri, Mar 28, 2014 at 1:22 AM, reshu.agarwal <re...@orkash.com>wrote:

> On 03/28/2014 01:28 AM, Eddie Epstein wrote:
>
>> Hi Reshu,
>>
>> The Job model in DUCC is for the Collection Reader to send "work item
>> CASes", where a work item represents a collection of work to be done by a
>> Job Process. For example, a work item could be a file or a subset of a
>> file
>> that contains many documents, where each document would be individually
>> put
>> into a CAS by the Cas Multiplier in the Job Process.
>>
>> DUCC is designed so that after processing the "mini-collection"
>> represented
>> by the work item,  the Cas Consumer should flush any data. This is done by
>> routing the "work item CAS" to the Cas Consumer, after all work item
>> documents are completed, at which point the CC does the flush.
>>
>> The sample code described in
>> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses
>> the
>> work item CAS to flush data in exactly this way.
>>
>> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a commit)
>> in
>> the process method after every 50 documents.
>>
>> Regards
>> Eddie
>>
>>
>>
>> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <re...@orkash.com>
>> wrote:
>>
>>  On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>>
>>>  Hi Reshu,
>>>>
>>>> The collectionProcessingComplete() method in UIMA-AS has a limitation: a
>>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>>> Service
>>>> is cascaded down to all delegates; however, if a particular delegate is
>>>> scaled-out, only one of the instances of the delegate will get this
>>>> call.
>>>>
>>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no
>>>> way
>>>> to deliver a CPC to all instances.
>>>>
>>>> The applications we have been running on DUCC have used the Work Item
>>>> CAS
>>>> as a signal to CAS consumers to do CPC level processing. That is
>>>> discussed
>>>> in the first reference above, in the paragraph "Flushing Cached Data".
>>>>
>>>> Eddie
>>>>
>>>>
>>>>
>>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <
>>>> reshu.agarwal@orkash.com>
>>>> wrote:
>>>>
>>>>   On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>
>>>>>   Are you using standard UIMA interface code to Solr? If so, which Cas
>>>>>
>>>>>> Consumer?
>>>>>>
>>>>>> Taking at quick look at the source code for SolrCASConsumer, the batch
>>>>>> and
>>>>>> collection process complete methods appear to do nothing.
>>>>>>
>>>>>> Thanks,
>>>>>> Eddie
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>>> reshu.agarwal@orkash.com>
>>>>>> wrote:
>>>>>>
>>>>>>    On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>>
>>>>>>     Hence we can not attempt batch processing in cas consumer and it
>>>>>>>
>>>>>>>  increases our process timing. Is there any other option for that or
>>>>>>>> is
>>>>>>>> it a
>>>>>>>> bug in DUCC?
>>>>>>>>
>>>>>>>>    Please reply on this problem as if I am sending document in solr
>>>>>>>> one by
>>>>>>>>
>>>>>>>>  one by cas consumer without using batch process and committing
>>>>>>> solr. It
>>>>>>> is
>>>>>>> not optimum way to use this. Why ducc is not calling collection
>>>>>>> Process
>>>>>>> Complete method of Cas Consumer? And If I want to do that then What
>>>>>>> is
>>>>>>> the
>>>>>>> way to do this?
>>>>>>>
>>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>>
>>>>>>> Thanks in Advanced.
>>>>>>>
>>>>>>> --
>>>>>>> Thanks,
>>>>>>> Reshu Agarwal
>>>>>>>
>>>>>>>
>>>>>>>    Hi Eddie,
>>>>>>>
>>>>>>>  I am not using standard UIMA interface code to Solr. I create my
>>>>>> own Cas
>>>>>>
>>>>> Consumer. I will take a look on that too. But the problem is not for
>>>>> particularly to use solr, I can use any source to store my output. I
>>>>> want
>>>>> to do batch processing and want to use collectionProcessComplete. Why
>>>>> DUCC
>>>>> is not calling it? I check it with UIMA AS also and my cas consumer is
>>>>> working fine with it and also performing batch processing.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Reshu Agarwal
>>>>>
>>>>>
>>>>>   Hi Eddie,
>>>>>
>>>> I am using cas consumer similar to apache uima example:
>>>
>>>   "apache-uima/examples/src/org/apache/uima/examples/cpe/
>>> PersonTitleDBWriterCasConsumer.java"
>>>
>>> --
>>> Thanks,
>>> Reshu Agarwal
>>>
>>>
>>>  Hi Eddie,
>
> You are right I know this fact. PersonTitleDBWriterCasConsumer is doing a
> flush (a commit) in the process method after every 50 documents and if less
> then 50 documents in cas it will do commit or flush by
> collectionProcessComplete method. So, If it is not called then those
> documents can not be committed. That is why I want ducc calls this method.
>
> --
> Thanks,
> Reshu Agarwal
>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
On 03/28/2014 01:28 AM, Eddie Epstein wrote:
> Hi Reshu,
>
> The Job model in DUCC is for the Collection Reader to send "work item
> CASes", where a work item represents a collection of work to be done by a
> Job Process. For example, a work item could be a file or a subset of a file
> that contains many documents, where each document would be individually put
> into a CAS by the Cas Multiplier in the Job Process.
>
> DUCC is designed so that after processing the "mini-collection" represented
> by the work item,  the Cas Consumer should flush any data. This is done by
> routing the "work item CAS" to the Cas Consumer, after all work item
> documents are completed, at which point the CC does the flush.
>
> The sample code described in
> http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses the
> work item CAS to flush data in exactly this way.
>
> Note that the PersonTitleDBWriterCasConsumer is doing a flush (a commit) in
> the process method after every 50 documents.
>
> Regards
> Eddie
>
>
>
> On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <re...@orkash.com>wrote:
>
>> On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>>
>>> Hi Reshu,
>>>
>>> The collectionProcessingComplete() method in UIMA-AS has a limitation: a
>>> Collection Processing Complete request sent to the UIMA-AS Analysis
>>> Service
>>> is cascaded down to all delegates; however, if a particular delegate is
>>> scaled-out, only one of the instances of the delegate will get this call.
>>>
>>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no way
>>> to deliver a CPC to all instances.
>>>
>>> The applications we have been running on DUCC have used the Work Item CAS
>>> as a signal to CAS consumers to do CPC level processing. That is discussed
>>> in the first reference above, in the paragraph "Flushing Cached Data".
>>>
>>> Eddie
>>>
>>>
>>>
>>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <re...@orkash.com>
>>> wrote:
>>>
>>>   On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>>   Are you using standard UIMA interface code to Solr? If so, which Cas
>>>>> Consumer?
>>>>>
>>>>> Taking at quick look at the source code for SolrCASConsumer, the batch
>>>>> and
>>>>> collection process complete methods appear to do nothing.
>>>>>
>>>>> Thanks,
>>>>> Eddie
>>>>>
>>>>>
>>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>>> reshu.agarwal@orkash.com>
>>>>> wrote:
>>>>>
>>>>>    On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>>
>>>>>>    Hence we can not attempt batch processing in cas consumer and it
>>>>>>
>>>>>>> increases our process timing. Is there any other option for that or is
>>>>>>> it a
>>>>>>> bug in DUCC?
>>>>>>>
>>>>>>>    Please reply on this problem as if I am sending document in solr
>>>>>>> one by
>>>>>>>
>>>>>> one by cas consumer without using batch process and committing solr. It
>>>>>> is
>>>>>> not optimum way to use this. Why ducc is not calling collection Process
>>>>>> Complete method of Cas Consumer? And If I want to do that then What is
>>>>>> the
>>>>>> way to do this?
>>>>>>
>>>>>> I am not able to find any thing about this in DUCC book.
>>>>>>
>>>>>> Thanks in Advanced.
>>>>>>
>>>>>> --
>>>>>> Thanks,
>>>>>> Reshu Agarwal
>>>>>>
>>>>>>
>>>>>>    Hi Eddie,
>>>>>>
>>>>> I am not using standard UIMA interface code to Solr. I create my own Cas
>>>> Consumer. I will take a look on that too. But the problem is not for
>>>> particularly to use solr, I can use any source to store my output. I want
>>>> to do batch processing and want to use collectionProcessComplete. Why
>>>> DUCC
>>>> is not calling it? I check it with UIMA AS also and my cas consumer is
>>>> working fine with it and also performing batch processing.
>>>>
>>>> --
>>>> Thanks,
>>>> Reshu Agarwal
>>>>
>>>>
>>>>   Hi Eddie,
>> I am using cas consumer similar to apache uima example:
>>
>>   "apache-uima/examples/src/org/apache/uima/examples/cpe/
>> PersonTitleDBWriterCasConsumer.java"
>>
>> --
>> Thanks,
>> Reshu Agarwal
>>
>>
Hi Eddie,

You are right; I know this. PersonTitleDBWriterCasConsumer does a flush (a
commit) in the process method after every 50 documents, and whatever is left
(fewer than 50 documents) is committed or flushed by the
collectionProcessComplete method. So if that method is not called, those
documents are never committed. That is why I want DUCC to call this method.
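
For reference, the pattern is roughly the following (a sketch of the logic,
not the exact example code; stageRow and commitBatch are placeholders):

    import org.apache.uima.cas.CAS;
    import org.apache.uima.collection.CasConsumer_ImplBase;
    import org.apache.uima.resource.ResourceProcessException;
    import org.apache.uima.util.ProcessTrace;

    public class FiftyDocBatchConsumer extends CasConsumer_ImplBase {
      private static final int BATCH_SIZE = 50;
      private int pending = 0;

      public void processCas(CAS cas) throws ResourceProcessException {
        stageRow(cas);                 // one row per document
        if (++pending >= BATCH_SIZE) {
          commitBatch();               // flushed every 50 documents
          pending = 0;
        }
      }

      // Flushes the remainder (< 50 documents); those documents stay
      // uncommitted if this method is never called.
      public void collectionProcessComplete(ProcessTrace trace) {
        if (pending > 0) {
          commitBatch();
          pending = 0;
        }
      }

      private void stageRow(CAS cas) {
        // add the document's output to the pending batch
      }

      private void commitBatch() {
        // commit the pending batch to the database
      }
    }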

-- 
Thanks,
Reshu Agarwal


Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Reshu,

The Job model in DUCC is for the Collection Reader to send "work item
CASes", where a work item represents a collection of work to be done by a
Job Process. For example, a work item could be a file or a subset of a file
that contains many documents, where each document would be individually put
into a CAS by the Cas Multiplier in the Job Process.

DUCC is designed so that after processing the "mini-collection" represented
by the work item,  the Cas Consumer should flush any data. This is done by
routing the "work item CAS" to the Cas Consumer, after all work item
documents are completed, at which point the CC does the flush.
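
Concretely, the Cas Consumer can treat the work item CAS itself as the flush
trigger when it arrives in the process method. A minimal sketch, assuming the
work item is marked by a feature structure of type
org.apache.uima.ducc.Workitem (an assumption here; the exact type name must
match the job's type system, and the staging/flush helpers are placeholders):

    import org.apache.uima.cas.CAS;
    import org.apache.uima.cas.Type;

    public class WorkItemFlushHelper {

      // Route each incoming CAS: document CASes are staged; the work item CAS
      // coming back marks the end of the "mini-collection" and triggers the flush.
      public void handle(CAS cas) {
        if (isWorkItemCas(cas)) {
          flush();
        } else {
          addToBatch(cas);
        }
      }

      // Assumed check: look for a DUCC work item feature structure in the CAS.
      private boolean isWorkItemCas(CAS cas) {
        Type t = cas.getTypeSystem().getType("org.apache.uima.ducc.Workitem");
        return t != null && cas.getIndexRepository().getAllIndexedFS(t).hasNext();
      }

      private void addToBatch(CAS cas) {
        // stage this document's output
      }

      private void flush() {
        // commit everything staged for the current work item
      }
    }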

The sample code described in
http://uima.apache.org/d/uima-ducc-1.0.0/duccbook.html#x1-1380009 uses the
work item CAS to flush data in exactly this way.

Note that the PersonTitleDBWriterCasConsumer is doing a flush (a commit) in
the process method after every 50 documents.

Regards
Eddie



On Thu, Mar 27, 2014 at 1:35 AM, reshu.agarwal <re...@orkash.com>wrote:

> On 03/26/2014 11:34 PM, Eddie Epstein wrote:
>
>> Hi Reshu,
>>
>> The collectionProcessingComplete() method in UIMA-AS has a limitation: a
>> Collection Processing Complete request sent to the UIMA-AS Analysis
>> Service
>> is cascaded down to all delegates; however, if a particular delegate is
>> scaled-out, only one of the instances of the delegate will get this call.
>>
>> Since DUCC is using UIMA-AS to scale out the Job processes, it has no way
>> to deliver a CPC to all instances.
>>
>> The applications we have been running on DUCC have used the Work Item CAS
>> as a signal to CAS consumers to do CPC level processing. That is discussed
>> in the first reference above, in the paragraph "Flushing Cached Data".
>>
>> Eddie
>>
>>
>>
>> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <re...@orkash.com>
>> wrote:
>>
>>  On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>>
>>>  Are you using standard UIMA interface code to Solr? If so, which Cas
>>>> Consumer?
>>>>
>>>> Taking at quick look at the source code for SolrCASConsumer, the batch
>>>> and
>>>> collection process complete methods appear to do nothing.
>>>>
>>>> Thanks,
>>>> Eddie
>>>>
>>>>
>>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <
>>>> reshu.agarwal@orkash.com>
>>>> wrote:
>>>>
>>>>   On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>
>>>>>   Hence we can not attempt batch processing in cas consumer and it
>>>>>
>>>>>> increases our process timing. Is there any other option for that or is
>>>>>> it a
>>>>>> bug in DUCC?
>>>>>>
>>>>>>   Please reply on this problem as if I am sending document in solr
>>>>>> one by
>>>>>>
>>>>> one by cas consumer without using batch process and committing solr. It
>>>>> is
>>>>> not optimum way to use this. Why ducc is not calling collection Process
>>>>> Complete method of Cas Consumer? And If I want to do that then What is
>>>>> the
>>>>> way to do this?
>>>>>
>>>>> I am not able to find any thing about this in DUCC book.
>>>>>
>>>>> Thanks in Advanced.
>>>>>
>>>>> --
>>>>> Thanks,
>>>>> Reshu Agarwal
>>>>>
>>>>>
>>>>>   Hi Eddie,
>>>>>
>>>> I am not using standard UIMA interface code to Solr. I create my own Cas
>>> Consumer. I will take a look on that too. But the problem is not for
>>> particularly to use solr, I can use any source to store my output. I want
>>> to do batch processing and want to use collectionProcessComplete. Why
>>> DUCC
>>> is not calling it? I check it with UIMA AS also and my cas consumer is
>>> working fine with it and also performing batch processing.
>>>
>>> --
>>> Thanks,
>>> Reshu Agarwal
>>>
>>>
>>>  Hi Eddie,
>
> I am using cas consumer similar to apache uima example:
>
>  "apache-uima/examples/src/org/apache/uima/examples/cpe/
> PersonTitleDBWriterCasConsumer.java"
>
> --
> Thanks,
> Reshu Agarwal
>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
On 03/26/2014 11:34 PM, Eddie Epstein wrote:
> Hi Reshu,
>
> The collectionProcessingComplete() method in UIMA-AS has a limitation: a
> Collection Processing Complete request sent to the UIMA-AS Analysis Service
> is cascaded down to all delegates; however, if a particular delegate is
> scaled-out, only one of the instances of the delegate will get this call.
>
> Since DUCC is using UIMA-AS to scale out the Job processes, it has no way
> to deliver a CPC to all instances.
>
> The applications we have been running on DUCC have used the Work Item CAS
> as a signal to CAS consumers to do CPC level processing. That is discussed
> in the first reference above, in the paragraph "Flushing Cached Data".
>
> Eddie
>
>
>
> On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <re...@orkash.com>wrote:
>
>> On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>>
>>> Are you using standard UIMA interface code to Solr? If so, which Cas
>>> Consumer?
>>>
>>> Taking at quick look at the source code for SolrCASConsumer, the batch and
>>> collection process complete methods appear to do nothing.
>>>
>>> Thanks,
>>> Eddie
>>>
>>>
>>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <re...@orkash.com>
>>> wrote:
>>>
>>>   On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>>   Hence we can not attempt batch processing in cas consumer and it
>>>>> increases our process timing. Is there any other option for that or is
>>>>> it a
>>>>> bug in DUCC?
>>>>>
>>>>>   Please reply on this problem as if I am sending document in solr one by
>>>> one by cas consumer without using batch process and committing solr. It
>>>> is
>>>> not optimum way to use this. Why ducc is not calling collection Process
>>>> Complete method of Cas Consumer? And If I want to do that then What is
>>>> the
>>>> way to do this?
>>>>
>>>> I am not able to find any thing about this in DUCC book.
>>>>
>>>> Thanks in Advanced.
>>>>
>>>> --
>>>> Thanks,
>>>> Reshu Agarwal
>>>>
>>>>
>>>>   Hi Eddie,
>> I am not using standard UIMA interface code to Solr. I create my own Cas
>> Consumer. I will take a look on that too. But the problem is not for
>> particularly to use solr, I can use any source to store my output. I want
>> to do batch processing and want to use collectionProcessComplete. Why DUCC
>> is not calling it? I check it with UIMA AS also and my cas consumer is
>> working fine with it and also performing batch processing.
>>
>> --
>> Thanks,
>> Reshu Agarwal
>>
>>
Hi Eddie,

I am using a Cas Consumer similar to the Apache UIMA example:

  "apache-uima/examples/src/org/apache/uima/examples/cpe/PersonTitleDBWriterCasConsumer.java"

-- 
Thanks,
Reshu Agarwal


Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
Hi Reshu,

The collectionProcessingComplete() method in UIMA-AS has a limitation: a
Collection Processing Complete request sent to the UIMA-AS Analysis Service
is cascaded down to all delegates; however, if a particular delegate is
scaled-out, only one of the instances of the delegate will get this call.

Since DUCC is using UIMA-AS to scale out the Job processes, it has no way
to deliver a CPC to all instances.

The applications we have been running on DUCC have used the Work Item CAS
as a signal to CAS consumers to do CPC level processing. That is discussed
in the first reference above, in the paragraph "Flushing Cached Data".

Eddie



On Wed, Mar 26, 2014 at 9:48 AM, reshu.agarwal <re...@orkash.com>wrote:

> On 03/26/2014 06:43 PM, Eddie Epstein wrote:
>
>> Are you using standard UIMA interface code to Solr? If so, which Cas
>> Consumer?
>>
>> Taking at quick look at the source code for SolrCASConsumer, the batch and
>> collection process complete methods appear to do nothing.
>>
>> Thanks,
>> Eddie
>>
>>
>> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <re...@orkash.com>
>> wrote:
>>
>>  On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>>
>>>  Hence we can not attempt batch processing in cas consumer and it
>>>> increases our process timing. Is there any other option for that or is
>>>> it a
>>>> bug in DUCC?
>>>>
>>>>  Please reply on this problem as if I am sending document in solr one by
>>> one by cas consumer without using batch process and committing solr. It
>>> is
>>> not optimum way to use this. Why ducc is not calling collection Process
>>> Complete method of Cas Consumer? And If I want to do that then What is
>>> the
>>> way to do this?
>>>
>>> I am not able to find any thing about this in DUCC book.
>>>
>>> Thanks in Advanced.
>>>
>>> --
>>> Thanks,
>>> Reshu Agarwal
>>>
>>>
>>>  Hi Eddie,
>
> I am not using standard UIMA interface code to Solr. I create my own Cas
> Consumer. I will take a look on that too. But the problem is not for
> particularly to use solr, I can use any source to store my output. I want
> to do batch processing and want to use collectionProcessComplete. Why DUCC
> is not calling it? I check it with UIMA AS also and my cas consumer is
> working fine with it and also performing batch processing.
>
> --
> Thanks,
> Reshu Agarwal
>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
On 03/26/2014 06:43 PM, Eddie Epstein wrote:
> Are you using standard UIMA interface code to Solr? If so, which Cas
> Consumer?
>
> Taking at quick look at the source code for SolrCASConsumer, the batch and
> collection process complete methods appear to do nothing.
>
> Thanks,
> Eddie
>
>
> On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <re...@orkash.com>wrote:
>
>> On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>>
>>> Hence we can not attempt batch processing in cas consumer and it
>>> increases our process timing. Is there any other option for that or is it a
>>> bug in DUCC?
>>>
>> Please reply on this problem as if I am sending document in solr one by
>> one by cas consumer without using batch process and committing solr. It is
>> not optimum way to use this. Why ducc is not calling collection Process
>> Complete method of Cas Consumer? And If I want to do that then What is the
>> way to do this?
>>
>> I am not able to find any thing about this in DUCC book.
>>
>> Thanks in Advanced.
>>
>> --
>> Thanks,
>> Reshu Agarwal
>>
>>
Hi Eddie,

I am not using the standard UIMA interface code to Solr; I created my own Cas
Consumer. I will take a look at that too. But the problem is not specific to
Solr; I could use any destination to store my output. I want to do batch
processing and to use collectionProcessComplete. Why is DUCC not calling it?
I checked with UIMA-AS as well, and my Cas Consumer works fine with it,
including batch processing.

-- 
Thanks,
Reshu Agarwal


Re: Ducc Problems

Posted by Eddie Epstein <ea...@gmail.com>.
Are you using standard UIMA interface code to Solr? If so, which Cas
Consumer?

Taking a quick look at the source code for SolrCASConsumer, the batch and
collection process complete methods appear to do nothing.

Thanks,
Eddie


On Wed, Mar 26, 2014 at 6:08 AM, reshu.agarwal <re...@orkash.com>wrote:

> On 03/21/2014 11:42 AM, reshu.agarwal wrote:
>
>> Hence we can not attempt batch processing in cas consumer and it
>> increases our process timing. Is there any other option for that or is it a
>> bug in DUCC?
>>
> Please reply on this problem as if I am sending document in solr one by
> one by cas consumer without using batch process and committing solr. It is
> not optimum way to use this. Why ducc is not calling collection Process
> Complete method of Cas Consumer? And If I want to do that then What is the
> way to do this?
>
> I am not able to find any thing about this in DUCC book.
>
> Thanks in Advanced.
>
> --
> Thanks,
> Reshu Agarwal
>
>

Re: Ducc Problems

Posted by "reshu.agarwal" <re...@orkash.com>.
On 03/21/2014 11:42 AM, reshu.agarwal wrote:
> Hence we can not attempt batch processing in cas consumer and it 
> increases our process timing. Is there any other option for that or is 
> it a bug in DUCC? 
Please reply on this problem: right now the Cas Consumer sends documents to
Solr one by one, without batch processing, committing Solr each time, which is
not an optimal way to use it. Why is DUCC not calling the
collectionProcessComplete method of the Cas Consumer? And if I want it to,
what is the way to do this?

I am not able to find anything about this in the DUCC book.

Thanks in advance.

-- 
Thanks,
Reshu Agarwal