Posted to user@uima.apache.org by Mario Gazzo <ma...@gmail.com> on 2015/04/24 11:37:35 UTC

Error handling in flow control

I am trying to get error handling to work with a custom flow controller. I need to send status information back to a service after the flow completes, either with or without errors, but I can only do this once for any workflow item, because reporting changes the state of the job (reporting again would just produce error replies and wasteful requests). The problem is that I need to do several retries before finally failing and reporting the status to the service. First I tried to let the CPE do the retries for me by setting the max error count, but then a new flow object is created on every attempt and I lose track of the number of retries made before it. This means that I don’t know when to report the status to the service, because that should only happen after the final retry.

I then tried to let the flow instance manage the retries by moving back to the previous step again, but then I get the error “org.apache.uima.cas.CASRuntimeException: Data for Sofa feature setLocalSofaData() has already been set”, because the document text has already been set in this particular test case. I also tried to reset the CAS completely before retrying the pipeline from scratch, and this of course throws “CASAdminException: Can't flush CAS, flushing is disabled.” It would be less wasteful if only the failed step were retried instead of the whole pipeline, but that requires clean-up, which in some cases might be impossible. It appears that managing errors can be rather complex, because the CAS can be in an unknown state and an analysis engine operation is not idempotent. I probably need to run the whole pipeline from the start if I want more than a single attempt, which gets me back to the problem of tracking the number of attempts before reporting back to the service.
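
For concreteness, the kind of flow I have been experimenting with looks roughly like this (just a sketch: the AE keys and the retry limit are placeholders, not our real configuration):

import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.CAS;
import org.apache.uima.flow.CasFlowController_ImplBase;
import org.apache.uima.flow.CasFlow_ImplBase;
import org.apache.uima.flow.FinalStep;
import org.apache.uima.flow.Flow;
import org.apache.uima.flow.SimpleStep;
import org.apache.uima.flow.Step;

public class RetryingFlowController extends CasFlowController_ImplBase {

  // Placeholder AE keys and retry limit, not our real setup.
  private static final String[] SEQUENCE = { "segmenter", "tagger", "writer" };
  private static final int MAX_RETRIES = 3;

  @Override
  public Flow computeFlow(CAS cas) throws AnalysisEngineProcessException {
    return new RetryingFlow();
  }

  class RetryingFlow extends CasFlow_ImplBase {
    private int pos = 0;      // index of the next AE to run
    private int retries = 0;  // attempts used so far for this CAS

    @Override
    public Step next() {
      if (pos >= SEQUENCE.length) {
        // All steps succeeded: the one safe place to report success.
        return new FinalStep();
      }
      return new SimpleStep(SEQUENCE[pos++]);
    }

    @Override
    public boolean continueOnFailure(String failedAeKey, Exception failure) {
      if (retries++ < MAX_RETRIES) {
        pos--; // step back so next() re-emits the failed AE; this re-run is
               // what triggers "Data for Sofa feature setLocalSofaData() has
               // already been set" when the step sets the document text again
        return true;
      }
      return false; // retries exhausted: report failure exactly once
    }
  }
}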

Does anyone have a good suggestion on how to do this in UIMA, e.g. passing state information from a failed flow to the next flow attempt?


Re: Error handling in flow control

Posted by Mario Gazzo <ma...@gmail.com>.
Thanks Eddie,

I think I need to look deeper into CasMultipliers and UIMA-AS, but it sounds more complicated than I had hoped. I have something working without CAS multipliers now, and it can get me all the way there if I initially just combine it with snapshots stored in the new compressed binary CAS format. For now I will work on other, more crucial parts to get everything wired up before digging deeper into this, but I might get back to you once I have investigated it further. Your input has been valuable and gives me something to work with.

Much appreciated,
Mario


Re: Error handling in flow control

Posted by Eddie Epstein <ea...@gmail.com>.
Very clear, thanks. A CasMultiplier has the ability to deserialize a CAS
from file and emit it as a child CAS. A parent CAS could carry a
FeatureStructure identifying it as one to be rerun from some specific state
(a CAS file); the CM would trigger on that FS and produce the child CAS to
be reprocessed, the flow controller would be configured to return the child
from the aggregate, and the client would then use the child and ignore the
parent.
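
Roughly sketched below (the type and feature names are made up, and the
CM's descriptor would need outputsNewCASes set to true):

import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.uima.analysis_component.CasMultiplier_ImplBase;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.cas.AbstractCas;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.FSIterator;
import org.apache.uima.cas.FeatureStructure;
import org.apache.uima.cas.Type;
import org.apache.uima.cas.impl.XmiCasDeserializer;

public class SnapshotCasMultiplier extends CasMultiplier_ImplBase {

  private String pendingSnapshot; // set when the parent CAS asks for a rerun

  @Override
  public void process(CAS parent) throws AnalysisEngineProcessException {
    pendingSnapshot = null;
    // Hypothetical trigger type carrying the path of the saved CAS state.
    Type rerunType = parent.getTypeSystem().getType("com.example.RerunRequest");
    if (rerunType == null) {
      return;
    }
    FSIterator<FeatureStructure> it =
        parent.getIndexRepository().getAllIndexedFS(rerunType);
    if (it.hasNext()) {
      FeatureStructure fs = it.next();
      pendingSnapshot =
          fs.getStringValue(rerunType.getFeatureByBaseName("snapshotPath"));
    }
  }

  @Override
  public boolean hasNext() {
    return pendingSnapshot != null;
  }

  @Override
  public AbstractCas next() throws AnalysisEngineProcessException {
    CAS child = getEmptyCAS();
    try (InputStream in = new FileInputStream(pendingSnapshot)) {
      XmiCasDeserializer.deserialize(in, child); // restore the saved state
    } catch (Exception e) {
      child.release();
      throw new AnalysisEngineProcessException(e);
    }
    pendingSnapshot = null;
    return child;
  }
}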

An ideal threading solution would be to use UIMA-AS. Unfortunately a
UIMA-AS service currently requires an AMQ broker for service input and
output. It is possible to embed both broker and service in process; it is
just a complication, and it adds serialization overhead.
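
The embedding itself is not much code; something like this sketch (the
connector URL is arbitrary, and the UIMA-AS service deployment is elided):

import org.apache.activemq.broker.BrokerService;

public class EmbeddedBroker {
  public static void main(String[] args) throws Exception {
    BrokerService broker = new BrokerService();
    broker.setPersistent(false); // in-memory only, no data directory
    broker.setUseJmx(false);
    broker.addConnector("tcp://localhost:61616"); // arbitrary port
    broker.start();
    // ... deploy the UIMA-AS service with its brokerURL pointing at the
    // connector above; stop the broker when the service shuts down ...
    broker.stop();
  }
}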

Another thing to consider is the relatively new binary compressed CAS
form 6, which can save considerable space over zip-compressed XmiCas.
Form 6 has the same ability as XmiCas to be deserialized into a CAS with a
different but compatible type system.
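
The calls look roughly like this (a sketch using the
org.apache.uima.cas.impl.Serialization class; when the type systems differ,
the deserialization variant that takes the source type system is needed
instead):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.Serialization;

public class Form6Snapshot {

  static byte[] save(CAS cas) throws Exception {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    // the variant that takes a target type system selects form 6
    Serialization.serializeWithCompression(cas, out, cas.getTypeSystem());
    return out.toByteArray();
  }

  static void restore(CAS cas, byte[] bytes) throws Exception {
    cas.reset();
    // detects the binary format, including form 6, from the header
    Serialization.deserializeCAS(cas, new ByteArrayInputStream(bytes));
  }
}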

Hope this helps,
Eddie


Re: Error handling in flow control

Posted by Mario Gazzo <ma...@gmail.com>.
My apologies for not being very clear.

I managed to get the basic flow control to work after modifying some AEs to check for a previously installed sofa before adding another.
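
The change was essentially this (simplified):

import org.apache.uima.cas.CAS;

final class SofaGuard {
  // Only install the document text if no sofa data is present yet, so a
  // retried step does not hit "Sofa data has already been set".
  static void setDocumentTextOnce(CAS cas, String text) {
    if (cas.getDocumentText() == null) {
      cas.setDocumentText(text);
    }
  }
}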

The services I mentioned are not UIMA related. We are migrating existing text analysis components to UIMA, and these need to integrate with a larger existing setup that relies on various AWS services such as S3, DynamoDB, Simple Workflow and EMR. We don’t have plans as such to use UIMA-AS or Vinci; instead we already use AWS Simple Workflow (SWF) to orchestrate all our workers. This means that we just want to run multiple UIMA pipelines inside some of these workers using a multithreaded CPE. I am now trying to implement this integration by consuming activity tasks from SWF through a collection reader and then having a flow controller manage the logic and respond back when the AAE pipeline has completed or failed (see the sketch below). This is where I had problems when experimenting with failure handling.
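
The reader side of the integration looks roughly like this (heavily
simplified sketch; the domain and task list names are placeholders, and
configuration, error handling and progress reporting are omitted):

import org.apache.uima.cas.CAS;
import org.apache.uima.collection.CollectionReader_ImplBase;
import org.apache.uima.util.Progress;
import com.amazonaws.services.simpleworkflow.AmazonSimpleWorkflowClient;
import com.amazonaws.services.simpleworkflow.model.ActivityTask;
import com.amazonaws.services.simpleworkflow.model.PollForActivityTaskRequest;
import com.amazonaws.services.simpleworkflow.model.TaskList;

public class SwfActivityReader extends CollectionReader_ImplBase {

  private final AmazonSimpleWorkflowClient swf = new AmazonSimpleWorkflowClient();
  private ActivityTask task;

  @Override
  public boolean hasNext() {
    task = swf.pollForActivityTask(new PollForActivityTaskRequest()
        .withDomain("text-analysis")                             // placeholder
        .withTaskList(new TaskList().withName("uima-workers"))); // placeholder
    // the long poll returns an empty task token when it times out
    return task != null && task.getTaskToken() != null
        && !task.getTaskToken().isEmpty();
  }

  @Override
  public void getNext(CAS cas) {
    // assume the activity input carries (a pointer to) the document; the
    // task token would also be stored in the CAS so the flow controller
    // can report completion back to SWF exactly once
    cas.setDocumentText(task.getInput());
  }

  @Override
  public Progress[] getProgress() {
    return new Progress[0];
  }

  @Override
  public void close() {
  }
}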

We store output from these workers on S3 and in DynamoDB tables for use further downstream in our workflow and online applications. We also store intermediate results (snapshots) on S3 so that we can at any point go back to a previous step and resume, retry or redo processing; this also allows us to inspect data for debugging and analysis purposes. I thought I might be able to do something similar within the CPE using the CAS, but it isn't that simple: e.g. running the same AE twice against the same CAS would result in its annotations occurring twice unless we carefully design around this. I can still serialize snapshot CASes to XMI on S3, but I can’t just load them again to restore a previous state within the same CPE flow. Instead I would have to fail and initiate a retry through SWF, which would cause the previous state to be loaded from S3 into a new CAS via the next worker that receives the retry activity task through its collection reader. However, storing many snapshot CAS outputs will, even compressed, take a lot more space than the format we use in our production setup now, so I am considering alternative approaches, but so far they all appear much more complex and brittle.
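
Outside the CPE the snapshot round trip itself is simple, something like
the sketch below; the reset() call is exactly what the CPE forbids ("Can't
flush CAS, flushing is disabled"):

import java.io.InputStream;
import java.io.OutputStream;
import org.apache.uima.cas.CAS;
import org.apache.uima.cas.impl.XmiCasDeserializer;
import org.apache.uima.cas.impl.XmiCasSerializer;

public class XmiSnapshot {

  static void save(CAS cas, OutputStream out) throws Exception {
    XmiCasSerializer.serialize(cas, out);
  }

  static void restore(CAS cas, InputStream in) throws Exception {
    cas.reset(); // start empty, otherwise annotations accumulate on re-runs
    XmiCasDeserializer.deserialize(in, cas);
  }
}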

Indeed, CAS multipliers would be useful for us, but the limitations of the CPE and the general difficulties I have experienced so far have made me consider implementing a custom multithreaded collection processor, although I wanted to avoid this.

Hope this clarifies what I am trying to do. Cheers :)


Re: Error handling in flow control

Posted by Eddie Epstein <ea...@gmail.com>.
Can you give more details on the overall pipeline deployment? The initial
description mentions a CPE and it mentions services. The CPE was created
before flow controllers or CasMultipliers existed and has no support for
them. Are the services Vinci services for the CPE, UIMA-AS services, or
something else?
