Posted to user@uima.apache.org by Egbert van der Wal <ew...@pointpro.nl> on 2016/08/25 14:18:00 UTC

Debugging a NullPointerException in UIMA AS / processing timeouts

Hi,

I'm having a problem using UIMA-AS. I have a pipeline set up that
processes HTML documents in about 10 ms each. The total timeout value was
initially 20 seconds, but at some point I increased it to 120 seconds to
avoid this problem, which seemed to help.

However, sometimes the 2-minute timeout is still hit and a warning is shown.
When this occurs, it is usually accompanied by
NullPointerExceptions in combination with Xerces, somewhere in the
internals of UIMA. See the attached log-file excerpt for the errors I'm
seeing. The first 5 lines are the 'normal' output, which was repeated
for several thousand lines during the preceding successful operation of the
pipeline.

The SOFA that is being sent out when this particular exception occurs is a
quite small HTML document, just a couple of kilobytes, and the failure is not
actually reproducible with the same document; if I run the program again
it will eventually fail, but at some other point.

How can I go about solving this issue? Since the part of my own code in
the stack trace is limited to the point where 'sendCAS' is called, I
can't really think of any additional debugging I can do.

Any suggestions are highly appreciated!

Thanks,

Egbert van der Wal

Re: Debugging a NullPointerException in UIMA AS / processing timeouts

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
Egbert, UIMA-AS only supports timeouts on remote delegates. You could deploy the
annotator in question as a remote delegate
and define a process timeout on it. You would then get a timeout on that
component, and the log would show it.
UIMA-AS does not support timeouts on co-located (in-process) parts of the
pipeline.
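
For a co-located component, one possible workaround is to put a watchdog around the slow call in your own code. The following is a minimal sketch using only the JDK; it is not a UIMA-AS feature, and `TimedCall`, `runWithTimeout` and the 200 ms budget are hypothetical names and values chosen for illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimedCall {
    // Runs the task on a worker thread and throws TimeoutException if it
    // does not finish within timeoutMs milliseconds.
    public static <T> T runWithTimeout(Callable<T> task, long timeoutMs)
            throws TimeoutException, ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // request interruption of the worker thread
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        String fast = runWithTimeout(() -> "fast", 200);
        System.out.println(fast);
        try {
            // Stand-in for a component that takes far too long.
            runWithTimeout(() -> { Thread.sleep(5_000); return "slow"; }, 200);
        } catch (TimeoutException e) {
            System.out.println("timed out");
        }
    }
}
```

Cancellation here relies on thread interruption, so a task that never checks its interrupt status will keep running in the background; the caller merely learns that the deadline was missed.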

To eliminate UIMA-AS as a cause of the NPE, can you define a CasConsumer at
the end of your pipeline and write
out a CAS to disk, and in the reply just send a filename to that serialized
CAS?


From what I understand so far, your service can serialize a CAS into a reply,
but the UIMA-AS client fails to
deserialize it. I'm not sure whether this is caused by a race condition where one
thread begins to deserialize a reply
while another is handling a timeout that resets the CAS.

-jerry





On Mon, Sep 26, 2016 at 6:01 AM, Egbert van der Wal <ew...@pointpro.nl>
wrote:

> Hi again,
>
> After some days of testing and debugging I've been able to narrow down the
> problem somewhat.
>
> It turned out that, after all, it was reproducible by feeding a certain
> document. After disabling the enabled annotators one by one, I
> was able to localize the problem to a specific annotator, which, when
> disabled, 'fixed' the issue. I still don't know exactly what is causing it
> within this annotator, but now I've got a lead so I should be able to fix
> it.
>
> However, the FINEST log level doesn't really give any useful information.
> It shows the current waiting queue of CAS IDs that are in the pipeline,
> but it still doesn't give me a useful error message. It's still an
> exception somewhere in the internals of UIMA, either a
> 'NullPointerException' on
>
> org.apache.uima.cas.impl.CASImpl.ll_getFSForRef(CASImpl.java:3478)
>
> or an 'IndexOutOfBoundsException' on:
>
> org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.getIndexRepo(XmiCasDeserializer.java:628)
>
>
> It seems both are related to a faulty annotator choking on a too-large
> payload and not returning results in time.
>
> However, in this situation I would expect an exception saying that the annotator
> did not respond in time, rather than these messages, so that I would know
> where to look right away. Have I misconfigured some debugging / pipeline
> settings or is this something for the UIMA wishlist?
>
> Thanks again!
>
> Regards,
>
> Egbert

Re: Debugging a NullPointerException in UIMA AS / processing timeouts

Posted by Egbert van der Wal <ew...@pointpro.nl>.
Hi again,

After some days of testing and debugging I've been able to narrow down 
the problem somewhat.

It turned out that, after all, it was reproducible by feeding a certain
document. After disabling the enabled annotators one by one,
I was able to localize the problem to a specific annotator, which, when
disabled, 'fixed' the issue. I still don't know exactly what is causing
it within this annotator, but now I've got a lead so I should be able to
fix it.

However, the FINEST log level doesn't really give any useful information.
It shows the current waiting queue of CAS IDs that are in the pipeline,
but it still doesn't give me a useful error message. It's still an
exception somewhere in the internals of UIMA, either a
'NullPointerException' on

org.apache.uima.cas.impl.CASImpl.ll_getFSForRef(CASImpl.java:3478)

or an 'IndexOutOfBoundsException' on:

org.apache.uima.cas.impl.XmiCasDeserializer$XmiCasDeserializerHandler.getIndexRepo(XmiCasDeserializer.java:628)


It seems both are related to a faulty annotator choking on a too-large
payload and not returning results in time.

However, in this situation I would expect an exception saying that the annotator
did not respond in time, rather than these messages, so that I would
know where to look right away. Have I misconfigured some debugging /
pipeline settings or is this something for the UIMA wishlist?

Thanks again!

Regards,

Egbert





On 29-08-16 15:28, Jaroslaw Cwiklik wrote:
> Egbert, thanks. I forgot to ask, what version of UIMA-AS are you using?
> Also, are you using sendCAS() or sendAndReceive() API?
>
> Have a great vacation!
>
> -jerry

Re: Debugging a NullPointerException in UIMA AS / processing timeouts

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
Egbert, thanks. I forgot to ask, what version of UIMA-AS are you using?
Also, are you using sendCAS() or sendAndReceive() API?

Have a great vacation!

-jerry

On Sun, Aug 28, 2016 at 9:39 AM, Egbert van der Wal <ew...@pointpro.nl>
wrote:

> Hi Jerry,
>
> Thanks for the suggestion. I have the feeling that it's a race condition,
> too, but since I'm doing no multi-threading myself, basically all the
> threading and synchronization should be UIMA-internal.
>
> Anyway, I'll have to postpone researching the issue because I'm going on
> vacation. When I get back I'll try to get more information with an increased
> log level, and get back to you.
>
> Thanks again!
>
> Regards,
>
> Egbert

Re: Debugging a NullPointerException in UIMA AS / processing timeouts

Posted by Egbert van der Wal <ew...@pointpro.nl>.
Hi Jerry,

Thanks for the suggestion. I have the feeling that it's a race 
condition, too, but since I'm doing no multi-threading myself, basically 
all the threading and synchronization should be UIMA-internal.

Anyway, I'll have to postpone researching the issue because I'm going on
vacation. When I get back I'll try to get more information with an
increased log level, and get back to you.

Thanks again!

Regards,

Egbert


On 25-8-2016 at 17:17, Jaroslaw Cwiklik wrote:
> Hi, I have a feeling that there might be a race condition here. In the
> client, the timer pops and at the same time a reply is received.
> The timeout logic is resetting the CAS while it's being deserialized, which
> may lead to an NPE. Not 100% certain but this might be the problem.
>
> Any chance you can increase the UIMA log level to FINEST on the client side? It
> would log important information like the internal CAS ID on each reply,
> which can be used to correlate events in the log.
>
> -jerry

Re: Debugging a NullPointerException in UIMA AS / processing timeouts

Posted by Jaroslaw Cwiklik <ui...@gmail.com>.
Hi, I have a feeling that there might be a race condition here. In the
client, the timer pops and at the same time a reply is received.
The timeout logic is resetting the CAS while it's being deserialized, which
may lead to an NPE. Not 100% certain but this might be the problem.

Any chance you can increase the UIMA log level to FINEST on the client side? It
would log important information like the internal CAS ID on each reply,
which can be used to correlate events in the log.
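
The suspected race can be illustrated with a small stand-alone sketch. It is not UIMA-AS code and makes no claim about how the client is actually implemented; `ReplyTimeoutGuard` and its methods are hypothetical names. The idea is that the reply handler and the timeout handler both try to claim the same request atomically, so only one of them is ever allowed to touch the shared CAS.

```java
import java.util.concurrent.atomic.AtomicReference;

public class ReplyTimeoutGuard {
    enum State { PENDING, REPLY_CLAIMED, TIMED_OUT }

    private final AtomicReference<State> state =
            new AtomicReference<>(State.PENDING);

    // Called by the thread that received the reply.
    // Returns true only if it is safe to start deserializing.
    public boolean claimForReply() {
        return state.compareAndSet(State.PENDING, State.REPLY_CLAIMED);
    }

    // Called by the timer thread.
    // Returns true only if it is safe to reset the shared object.
    public boolean claimForTimeout() {
        return state.compareAndSet(State.PENDING, State.TIMED_OUT);
    }

    public static void main(String[] args) throws InterruptedException {
        ReplyTimeoutGuard guard = new ReplyTimeoutGuard();
        boolean[] won = new boolean[2];
        Thread reply = new Thread(() -> won[0] = guard.claimForReply());
        Thread timer = new Thread(() -> won[1] = guard.claimForTimeout());
        reply.start();
        timer.start();
        reply.join();
        timer.join();
        // Exactly one handler wins; which one depends on thread scheduling.
        System.out.println("reply=" + won[0] + " timeout=" + won[1]);
    }
}
```

Without such a guard, both handlers can proceed at once, which matches the symptom of a CAS being reset while it is still being deserialized.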
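
A minimal sketch of raising the log level programmatically, assuming the client uses UIMA's default java.util.logging backend and that the relevant loggers live under the `org.apache.uima` name; `FinestLogging` and `enableFinest` are hypothetical names. If a different logging backend is configured, this will have no effect.

```java
import java.util.logging.ConsoleHandler;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.Logger;

public class FinestLogging {
    // Strong reference, so the configured logger is not garbage collected.
    private static final Logger UIMA_LOGGER = Logger.getLogger("org.apache.uima");

    public static Logger enableFinest() {
        UIMA_LOGGER.setLevel(Level.FINEST);
        // Handlers filter as well: the default ConsoleHandler only passes
        // INFO and above, so attach one that lets FINEST through.
        Handler handler = new ConsoleHandler();
        handler.setLevel(Level.FINEST);
        UIMA_LOGGER.addHandler(handler);
        // Avoid printing INFO and above twice via the root logger's handler.
        UIMA_LOGGER.setUseParentHandlers(false);
        return UIMA_LOGGER;
    }

    public static void main(String[] args) {
        Logger logger = enableFinest();
        logger.finest("FINEST messages are now visible");
    }
}
```

The same effect can be had without code changes by pointing the JVM at a logging properties file with `-Djava.util.logging.config.file=...` and setting the levels there.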

-jerry

On Thu, Aug 25, 2016 at 10:18 AM, Egbert van der Wal <ew...@pointpro.nl>
wrote:

> Hi,
>
> I'm having a problem using UIMA-AS. I have a pipeline set up that
> processes HTML documents in about 10 ms each. The total timeout value was
> initially 20 seconds, but at some point I increased it to 120 seconds to avoid
> this problem, which seemed to help.
>
> However, sometimes the 2-minute timeout is still hit and a warning is shown. When
> this occurs, it is usually accompanied by NullPointerExceptions in
> combination with Xerces, somewhere in the internals of UIMA. See the
> attached log-file excerpt for the errors I'm seeing. The first 5 lines are
> the 'normal' output, which was repeated for several thousand lines
> during the preceding successful operation of the pipeline.
>
> The SOFA that is being sent out during this particular exception is a
> quite small HTML-document, just a couple of kilobytes, and it's not
> actually reproducible with the same document; if I run the program again it
> will eventually fail, but at some other point.
>
> How can I go about solving this issue? Since the part of my own code in
> the stacktrace is limited to the point where 'sendCAS' is called, I can't
> really think of any additional debugging I can do.
>
> Any suggestions are highly appreciated!
>
> Thanks,
>
> Egbert van der Wal
>