
Posted to dev@manifoldcf.apache.org by Julien Massiera <ju...@francelabs.com> on 2022/02/24 13:47:39 UTC

WorkerThread runtime exceptions

Hi,

 

I ran into a situation where the MCF agent was still up but stopped doing
anything after a runtime exception.

My use case was the following:
I updated the libs used by a repository connector but forgot one.
During doc processing, a runtime exception < java.lang.NoSuchMethodError >
was thrown because the sub-dependency lib was not up to date, so the method
being called was missing. The exception was caught by the WorkerThread,
which logged < Error tossed: .. >, but then nothing more happened: the job
stayed in the running status and I could not abort it until I killed and
restarted the agent.

The catch clause is located in the WorkerThread class at lines 853-857. I
know this is a particular case, but I am not sure that the agent hanging
after this exception is normal behavior, and I can imagine the same thing
happening with other unknown runtime exceptions. Is there something we can
do to keep the agent from hanging in those cases?

 

Regards,

Julien
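
[Editorial sketch] A note on terminology that matters for the catch clause
discussed above: java.lang.NoSuchMethodError is not a RuntimeException. It
extends IncompatibleClassChangeError, which is a LinkageError and therefore a
java.lang.Error, so only a catch clause for Error or Throwable will see it.
The class below is a minimal, self-contained illustration; its names are
hypothetical and are not taken from the ManifoldCF source.

```java
// Hypothetical illustration; none of these names come from ManifoldCF.
class ThrowableKinds {

    /** Reports which catch clause ends up handling the given throwable. */
    static String classify(Throwable t) {
        try {
            throw t;
        } catch (RuntimeException e) {
            return "RuntimeException";
        } catch (Exception e) {
            return "checked Exception";
        } catch (Error e) {
            // NoSuchMethodError, OutOfMemoryError, etc. land here.
            return "Error";
        } catch (Throwable other) {
            return "other Throwable";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify(new NoSuchMethodError("missing method")));
        System.out.println(classify(new IllegalStateException("bad state")));
    }
}
```

A handler written as catch (Exception e) would let the NoSuchMethodError
propagate, which is why a worker loop that wants to survive it has to catch
Throwable (or Error) explicitly.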

Re: WorkerThread runtime exceptions

Posted by Karl Wright <da...@gmail.com>.

So, when something goes wrong during document processing, the usual outcome
is that the document is pushed back onto the document queue with a
processing time some distance in the future.  How far in the future depends
on the details of exception handling within the connector. But a FATAL
Error / RuntimeException, in this case due to a linkage error (which is
what you have here: you are missing a jar that you need), will not always
recover properly. I will need to look at the code to see the details of
what happens with non-exception Throwables in the worker thread. The
problem with runtime errors like that is that they form a broad class of
problems; one of them may mean that the database is not even working
properly, or that you cannot reach it, etc.  We probably are not going to
be able to make ManifoldCF robust against such errors, out-of-memory
conditions, etc., and it would probably not be a realistic goal either.

Karl
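
[Editorial sketch] The "push the document back with a processing time in the
future" behavior described above can be pictured with a small in-memory
model. This is only an illustration under stated assumptions: the class,
field, and method names are hypothetical, the delay is simply supplied by the
caller (standing in for whatever the connector's exception handling decides),
and ManifoldCF's real document queue is not implemented this way.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; not the ManifoldCF implementation.
class RetryQueueSketch {

    /** A queued document together with the earliest time it may be processed. */
    static final class Entry implements Comparable<Entry> {
        final String documentId;
        final long notBeforeMillis;

        Entry(String documentId, long notBeforeMillis) {
            this.documentId = documentId;
            this.notBeforeMillis = notBeforeMillis;
        }

        @Override
        public int compareTo(Entry other) {
            return Long.compare(notBeforeMillis, other.notBeforeMillis);
        }
    }

    private final PriorityQueue<Entry> queue = new PriorityQueue<>();

    /** Queues a document that may be processed immediately. */
    void add(String documentId, long nowMillis) {
        queue.add(new Entry(documentId, nowMillis));
    }

    /** Pushes a failed document back, eligible again only after the delay. */
    void requeue(String documentId, long nowMillis, long delayMillis) {
        queue.add(new Entry(documentId, nowMillis + delayMillis));
    }

    /** Removes and returns every document whose processing time has arrived. */
    List<String> takeDue(long nowMillis) {
        List<String> due = new ArrayList<>();
        while (!queue.isEmpty() && queue.peek().notBeforeMillis <= nowMillis) {
            due.add(queue.poll().documentId);
        }
        return due;
    }
}
```

In this model a document that failed at time 0 and was requeued with a
5000 ms delay is invisible to takeDue until the clock reaches 5000. A
document that is never requeued at all, for example because an Error
bypassed the normal exception handling, simply never becomes due again.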


On Wed, Mar 23, 2022 at 12:10 PM Julien Massiera <
julien.massiera@francelabs.com> wrote:

> So after that FATAL exception, we end up in the 'wait' state of the
> documentQueue
>
> Julien
>

RE: WorkerThread runtime exceptions

Posted by Julien Massiera <ju...@francelabs.com>.

Yes, sorry, I described step 1/ incorrectly: the runtime exception happens in the connector's processDocuments method, on the first and only document found in the seeding phase.

Here is the stack trace: 

WARN 2022-03-22T13:07:03,761 (Worker thread '13') - MCF|MCF-agent|apache.manifoldcf.connectors|JCIFS: Possibly transient exception detected on attempt 1 while checking if file smb://localhost/test/ exists: null
jcifs.smb.SmbException: null
        at jcifs.smb.SmbTreeImpl.waitForState(SmbTreeImpl.java:774) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeImpl.treeConnect(SmbTreeImpl.java:540) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeConnection.connectTree(SmbTreeConnection.java:614) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:568) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:489) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeConnection.connect(SmbTreeConnection.java:465) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbTreeConnection.connectWrapException(SmbTreeConnection.java:426) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbFile.ensureTreeConnected(SmbFile.java:559) ~[jcifs-ng-2.1.7.jar:?]
        at jcifs.smb.SmbFile.exists(SmbFile.java:859) ~[jcifs-ng-2.1.7.jar:?]
        at com.francelabs.datafari.connectors.share.SharedDriveConnector.fileExists(SharedDriveConnector.java:2129) [datafari-share-connector-6.0-dev-Community.jar:?]
        at com.francelabs.datafari.connectors.share.SharedDriveConnector.processDocuments(SharedDriveConnector.java:677) [datafari-share-connector-6.0-dev-Community.jar:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?]
Caused by: java.lang.InterruptedException
        at java.lang.Object.wait(Native Method) ~[?:?]
        at java.lang.Object.wait(Object.java:328) ~[?:?]
        at jcifs.smb.SmbTreeImpl.waitForState(SmbTreeImpl.java:771) ~[jcifs-ng-2.1.7.jar:?]
        ... 11 more
FATAL 2022-03-22T13:07:03,777 (Worker thread '13') - MCF|MCF-agent|apache.manifoldcf.crawlerthreads|Error tossed: 'boolean org.bouncycastle.asn1.ASN1ObjectIdentifier.equals(org.bouncycastle.asn1.ASN1Primitive)'
java.lang.NoSuchMethodError: 'boolean org.bouncycastle.asn1.ASN1ObjectIdentifier.equals(org.bouncycastle.asn1.ASN1Primitive)'
        at jcifs.spnego.NegTokenInit.parse(NegTokenInit.java:167) ~[?:?]
        at jcifs.spnego.NegTokenInit.<init>(NegTokenInit.java:66) ~[?:?]
        at jcifs.smb.NtlmPasswordAuthenticator.createContext(NtlmPasswordAuthenticator.java:243) ~[?:?]
        at jcifs.smb.SmbSessionImpl.createContext(SmbSessionImpl.java:706) ~[?:?]
        at jcifs.smb.SmbSessionImpl.sessionSetupSMB2(SmbSessionImpl.java:544) ~[?:?]
        at jcifs.smb.SmbSessionImpl.sessionSetup(SmbSessionImpl.java:491) ~[?:?]
        at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:369) ~[?:?]
        at jcifs.smb.SmbSessionImpl.send(SmbSessionImpl.java:347) ~[?:?]
        at jcifs.smb.SmbTreeImpl.treeConnect(SmbTreeImpl.java:611) ~[?:?]
        at jcifs.smb.SmbTreeConnection.connectTree(SmbTreeConnection.java:614) ~[?:?]
        at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:568) ~[?:?]
        at jcifs.smb.SmbTreeConnection.connectHost(SmbTreeConnection.java:489) ~[?:?]
        at jcifs.smb.SmbTreeConnection.connect(SmbTreeConnection.java:465) ~[?:?]
        at jcifs.smb.SmbTreeConnection.connectWrapException(SmbTreeConnection.java:426) ~[?:?]
        at jcifs.smb.SmbFile.ensureTreeConnected(SmbFile.java:559) ~[?:?]
        at jcifs.smb.SmbFile.exists(SmbFile.java:859) ~[?:?]
        at com.francelabs.datafari.connectors.share.SharedDriveConnector.fileExists(SharedDriveConnector.java:2129) ~[?:?]
        at com.francelabs.datafari.connectors.share.SharedDriveConnector.processDocuments(SharedDriveConnector.java:677) ~[?:?]
        at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399) [mcf-pull-agent.jar:?] 


So after that FATAL exception, we end up in the 'wait' state of the documentQueue 

Julien
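
[Editorial sketch] The FATAL entry above is a linkage problem: the method
jcifs-ng expects on a BouncyCastle class is not present in the jar that was
actually loaded. One way to surface this kind of mismatch before any document
is processed is a reflective check at startup. The helper below is a
hypothetical diagnostic, not part of ManifoldCF or jcifs-ng, and it is only an
approximation: reflection matches on name and parameter types, whereas the
JVM's link-time check also takes the return type into account.

```java
// Hypothetical diagnostic helper; not part of ManifoldCF or jcifs-ng.
class LinkageCheck {

    /**
     * Returns true when the named class can be loaded and exposes a public
     * method with the given name and parameter types.
     */
    static boolean hasPublicMethod(String className, String methodName, Class<?>... parameterTypes) {
        try {
            // initialize=false: load the class without running its static initializers.
            Class<?> type = Class.forName(className, false, LinkageCheck.class.getClassLoader());
            type.getMethod(methodName, parameterTypes);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // JDK classes are used here so the example runs without extra jars.
        System.out.println(hasPublicMethod("java.lang.String", "isEmpty"));
        System.out.println(hasPublicMethod("java.lang.String", "noSuchMethod"));
    }
}
```

For the case in the log, the check would be made against
org.bouncycastle.asn1.ASN1ObjectIdentifier with the method name "equals" and
the ASN1Primitive class as the single parameter type, assuming both classes
are on the connector's classpath.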

-----Original Message-----
From: Karl Wright <da...@gmail.com>
Sent: Wednesday, March 23, 2022 16:32
To: dev <de...@manifoldcf.apache.org>
Subject: Re: WorkerThread runtime exceptions

Specifically, the stuffer thread is responsible for finding documents to process and getting them to the worker threads via the internal queue that the worker threads wait on.  The stuffer thread uses a query to do this.
Either the query is not finding any documents, or the stuffer thread is down.  Probably it is the former, and the reason it is not finding any documents is because the job is in the wrong state due to that runtime exception.

Can you describe what code is throwing that runtime exception?  It would be very helpful if you could provide a stack trace for it from the log.

Karl



Re: WorkerThread runtime exceptions

Posted by Karl Wright <da...@gmail.com>.

Specifically, the stuffer thread is responsible for finding documents to
process and getting them to the worker threads via the internal queue that
the worker threads wait on.  The stuffer thread uses a query to do this.
Either the query is not finding any documents, or the stuffer thread is
down.  Probably it is the former, and the reason it is not finding any
documents is because the job is in the wrong state due to that runtime
exception.

Can you describe what code is throwing that runtime exception?  It would be
very helpful if you could provide a stack trace for it from the log.

Karl
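
[Editorial sketch] The division of labor described above, where a stuffer
thread runs a query and feeds the queue that worker threads wait on, can be
modeled in a few lines. Everything here is hypothetical: the state names, the
class, and the in-memory "query" are stand-ins chosen for illustration, not
ManifoldCF's actual job states or SQL.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.List;
import java.util.Queue;

// Hypothetical model; state names and classes are illustrative only.
class StufferSketch {

    enum JobState { ACTIVE, PAUSED, ABORTING }

    /** In-memory hand-off queue that worker threads would wait on. */
    final Queue<String> documentQueue = new ArrayDeque<>();

    /** Stands in for the stuffer's query: only ACTIVE jobs yield documents. */
    static List<String> findDocuments(JobState state, List<String> pending) {
        if (state == JobState.ACTIVE) {
            return pending;
        }
        return Collections.emptyList();
    }

    /** One stuffer pass: run the query and hand the results to the workers. */
    int stuffOnce(JobState state, List<String> pending) {
        List<String> found = findDocuments(state, pending);
        documentQueue.addAll(found);
        return found.size();
    }
}
```

With a job in any state other than ACTIVE, stuffOnce returns 0 and the queue
stays empty even though documents are pending, which is the "query is not
finding any documents" outcome mentioned above: the workers are healthy but
have nothing to take.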


On Wed, Mar 23, 2022 at 11:27 AM Karl Wright <da...@gmail.com> wrote:

> ' 1/ On the first and only one document of the seeding phase encountered,
> a runtime exception is triggered'
>
> The worker threads do not handle seeding.  If a runtime exception takes
> place during seeding, no documents will be queued, and that is the
> problem.  The state of the job must be incorrectly updated even though the
> seeding failed.  OR the job's state is properly updated but the
> corresponding thread that is supposed to know when the job is completed (by
> looking at the job queue) doesn't properly trigger.
>
> The architecture of ManifoldCF has many threads that are individually
> responsible for transitioning the job state based on the jobqueue.  If
> somehow the jobstate winds up not in the right state then those threads
> will not do the right thing.
>
> Karl
>
>

Re: WorkerThread runtime exceptions

Posted by Karl Wright <da...@gmail.com>.

' 1/ On the first and only one document of the seeding phase encountered, a
runtime exception is triggered'

The worker threads do not handle seeding.  If a runtime exception takes
place during seeding, no documents will be queued, and that is the
problem.  The state of the job must be incorrectly updated even though the
seeding failed.  OR the job's state is properly updated but the
corresponding thread that is supposed to know when the job is completed (by
looking at the job queue) doesn't properly trigger.

The architecture of ManifoldCF has many threads that are individually
responsible for transitioning the job state based on the jobqueue.  If
somehow the jobstate winds up not in the right state then those threads
will not do the right thing.

Karl
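
[Editorial sketch] The point about state-driven threads can be illustrated
with a toy model in which a "finisher" pass only acts on jobs that are in the
state it expects. The states, the Job class, and the finisherPass method are
all hypothetical and greatly simplified; they are not ManifoldCF's real job
states or threads.

```java
// Hypothetical model of a state-driven housekeeping thread; the names are
// illustrative and are not taken from ManifoldCF.
class JobStateSketch {

    enum JobState { SEEDING, ACTIVE, FINISHED }

    static final class Job {
        JobState state;
        int queuedDocuments;

        Job(JobState state, int queuedDocuments) {
            this.state = state;
            this.queuedDocuments = queuedDocuments;
        }
    }

    /**
     * One pass of a "finisher" thread: it only acts on ACTIVE jobs whose
     * queue has drained, and ignores jobs in any other state.
     */
    static boolean finisherPass(Job job) {
        if (job.state == JobState.ACTIVE && job.queuedDocuments == 0) {
            job.state = JobState.FINISHED;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Job healthy = new Job(JobState.ACTIVE, 0);
        Job wrongState = new Job(JobState.SEEDING, 0);
        System.out.println(finisherPass(healthy) + " " + healthy.state);
        System.out.println(finisherPass(wrongState) + " " + wrongState.state);
    }
}
```

The second job has nothing left to process, yet no pass ever completes it
because it is not in the state the finisher looks for. A job whose queue
count never reaches zero is stuck in the same way.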


On Wed, Mar 23, 2022 at 11:08 AM Julien Massiera <
julien.massiera@francelabs.com> wrote:

> Hi Karl,
>
> I had some time to investigate the problem I described in my first mail,
> and here is the behavior I observed:
>
> 1/ On the first and only one document of the seeding phase encountered, a
> runtime exception is triggered
> 2/ The runtime exception is caught by the WorkerThread, logged, and the
> WorkerThread stays alive (line 856 of the WorkerThread class)
> 3/ The WorkerThread calls the getDocument method of its documentQueue
> (line 121 of the WorkerThread class)
> 4/ The documentQueue ends up in an infinite 'wait' state because the queue
> size is 0 and the resetFlag is false (lines 109 and 110 of the
> DocumentQueue class)
> 5/ Because of the infinite 'wait' state of the documentQueue, the job
> stays frozen in the 'running' state and it is impossible to stop it until
> the Agent is restarted
>
> I don't know much about the WorkerThread and the DocumentQueue logic, so
> from there, I really need your help to understand this behavior and to
> figure out what can be done to prevent the job from hanging in that case,
> which, I assume, can happen in other circumstances with other repository
> connectors
>
> Regards,
> Julien
>

RE: WorkerThread runtime exceptions

Posted by Julien Massiera <ju...@francelabs.com>.

Hi Karl,

I had some time to investigate the problem I described in my first mail, and here is the behavior I observed:

1/ On the first and only one document of the seeding phase encountered, a runtime exception is triggered
2/ The runtime exception is caught by the WorkerThread, logged, and the WorkerThread stays alive (line 856 of the WorkerThread class)
3/ The WorkerThread calls the getDocument method of its documentQueue (line 121 of the WorkerThread class)
4/ The documentQueue ends up in an infinite 'wait' state because the queue size is 0 and the resetFlag is false (lines 109 and 110 of the DocumentQueue class)
5/ Because of the infinite 'wait' state of the documentQueue, the job stays frozen in the 'running' state and it is impossible to stop it until the Agent is restarted

I don't know much about the WorkerThread and the DocumentQueue logic, so from there I really need your help to understand this behavior and to figure out what can be done to prevent the job from hanging in that case, which, I assume, can happen in other circumstances with other repository connectors.

Regards,
Julien
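
[Editorial sketch] Steps 3/ to 5/ above describe a classic monitor wait: the
worker blocks while the queue is empty and the reset flag is false, and
nothing but a new document or a reset can wake it. The class below is a
hypothetical simplification written from that description; it is not the
actual ManifoldCF DocumentQueue source, and only the names getDocument and
resetFlag are borrowed from the message.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical simplification; not the actual ManifoldCF DocumentQueue source.
class WaitingQueueSketch {

    private final Queue<String> queue = new ArrayDeque<>();
    private boolean resetFlag = false;

    /** Blocks until a document is available; returns null if the queue is reset. */
    synchronized String getDocument() throws InterruptedException {
        while (queue.isEmpty() && !resetFlag) {
            // No timeout: only addDocument() or reset() can end this wait.
            wait();
        }
        return resetFlag ? null : queue.poll();
    }

    /** What a stuffer-like producer would call to hand over a document. */
    synchronized void addDocument(String documentId) {
        queue.add(documentId);
        notifyAll();
    }

    /** Wakes every waiting worker and makes getDocument() return null. */
    synchronized void reset() {
        resetFlag = true;
        notifyAll();
    }
}
```

In this model the waiting worker is not itself broken. It stays parked
because no producer ever calls addDocument and nothing calls reset, which is
consistent with the observation that the thread is alive but idle.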

-----Original Message-----
From: Julien Massiera <ju...@francelabs.com>
Sent: Thursday, February 24, 2022 15:08
To: dev@manifoldcf.apache.org
Subject: RE: WorkerThread runtime exceptions

Yes I understand


RE: WorkerThread runtime exceptions

Posted by Julien Massiera <ju...@francelabs.com>.

Yes I understand

-----Original Message-----
From: Karl Wright <da...@gmail.com>
Sent: Thursday, February 24, 2022 14:59
To: dev <de...@manifoldcf.apache.org>
Subject: Re: WorkerThread runtime exceptions

I'm currently completely consumed with upgrading dependencies for Tika and CXF.  This is a massive job and won't be done for probably another week or two.  Once that is done I can try to look into your concern.

Karl



Re: WorkerThread runtime exceptions

Posted by Karl Wright <da...@gmail.com>.

I'm currently completely consumed with upgrading dependencies for Tika and
CXF.  This is a massive job and won't be done for probably another week or
two.  Once that is done I can try to look into your concern.

Karl

