Posted to dev@manifoldcf.apache.org by Farzad Valad <ho...@farzad.net> on 2011/06/03 19:02:26 UTC

Exception Handling

So my output connector connects to another repository.  If I can't log in 
to that repository, I execute the following line: "throw new 
ManifoldCFException("txn [" + txn + "] failed with error " + 
e.toString(), e, ManifoldCFException.REPOSITORY_CONNECTION_ERROR);"

ManifoldCF continues the crawl and only puts out a WARN message.  I 
expected ManifoldCF to halt the job and show the error in the UI; at 
least, that is my desired outcome.  Do I need to throw a different 
exception type than "Repository Connection Error"?  Here is what I get in 
the log file:

  WARN 2011-06-01 15:51:42,276 [Worker thread '27'] (WorkerThread.java:855) - Connection service interruption reported for job 1306961303236 connection 'FileShare': txn [login] failed with error org.apache.http.conn.HttpHostConnectException: Connection to http://valadbld:34544 refused
org.apache.manifoldcf.core.interfaces.ManifoldCFException: txn [login] failed with error org.apache.http.conn.HttpHostConnectException: Connection to http://valadbld:34544 refused
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:266)
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:318)
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:314)
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.Login(CIConnector.java:134)
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.initialize(CIConnector.java:114)
     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.getSession(DupFinderConnector.java:261)
     at org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:137)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
     at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
     at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
     at org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
     at org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
     at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
Caused by: org.apache.http.conn.HttpHostConnectException: Connection to http://valadbld:34544 refused
     at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:158)
     at org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149)
     at org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
     at org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:561)
     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
     at org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:202)
     ... 13 more
Caused by: java.net.ConnectException: Connection timed out: connect
     at java.net.PlainSocketImpl.socketConnect(Native Method)
     at java.net.PlainSocketImpl.doConnect(Unknown Source)
     at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
     at java.net.PlainSocketImpl.connect(Unknown Source)
     at java.net.SocksSocketImpl.connect(Unknown Source)
     at java.net.Socket.connect(Unknown Source)
     at org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:123)
     at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
     ... 20 more

Re: Exception Handling

Posted by Karl Wright <da...@gmail.com>.
The retryTime parameter is the interval at which it will retry.  The
failTime parameter is when the retries should give up.  Only AFTER
that point is the decision made to skip or abort.

Karl
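
The timing described above can be sketched as a small model.  This is
only an illustration: the class, enum, and field names below are
hypothetical stand-ins and are NOT the ManifoldCF ServiceInterruption
API.  The sketch treats both settings as durations measured from the
first failure; whether the real parameters are intervals or absolute
timestamps is not settled by this thread and should be checked against
the ServiceInterruption source linked in the quoted reply.

```java
// Sketch only: these types are hypothetical stand-ins, not ManifoldCF classes.
class RetryWindowSketch {

    /** Possible outcomes for a failed request (hypothetical names). */
    enum Action { RETRY_LATER, SKIP_DOCUMENT, ABORT_JOB }

    /** Timing settings, modelled here as durations relative to the first failure. */
    static final class Policy {
        final long retryIntervalMs; // wait between attempts ("retryTime" in the thread)
        final long failAfterMs;     // how long to keep retrying ("failTime" in the thread)
        final boolean abortOnFail;  // true: abort the job; false: skip the document

        Policy(long retryIntervalMs, long failAfterMs, boolean abortOnFail) {
            this.retryIntervalMs = retryIntervalMs;
            this.failAfterMs = failAfterMs;
            this.abortOnFail = abortOnFail;
        }
    }

    /** Decides what to do at nowMs for a failure first seen at firstFailureMs. */
    static Action decide(Policy policy, long firstFailureMs, long nowMs) {
        long elapsed = nowMs - firstFailureMs;
        if (elapsed < policy.failAfterMs) {
            // Still inside the retry window: no skip/abort decision is made yet.
            return Action.RETRY_LATER;
        }
        // Only after the window has closed is skip vs. abort chosen.
        return policy.abortOnFail ? Action.ABORT_JOB : Action.SKIP_DOCUMENT;
    }

    /** Time of the next attempt while the retry window is still open. */
    static long nextAttemptMs(Policy policy, long nowMs) {
        return nowMs + policy.retryIntervalMs;
    }

    public static void main(String[] args) {
        Policy abortAfterOneMinute = new Policy(5_000L, 60_000L, true);
        System.out.println(decide(abortAfterOneMinute, 0L, 30_000L)); // RETRY_LATER
        System.out.println(decide(abortAfterOneMinute, 0L, 60_000L)); // ABORT_JOB

        Policy skipImmediately = new Policy(0L, 0L, false);
        System.out.println(decide(skipImmediately, 0L, 0L));          // SKIP_DOCUMENT
    }
}
```

In this model a zero-length window means the skip-or-abort decision is
taken on the first failure.  That is a property of the sketch, not a
confirmed statement about how ManifoldCF treats zero values.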

On Fri, Jun 3, 2011 at 2:04 PM, Farzad Valad <ho...@farzad.net> wrote:
> Got it working; I have a question about retryTime and failTime.  From your
> reply I got the impression that the user will get the choice to skip or
> abort, so what do you set these params to? 0?  Thanks!
>
> On 6/3/2011 12:11 PM, Karl Wright wrote:
>>
>> Your choice of exception would have been fine if this was a repository
>> connector, but output connectors do not have the same ability to abort
>> jobs via ManifoldCFExceptions at this time.  (You can create a ticket
>> if you think this is how it should work).  But if you want the job to
>> abort, you probably want to throw a ServiceInterruption exception,
>> with zero retries.  You have a choice of "skip" or "abort job" as
>> actions.  I recently made this work, so let me know if you encounter
>> any problems.
>>
>>
>> http://svn.apache.org/repos/asf/incubator/lcf/trunk/framework/agents/src/main/java/org/apache/manifoldcf/agents/interfaces/ServiceInterruption.java
>>
>> Karl

Re: Exception Handling

Posted by Farzad Valad <ho...@farzad.net>.
Got it working; I have a question about retryTime and failTime.  From your 
reply I got the impression that the user will get the choice to skip or 
abort, so what do you set these params to? 0?  Thanks!


Re: Exception Handling

Posted by Karl Wright <da...@gmail.com>.
CONNECTORS-207 describes the situation.
Karl

Re: Exception Handling

Posted by Karl Wright <da...@gmail.com>.
I remember now.
The problem was that the LiveLink API code, under certain conditions,
"lied" about the error it got back from the server.  Under these
conditions, therefore, a job would sometimes abort if a transient
error occurred.  The fix for this problem was made at the framework
level because the CIFS connector also suffers from this same kind of
problem, where a network glitch could cause a job to incorrectly abort
for connection reasons.

In both cases, the underlying problems were resolved eventually by
other means - in the case of LiveLink, by periodically restarting the
LiveLink server, and in the case of CIFS, by fixing a too-short
timeout in jcifs.  So, in theory, this retry logic could be removed.

I'll create a ticket to research this further.

Karl

Re: Exception Handling

Posted by Karl Wright <da...@gmail.com>.
Actually, looking at the code, a ManifoldCFException of type
REPOSITORY_CONNECTION_ERROR is retried very specifically in this way for
both repository and output connectors.  Any other ManifoldCFException
type (except INTERRUPTED) will cause the job to abort.  I'm having a
hard time recollecting the reason for this special behavior, but I
vaguely recall it had something to do with the LiveLink connector.
I'll post later if it comes back to me.

Karl

>>    ... 13 more
>> Caused by: java.net.ConnectException: Connection timed out: connect
>>    at java.net.PlainSocketImpl.socketConnect(Native Method)
>>    at java.net.PlainSocketImpl.doConnect(Unknown Source)
>>    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
>>    at java.net.PlainSocketImpl.connect(Unknown Source)
>>    at java.net.SocksSocketImpl.connect(Unknown Source)
>>    at java.net.Socket.connect(Unknown Source)
>>    at
>> org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:123)
>>    at
>> org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
>>    ... 20 more
>>
>

Re: Exception Handling

Posted by Karl Wright <da...@gmail.com>.
Your choice of exception would have been fine if this were a repository
connector, but output connectors do not have the same ability to abort
jobs via ManifoldCFExceptions at this time.  (You can create a ticket
if you think this is how it should work.)  But if you want the job to
abort, you probably want to throw a ServiceInterruption exception
with zero retries.  You have a choice of "skip" or "abort job" as
actions.  I recently made this work, so let me know if you encounter
any problems.

http://svn.apache.org/repos/asf/incubator/lcf/trunk/framework/agents/src/main/java/org/apache/manifoldcf/agents/interfaces/ServiceInterruption.java

Karl
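
For illustration, here is a minimal sketch of the "zero retries, abort job"
suggestion above.  It is not code from the connector in question.  So that
it compiles without the ManifoldCF jars, it declares a local stand-in for
ServiceInterruption; the six-argument constructor shape (message, cause,
retry time, fail time, retry count, abort flag) and the -1 "no cutoff"
convention are assumptions that should be checked against the class linked
above.  LoginSketch, abortJob and login are hypothetical names.

```java
import java.io.IOException;

// Local stand-in so this sketch compiles on its own. The real class is
// org.apache.manifoldcf.agents.interfaces.ServiceInterruption; the
// constructor shape below is an assumption, verify it against the source.
class ServiceInterruption extends Exception {
  final long retryTime;
  final long failTime;
  final int failRetryCount;
  final boolean abortOnFail;

  ServiceInterruption(String message, Throwable cause, long retryTime,
                      long failTime, int failRetryCount, boolean abortOnFail) {
    super(message, cause);
    this.retryTime = retryTime;
    this.failTime = failTime;
    this.failRetryCount = failRetryCount;
    this.abortOnFail = abortOnFail;
  }
}

class LoginSketch {
  // Hypothetical helper: wraps a failure so the framework is asked to
  // abort the job instead of logging a WARN and carrying on.
  static ServiceInterruption abortJob(String txn, Exception e) {
    long now = System.currentTimeMillis();
    return new ServiceInterruption(
      "txn [" + txn + "] failed with error " + e.toString(), e,
      now,    // retry time: irrelevant here, since no retries are allowed
      -1L,    // fail time: no time-based cutoff (assumed convention)
      0,      // zero retries
      true);  // true = "abort job", false = "skip"
  }

  // Hypothetical login; 'reachable' simulates whether the target answers.
  static void login(boolean reachable) throws ServiceInterruption {
    try {
      if (!reachable) {
        throw new IOException("Connection refused");
      }
    } catch (IOException e) {
      throw abortJob("login", e);
    }
  }

  public static void main(String[] args) {
    try {
      login(false);
    } catch (ServiceInterruption e) {
      System.out.println(e.getMessage() + " (abortOnFail=" + e.abortOnFail + ")");
    }
  }
}
```

In a real output connector the throw would presumably sit where the
ManifoldCFException is thrown today (the sendTxn/Login path in the trace
below), and passing false as the last argument would select "skip" instead
of "abort job".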

On Fri, Jun 3, 2011 at 1:02 PM, Farzad Valad <ho...@farzad.net> wrote:
> So my output connector connects to another repository.  If I can't log in to
> that repository, I execute the following line "throw new
> ManifoldCFException("txn [" + txn + "] failed with error " + e.toString(),
> e, ManifoldCFException.REPOSITORY_CONNECTION_ERROR);"
>
> ManifoldCF continues the crawl and actually puts out a WARN message.  I
> expected ManifoldCF to halt the job and show the error in the UI, at least
> that is my desired outcome.  Do I need a different exception type to throw
> besides "Repository Connection Error"?  Here is what I get in the log file:
>
>  WARN 2011-06-01 15:51:42,276 [Worker thread '27'] (WorkerThread.java:855) -
> Connection service interruption reported for job 1306961303236 connection
> 'FileShare': txn [login] failed with error
> org.apache.http.conn.HttpHostConnectException: Connection to
> http://valadbld:34544 refused
> org.apache.manifoldcf.core.interfaces.ManifoldCFException: txn [login]
> failed with error org.apache.http.conn.HttpHostConnectException: Connection
> to http://valadbld:34544 refused
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:266)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:318)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:314)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.Login(CIConnector.java:134)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.initialize(CIConnector.java:114)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.getSession(DupFinderConnector.java:261)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.DupFinderConnector.addOrReplaceDocument(DupFinderConnector.java:137)
>    at
> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.addOrReplaceDocument(IncrementalIngester.java:1433)
>    at
> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.performIngestion(IncrementalIngester.java:418)
>    at
> org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:313)
>    at
> org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocument(WorkerThread.java:1565)
>    at
> org.apache.manifoldcf.crawler.connectors.filesystem.FileConnector.processDocuments(FileConnector.java:275)
>    at
> org.apache.manifoldcf.crawler.connectors.BaseRepositoryConnector.processDocuments(BaseRepositoryConnector.java:423)
>    at
> org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:564)
> Caused by: org.apache.http.conn.HttpHostConnectException: Connection to
> http://valadbld:34544 refused
>    at
> org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:158)
>    at
> org.apache.http.impl.conn.AbstractPoolEntry.open(AbstractPoolEntry.java:149)
>    at
> org.apache.http.impl.conn.AbstractPooledConnAdapter.open(AbstractPooledConnAdapter.java:121)
>    at
> org.apache.http.impl.client.DefaultRequestDirector.tryConnect(DefaultRequestDirector.java:561)
>    at
> org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:415)
>    at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
>    at
> org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
>    at
> org.apache.manifoldcf.agents.output.dupfinder.CIConnector.sendTxn(CIConnector.java:202)
>    ... 13 more
> Caused by: java.net.ConnectException: Connection timed out: connect
>    at java.net.PlainSocketImpl.socketConnect(Native Method)
>    at java.net.PlainSocketImpl.doConnect(Unknown Source)
>    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
>    at java.net.PlainSocketImpl.connect(Unknown Source)
>    at java.net.SocksSocketImpl.connect(Unknown Source)
>    at java.net.Socket.connect(Unknown Source)
>    at
> org.apache.http.conn.scheme.PlainSocketFactory.connectSocket(PlainSocketFactory.java:123)
>    at
> org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:148)
>    ... 20 more
>