Posted to dev@manifoldcf.apache.org by "Karl Wright (JIRA)" <ji...@apache.org> on 2016/05/04 21:45:13 UTC

[jira] [Commented] (CONNECTORS-1312) jcifs.smb.SmbException: Connection reset by peer: socket write error

    [ https://issues.apache.org/jira/browse/CONNECTORS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15271490#comment-15271490 ] 

Karl Wright commented on CONNECTORS-1312:
-----------------------------------------

For SmbAuthException errors, we *do* want to skip the document and continue.  Individual documents have individual authorization, so failing on one document does not mean failing on all.  We have handled this situation for many years:

{code}
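      catch (jcifs.smb.SmbAuthException e)
      {
        Logging.connectors.warn("JCIFS: Authorization exception reading version information for " + documentIdentifier + " - skipping");
        // Bad credentials are not specific to this document, so fail the job.
        // The constant is on the left so a null message cannot cause a NullPointerException.
        if ("Logon failure: unknown user name or bad password.".equals(e.getMessage()))
          throw new ManifoldCFException("SmbAuthException thrown: " + e.getMessage(), e);
        else
        {
          // Any other authorization failure applies to this document only: remove it and move on.
          activities.deleteDocument(documentIdentifier);
          continue;
        }
      }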
{code}
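
The decision made in that catch block can also be sketched in isolation. This is only a minimal illustration, not the connector's code: `AuthFailurePolicy`, `Action`, and `classify` are hypothetical names, and the "Access is denied." message is just an example input. Only the logon-failure string is taken from the snippet above.

```java
/** Hypothetical helper for illustration; not part of ManifoldCF or jCIFS. */
final class AuthFailurePolicy {
    /** What the crawler should do after an authorization failure. */
    enum Action { ABORT_JOB, SKIP_DOCUMENT }

    // Message string taken from the catch block quoted above.
    static final String LOGON_FAILURE =
        "Logon failure: unknown user name or bad password.";

    /** Bad credentials abort the job; anything else skips only this document. */
    static Action classify(String exceptionMessage) {
        // Constant on the left, so a null message is treated as "skip" instead of throwing.
        return LOGON_FAILURE.equals(exceptionMessage)
            ? Action.ABORT_JOB
            : Action.SKIP_DOCUMENT;
    }

    public static void main(String[] args) {
        System.out.println(classify(LOGON_FAILURE));       // prints ABORT_JOB
        System.out.println(classify("Access is denied.")); // prints SKIP_DOCUMENT
    }
}
```

Keeping the check as an exact match on one known message means any unrecognized authorization failure falls through to the per-document skip, which matches the behavior described above.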


> jcifs.smb.SmbException: Connection reset by peer: socket write error
> --------------------------------------------------------------------
>
>                 Key: CONNECTORS-1312
>                 URL: https://issues.apache.org/jira/browse/CONNECTORS-1312
>             Project: ManifoldCF
>          Issue Type: Bug
>          Components: JCIFS connector
>    Affects Versions: ManifoldCF 2.5
>         Environment: Windows x64, java 1.8.x
>            Reporter: Konstantin Avdeev
>
> Hi Karl,
> We've found another JCIFS exception: Windows share jobs stop when they encounter a "Connection reset by peer" error, e.g.:
> {code}
> ERROR 2016-05-03 15:29:24,209 (Worker thread '80') - JCIFS: SmbException tossed processing smb://server.domain.com/path/file.ppt
> jcifs.smb.SmbException: Connection reset by peer: socket write error
> java.net.SocketException: Connection reset by peer: socket write error
> 	at java.net.SocketOutputStream.socketWrite0(Native Method)
> 	at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> 	at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> 	at jcifs.smb.SmbTransport.doSend(SmbTransport.java:453)
> 	at jcifs.util.transport.Transport.sendrecv(Transport.java:67)
> 	at jcifs.smb.SmbTransport.send(SmbTransport.java:655)
> 	at jcifs.smb.SmbSession.send(SmbSession.java:238)
> 	at jcifs.smb.SmbTree.send(SmbTree.java:119)
> 	at jcifs.smb.SmbFile.send(SmbFile.java:775)
> 	at jcifs.smb.SmbFileInputStream.readDirect(SmbFileInputStream.java:181)
> 	at jcifs.smb.SmbFileInputStream.read(SmbFileInputStream.java:142)
> 	at java.io.BufferedInputStream.read1(BufferedInputStream.java:284)
> 	at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
> 	at java.io.FilterInputStream.read(FilterInputStream.java:107)
> 	at java.nio.file.Files.copy(Files.java:2908)
> 	at java.nio.file.Files.copy(Files.java:3027)
> 	at org.apache.tika.io.TikaInputStream.getPath(TikaInputStream.java:587)
> 	at org.apache.tika.io.TikaInputStream.getFile(TikaInputStream.java:615)
> 	at org.apache.tika.parser.microsoft.POIFSContainerDetector.getTopLevelNames(POIFSContainerDetector.java:358)
> 	at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:424)
> 	at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:77)
> 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:112)
> 	at org.apache.manifoldcf.agents.transformation.tika.TikaParser.parse(TikaParser.java:48)
> 	at org.apache.manifoldcf.agents.transformation.tika.TikaExtractor.addOrReplaceDocumentWithException(TikaExtractor.java:227)
> 	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddEntryPoint.addOrReplaceDocumentWithException(IncrementalIngester.java:3224)
> 	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineAddFanout.sendDocument(IncrementalIngester.java:3075)
> 	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester$PipelineObjectWithVersions.addOrReplaceDocumentWithException(IncrementalIngester.java:2706)
> 	at org.apache.manifoldcf.agents.incrementalingest.IncrementalIngester.documentIngest(IncrementalIngester.java:756)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1583)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread$ProcessActivity.ingestDocumentWithException(WorkerThread.java:1548)
> 	at org.apache.manifoldcf.crawler.connectors.sharedrive.SharedDriveConnector.processDocuments(SharedDriveConnector.java:979)
> 	at org.apache.manifoldcf.crawler.system.WorkerThread.run(WorkerThread.java:399)
> {code}
> Current workaround: start the job again (manually or via the scheduler).
> Clearly, there are many errors for which it makes no sense to skip a failed URL and continue the job, e.g.:
> {code}
> Error: SmbAuthException thrown: Logon failure: unknown user name or bad password.
> {code}
> I'm thinking about a general solution, such as defining a list (through the UI or properties.xml) of non-severe exceptions, like "file busy" or "symlink detected", so that admins could specify when the crawler should stop and when it should retry or skip and continue.
> What do you think?
> Thank you!
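
The configurable list proposed in the description could look roughly like the sketch below. Everything here is an assumption made for illustration: `ExceptionPolicy`, `Action`, `rule`, and `decide` are hypothetical names, matching on message substrings is just one possible design, and nothing like this is known to exist in ManifoldCF or its properties.xml.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of an admin-configurable exception policy. */
final class ExceptionPolicy {
    enum Action { ABORT, RETRY, SKIP }

    // Insertion-ordered, so the first matching rule wins.
    private final Map<String, Action> rules = new LinkedHashMap<>();
    private final Action defaultAction;

    ExceptionPolicy(Action defaultAction) {
        this.defaultAction = defaultAction;
    }

    /** Registers a rule: messages containing the fragment map to the action. */
    ExceptionPolicy rule(String messageFragment, Action action) {
        rules.put(messageFragment, action);
        return this;
    }

    /** Returns the action of the first matching rule, or the default. */
    Action decide(Throwable t) {
        String message = t.getMessage();
        if (message != null) {
            for (Map.Entry<String, Action> entry : rules.entrySet()) {
                if (message.contains(entry.getKey())) {
                    return entry.getValue();
                }
            }
        }
        return defaultAction;
    }

    public static void main(String[] args) {
        ExceptionPolicy policy = new ExceptionPolicy(Action.ABORT)
            .rule("Connection reset by peer", Action.RETRY)
            .rule("Logon failure", Action.ABORT);
        System.out.println(policy.decide(
            new java.io.IOException("Connection reset by peer: socket write error")));
    }
}
```

Defaulting to `ABORT` for unmatched messages keeps the existing stop-on-unknown-error behavior, so only the errors an admin explicitly lists would be retried or skipped.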


