Posted to dev@tika.apache.org by "Tyler Palsulich (JIRA)" <ji...@apache.org> on 2015/03/15 23:12:38 UTC

[jira] [Commented] (TIKA-1191) ForkParser / ClassLoaderProxy does not define package

    [ https://issues.apache.org/jira/browse/TIKA-1191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14362588#comment-14362588 ] 

Tyler Palsulich commented on TIKA-1191:
---------------------------------------

Here is an updated stack trace for Tika 1.8-SNAPSHOT. It looks like POIFSContainerDetector.detect is calling reset() on a stream that doesn't support mark/reset:
{code}
➜  trunk  tika -z https://issues.apache.org/jira/secure/attachment/12657409/test.eml
Exception in thread "main" org.apache.tika.exception.TikaException: Failed to parse an email message
	at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:79)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:270)
	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
	at org.apache.tika.cli.TikaCLI$OutputType.process(TikaCLI.java:153)
	at org.apache.tika.cli.TikaCLI.process(TikaCLI.java:450)
	at org.apache.tika.cli.TikaCLI.main(TikaCLI.java:123)
Caused by: java.io.IOException: mark/reset not supported
	at java.io.InputStream.reset(InputStream.java:347)
	at org.apache.tika.parser.microsoft.POIFSContainerDetector.detect(POIFSContainerDetector.java:161)
	at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:61)
	at org.apache.tika.cli.TikaCLI$FileEmbeddedDocumentExtractor.parseEmbedded(TikaCLI.java:918)
	at org.apache.tika.parser.mail.MailContentHandler.body(MailContentHandler.java:110)
	at org.apache.james.mime4j.parser.MimeStreamParser.parse(MimeStreamParser.java:133)
	at org.apache.tika.parser.mail.RFC822Parser.parse(RFC822Parser.java:76)
	... 6 more
{code}
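
For anyone who wants to reproduce the "mark/reset not supported" failure outside of Tika, here is a minimal sketch in plain Java. It is not Tika code and not a proposed patch: the class name MarkResetSketch and the helpers nonMarkable, peek, and ensureMarkable are made up for illustration. It only assumes that a detector-style caller does mark(), reads a few bytes, and then calls reset(), and it shows that wrapping the stream in a BufferedInputStream makes that sequence succeed.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class MarkResetSketch {

    /** Hypothetical stand-in for a stream that does not support mark/reset. */
    static InputStream nonMarkable(byte[] data) {
        final ByteArrayInputStream source = new ByteArrayInputStream(data);
        // The base InputStream reports markSupported() == false and its
        // reset() throws an IOException, so only read() is overridden here.
        return new InputStream() {
            @Override
            public int read() {
                return source.read();
            }
        };
    }

    /** Detector-style peek: mark, read up to n bytes, then rewind. */
    static byte[] peek(InputStream in, int n) throws IOException {
        in.mark(n);
        byte[] head = new byte[n];
        int total = 0;
        while (total < n) {
            int r = in.read(head, total, n - total);
            if (r < 0) {
                break; // end of stream before n bytes were available
            }
            total += r;
        }
        in.reset(); // throws if the stream does not support mark/reset
        return Arrays.copyOf(head, total);
    }

    /** Wraps the stream in a BufferedInputStream only when it needs it. */
    static InputStream ensureMarkable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "From: someone@example.org".getBytes(StandardCharsets.US_ASCII);

        try {
            peek(nonMarkable(data), 4);
        } catch (IOException e) {
            System.out.println("unwrapped stream failed: " + e.getMessage());
        }

        InputStream wrapped = ensureMarkable(nonMarkable(data));
        byte[] head = peek(wrapped, 4);
        System.out.println("peeked: " + new String(head, StandardCharsets.US_ASCII));
        System.out.println("next byte after reset: " + (char) wrapped.read());
    }
}
```

Whether wrapping belongs in the caller that hands the stream to the detector or in the detector itself is a separate question; the sketch only demonstrates the failure mode and one generic way around it.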

> ForkParser / ClassLoaderProxy does not define package
> -----------------------------------------------------
>
>                 Key: TIKA-1191
>                 URL: https://issues.apache.org/jira/browse/TIKA-1191
>             Project: Tika
>          Issue Type: Bug
>          Components: parser
>    Affects Versions: 1.4, 1.5
>            Reporter: Nicolas Belisle
>         Attachments: ClassLoaderProxy.java.patch, Test.java, test.eml
>
>
> ForkParser will throw an Exception in some cases:
> org.apache.tika.exception.TikaException: Invalid embedded resource
> 	at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:189)
> 	at org.apache.tika.parser.microsoft.WordExtractor.parse(WordExtractor.java:135)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:186)
> 	at org.apache.tika.parser.microsoft.OfficeParser.parse(OfficeParser.java:161)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.tika.fork.ForkServer.call(ForkServer.java:144)
> 	at org.apache.tika.fork.ForkServer.processRequests(ForkServer.java:124)
> 	at org.apache.tika.fork.ForkServer.main(ForkServer.java:69)
> Caused by: java.lang.NullPointerException
> 	at org.apache.tika.mime.MimeTypesFactory.create(MimeTypesFactory.java:136)
> 	at org.apache.tika.mime.MimeTypes.getDefaultMimeTypes(MimeTypes.java:499)
> 	at org.apache.tika.config.TikaConfig.getDefaultMimeTypes(TikaConfig.java:60)
> 	at org.apache.tika.config.TikaConfig.<init>(TikaConfig.java:169)
> 	at org.apache.tika.config.TikaConfig.getDefaultConfig(TikaConfig.java:268)
> 	at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.getTikaConfig(AbstractPOIFSExtractor.java:72)
> 	at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.getDetector(AbstractPOIFSExtractor.java:79)
> 	at org.apache.tika.parser.microsoft.AbstractPOIFSExtractor.handleEmbeddedOfficeDoc(AbstractPOIFSExtractor.java:176)
> 	... 10 more
> A patch will follow.


