Posted to user@tika.apache.org by Arthur Meneau <am...@xetus.com> on 2011/12/05 23:43:35 UTC
Apple iWork document parsing
I am having trouble parsing iWork documents with Tika 1.0. The documents were saved with the application versions that Tika's API documentation lists as supported (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copied and pasted the error I am receiving below. How can I get iWork documents to parse correctly?
Thanks,
-Arthur Meneau
Stack Trace:
java.lang.NullPointerException
at org.apache.tika.parser.iwork.IWorkPackageParser$IWORKDocumentType.detectType(IWorkPackageParser.java:125)
at org.apache.tika.parser.iwork.IWorkPackageParser$IWORKDocumentType.detectType(IWorkPackageParser.java:106)
at org.apache.tika.parser.pkg.ZipContainerDetector.detectIWork(ZipContainerDetector.java:163)
at org.apache.tika.parser.pkg.ZipContainerDetector.detect(ZipContainerDetector.java:76)
at org.apache.tika.detect.CompositeDetector.detect(CompositeDetector.java:60)
at org.apache.tika.Tika.detect(Tika.java:133)
at org.apache.tika.Tika.detect(Tika.java:267)
at org.apache.tika.Tika.detect(Tika.java:248)
at xetus.util.io.FileAnalyzer.getMetadata(FileAnalyzer.java:156)
at xetus.util.io.FileAnalyzer.getMetadata(FileAnalyzer.java:72)
at xetus.util.io.BulkFileAnalyzerTest.testBulkFileTypeDetection(BulkFileAnalyzerTest.java:137)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at junit.framework.TestCase.runTest(TestCase.java:154)
at junit.framework.TestCase.runBare(TestCase.java:127)
at junit.framework.TestResult$1.protect(TestResult.java:106)
at junit.framework.TestResult.runProtected(TestResult.java:124)
at junit.framework.TestResult.run(TestResult.java:109)
at junit.framework.TestCase.run(TestCase.java:118)
at junit.framework.TestSuite.runTest(TestSuite.java:208)
at junit.framework.TestSuite.run(TestSuite.java:203)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
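The trace shows the exception surfacing during type detection (Tika.detect calling into ZipContainerDetector.detectIWork), before any parsing starts. When preparing a small sample file for a bug report, it can help to note which entries the document's zip container actually holds. The following is a minimal sketch that uses only java.util.zip and does not call Tika at all. The class name IWorkEntryCheck, the method findIndexEntry, and the list of entry names (index.apxl, index.xml, presentation.apxl) are assumptions made for illustration; the sketch does not reproduce Tika's detection logic and does not explain the NullPointerException. It assumes Java 7 or later for try-with-resources.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Tika: reports whether a zip stream
// contains one of the entry names an iWork document is assumed to carry.
class IWorkEntryCheck {

    // Assumption: candidate names of the main content entry in an iWork zip.
    static final List<String> INDEX_ENTRIES =
            Arrays.asList("index.apxl", "index.xml", "presentation.apxl");

    // Returns the first matching entry name, or null if none is present.
    static String findIndexEntry(InputStream in) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (INDEX_ENTRIES.contains(entry.getName())) {
                    return entry.getName();
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java IWorkEntryCheck <file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String found = findIndexEntry(in);
            System.out.println(found == null
                    ? "no known index entry found"
                    : "index entry: " + found);
        }
    }
}
```

Running this against each of the Keynote, Pages and Numbers files and including the output in the report gives whoever picks up the issue a quick view of the container layout without opening the attachments.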
Re: Apple iWork document parsing
Posted by Arthur Meneau <am...@xetus.com>.
Nick,
This is done. The files I used originally were very small test files, so I included all three so that you can test Keynote, Pages, and Numbers.
Thanks for the quick response,
-Arthur
On Dec 5, 2011, at 5:02 PM, Nick Burch wrote:
> On Mon, 5 Dec 2011, Arthur Meneau wrote:
>> I am having trouble parsing iWork documents with Tika 1.0. These documents are being saved with the appropriate versions specified by Tika's API (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copy and pasted the error I am receiving below. How can I get iWork documents to correctly parse?
>
> Any chance that you could create a new issue in JIRA, and upload a small sample file that causes the error? (Ideally the smallest file you can create that gives the problem)
>
> Cheers
> Nick
Re: Apple iWork document parsing
Posted by Nick Burch <ni...@alfresco.com>.
On Mon, 5 Dec 2011, Arthur Meneau wrote:
> I am having trouble parsing iWork documents with Tika 1.0. These
> documents are being saved with the appropriate versions specified by
> Tika's API (Keynote 5.1.1, Numbers 2.1, Pages 4.1). I have copy and
> pasted the error I am receiving below. How can I get iWork documents to
> correctly parse?
Any chance that you could create a new issue in JIRA, and upload a small
sample file that causes the error? (Ideally the smallest file you can
create that gives the problem)
Cheers
Nick