Posted to dev@tika.apache.org by Nick Burch <ni...@apache.org> on 2014/10/29 21:39:47 UTC

PDF test failing on trunk

Hi All

Just tried to build trunk, and got a test failure:

Tests in error:
   testSequentialParser(org.apache.tika.parser.pdf.PDFParserTest): Unable to extract PDF content

Tests run: 547, Failures: 0, Errors: 1, Skipped: 7


The exception in the log is:

Caused by: java.io.IOException: javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
         at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:115)
         at javax.crypto.CipherInputStream.read(CipherInputStream.java:236)
         at javax.crypto.CipherInputStream.read(CipherInputStream.java:212)
         at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:316)
         at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptStream(SecurityHandler.java:421)
         at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decrypt(SecurityHandler.java:390)
         at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptObject(SecurityHandler.java:365)
         at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.proceedDecryption(SecurityHandler.java:196)
         at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:158)
         at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1595)

Caused by: javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
         at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:750)
         at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:676)
         at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:423)
         at javax.crypto.Cipher.doFinal(Cipher.java:1708)
         at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:112)


Is anyone else seeing this one?

Nick
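
For anyone following along, here is a minimal sketch of how this exception message can arise in plain JCE code, independent of PDFBox or Tika. It assumes an all-zero demo key and IV and a made-up class name (BlockSizeDemo); it only illustrates that a padded AES cipher rejects input whose length is not a multiple of 16, and says nothing about why the test PDF triggers it.

```java
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;

public class BlockSizeDemo {

    // Builds an AES/CBC cipher with PKCS5 padding. The all-zero key and IV
    // are demo values only (an assumption for this sketch, never for real use).
    static Cipher newCipher(int mode) {
        try {
            Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
            cipher.init(mode,
                    new SecretKeySpec(new byte[16], "AES"),
                    new IvParameterSpec(new byte[16]));
            return cipher;
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts the given bytes; the output length is a multiple of 16.
    static byte[] encrypt(byte[] plainText) {
        try {
            return newCipher(Cipher.ENCRYPT_MODE).doFinal(plainText);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns "ok" if decryption succeeds, otherwise the simple class name
    // of the security exception that was thrown.
    static String decryptOutcome(byte[] cipherText) {
        try {
            newCipher(Cipher.DECRYPT_MODE).doFinal(cipherText);
            return "ok";
        } catch (GeneralSecurityException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        byte[] full = encrypt("hello".getBytes(StandardCharsets.UTF_8));
        // Drop the last byte so the length is no longer a multiple of 16.
        byte[] truncated = Arrays.copyOf(full, full.length - 1);

        System.out.println(decryptOutcome(full));
        System.out.println(decryptOutcome(truncated));
    }
}
```

In the trace above the same exception surfaces through CipherInputStream, which wraps it in an IOException, rather than from a direct doFinal call as in this sketch.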

RE: PDF test failing on trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Dunno about that, but happy to. It looks like your trunk still has testPDF_acroform.pdf hanging around... I removed that a while ago, and there should only be a testPDF_acroform3.pdf. I wonder if that's why Jenkins and I aren't seeing the problem.

Will dig out testPDF_acroform.pdf from svn and give it a test on my version of 1.6.




-----Original Message-----
From: Nick Burch [mailto:apache@gagravarr.org] 
Sent: Thursday, October 30, 2014 12:01 PM
To: dev@tika.apache.org
Subject: RE: PDF test failing on trunk

On Thu, 30 Oct 2014, Allison, Timothy B. wrote:
> I think so.  Would you like the honors?

You're more of a pdf expert than I am, so maybe you'd be best :)

Nick

RE: PDF test failing on trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
I think so.  Would you like the honors?

-----Original Message-----
From: Nick Burch [mailto:apache@gagravarr.org] 
Sent: Thursday, October 30, 2014 9:23 AM
To: dev@tika.apache.org
Subject: RE: PDF test failing on trunk

On Thu, 30 Oct 2014, Allison, Timothy B. wrote:
> Ha.  Works with an older version of 1.6:
>
> java version "1.6.0_30"
> OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
> OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)

Joy. Full stack trace below; maybe one that needs reporting to PDFBox?

Nick

-------------------------------------------------------------------------------
Test set: org.apache.tika.parser.pdf.PDFParserTest
-------------------------------------------------------------------------------
Tests run: 29, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 13.611 sec <<< FAILURE!
testSequentialParser(org.apache.tika.parser.pdf.PDFParserTest)  Time elapsed: 3.428 sec  <<< ERROR!
org.apache.tika.exception.TikaException: Sequential Parser failed on test file /home/nick/java/apache-tika/tika-parsers/target/test-classes/test-documents/testPDF_acroForm.pdf
 	at org.apache.tika.parser.pdf.PDFParserTest.testSequentialParser(PDFParserTest.java:589)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 	at java.lang.reflect.Method.invoke(Method.java:622)
 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 	at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271)
 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70)
 	at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50)
 	at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238)
 	at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63)
 	at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236)
 	at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53)
 	at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229)
 	at org.junit.runners.ParentRunner.run(ParentRunner.java:309)
 	at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
 	at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
 	at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 	at java.lang.reflect.Method.invoke(Method.java:622)
 	at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
 	at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
 	at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
 	at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
 	at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)
Caused by: org.apache.tika.exception.TikaException: Unable to extract PDF content
 	at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:150)
 	at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:160)
 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247)
 	at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:247)
 	at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:120)
 	at org.apache.tika.TikaTest.getText(TikaTest.java:127)
 	at org.apache.tika.parser.pdf.PDFParserTest.testSequentialParser(PDFParserTest.java:586)
 	... 29 more
Caused by: java.io.IOException: javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
 	at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:115)
 	at javax.crypto.CipherInputStream.read(CipherInputStream.java:236)
 	at javax.crypto.CipherInputStream.read(CipherInputStream.java:212)
 	at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:316)
 	at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptStream(SecurityHandler.java:421)
 	at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decrypt(SecurityHandler.java:390)
 	at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptObject(SecurityHandler.java:365)
 	at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.proceedDecryption(SecurityHandler.java:196)
 	at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:158)
 	at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1595)
 	at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:942)
 	at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:337)
 	at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:134)
 	... 35 more
Caused by: javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
 	at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:750)
 	at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:676)
 	at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:423)
 	at javax.crypto.Cipher.doFinal(Cipher.java:1708)
 	at javax.crypto.CipherInputStream.getMoreData(CipherInputStream.java:112)
 	... 47 more

RE: PDF test failing on trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Ha.  Works with an older version of 1.6:

java version "1.6.0_30"
OpenJDK Runtime Environment (IcedTea6 1.13.1) (rhel-3.1.13.1.el6_5-x86_64)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)


-----Original Message-----
From: Nick Burch [mailto:apache@gagravarr.org] 
Sent: Thursday, October 30, 2014 9:00 AM
To: dev@tika.apache.org
Subject: RE: PDF test failing on trunk

On Thu, 30 Oct 2014, Allison, Timothy B. wrote:
> The build is working for me on linux and Windows with Java 1.7.  Can 
> you tell which file is causing the problem?  I wonder if the upgrade to 
> PDFBox 1.8.7 caused the issue?

I've just tried with Java 7, and that passes!

The JVM it's failing on is:
java version "1.6.0_33"
OpenJDK Runtime Environment (IcedTea6 1.13.5) (6b33-1.13.5-1ubuntu0.12.04)
OpenJDK 64-Bit Server VM (build 23.25-b01, mixed mode)

I've added a catch/rethrow to the test to report which file failed, and
it reports testPDF_acroForm.pdf

Nick
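
A rough sketch of the catch-and-rethrow idea mentioned above, so that a failure names the file that caused it. This is not the actual change made to PDFParserTest; the class name FailingFileReporter, the DocumentParser interface, and the file names in main are all hypothetical stand-ins.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class FailingFileReporter {

    // Hypothetical stand-in for whatever parses a single test document.
    public interface DocumentParser {
        String parse(String fileName) throws Exception;
    }

    // Parses each file in turn; on failure, rethrows with the file name in
    // the message and the original exception kept as the cause.
    public static void parseAll(List<String> fileNames, DocumentParser parser) {
        for (String fileName : fileNames) {
            try {
                parser.parse(fileName);
            } catch (Exception e) {
                throw new IllegalStateException(
                        "Parser failed on test file " + fileName, e);
            }
        }
    }

    public static void main(String[] args) {
        // Made-up file names; the second one simulates a parse failure.
        List<String> files = Arrays.asList("good.pdf", "bad.pdf");
        try {
            parseAll(files, name -> {
                if (name.startsWith("bad")) {
                    throw new IOException("simulated parse error");
                }
                return "text of " + name;
            });
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage() + " (cause: " + e.getCause() + ")");
        }
    }
}
```

Keeping the original exception as the cause preserves the full "Caused by" chain in the test report, while the outer message identifies the document.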

RE: PDF test failing on trunk

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Hi Nick,
  The build is working for me on Linux and Windows with Java 1.7. Can you tell which file is causing the problem? I wonder if the upgrade to PDFBox 1.8.7 caused the issue?
