Posted to users@pdfbox.apache.org by Aaron Kaplan <li...@aaronkaplan.info> on 2009/11/10 20:15:04 UTC

Unit tests

I checked out pdfbox 0.8.0, built it with ant, and ran the tests.  Six 
of them are failing:

Failed tests:
   testExtract(org.apache.pdfbox.util.TestTextStripper)
   testRenderImage(org.apache.pdfbox.util.TestPDFToImage)

Tests in error:
   testProtectionError(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
   testProtection(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
   testMultipleRecipients(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
   testParsingTroublePDFs(org.apache.pdfbox.pdfparser.TestPDFParser)


I looked at the output of TestTextStripper, and most of the differences 
involve the glyph names circlecopyrt, angbracketleft, and 
angbracketright, which were removed from glyphlist.txt in this commit:

http://svn.apache.org/viewvc?view=revision&revision=793058

So my first question: how should these glyphs be resolved now that 
they're no longer in glyphlist.txt, or do the tests need to be updated?
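
For anyone who wants to confirm which names are missing from their own checkout, here is a rough sketch of the check. It assumes glyphlist.txt follows the usual Adobe Glyph List layout ("name;hexcode" per line, "#" for comments); the class name GlyphListCheck and the command-line argument are made up for illustration and are not part of pdfbox.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of pdfbox.
class GlyphListCheck {

    // Collect glyph names from lines of the assumed form "name;hexcode".
    static Set<String> glyphNames(List<String> lines) {
        Set<String> names = new HashSet<>();
        for (String line : lines) {
            String t = line.trim();
            if (t.isEmpty() || t.startsWith("#")) {
                continue; // skip blank lines and comments
            }
            int semi = t.indexOf(';');
            if (semi > 0) {
                names.add(t.substring(0, semi));
            }
        }
        return names;
    }

    // Return the wanted names that are absent from the known set, in order.
    static List<String> missing(Set<String> known, String... wanted) {
        List<String> result = new ArrayList<>();
        for (String name : wanted) {
            if (!known.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to a glyph list file (location depends on your checkout).
        // ISO-8859-1 maps every byte to a character, so decoding cannot fail.
        List<String> lines =
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        System.out.println("missing: " + missing(glyphNames(lines),
                "circlecopyrt", "angbracketleft", "angbracketright"));
    }
}
```

Run it with the path to glyphlist.txt as the only argument; any of the three names it prints are not defined in that file.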

The remaining errors in TestTextStripper are all in the file 
solidconvertor.pdf.  The expected output file appears to be in UTF-16, 
but the actual output file is a strange mixture of UTF-8 and corrupt 
UTF-16.  Second question: any idea why a corrupt output file is being 
generated?
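
To make "mixture" a bit more concrete, a crude check like the following can show where in a file the encoding seems to change. This is only a heuristic sketch: it looks for a byte order mark and otherwise counts NUL bytes at even and odd offsets, which works for mostly-Latin text and little else. The class name EncodingSniffer, the 512-byte chunk size, and the result labels are my own choices, not anything from pdfbox.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;

// Hypothetical diagnostic, not part of pdfbox.
class EncodingSniffer {

    // Guess the encoding of a byte range from its BOM or its NUL-byte pattern.
    static String guessEncoding(byte[] b) {
        if (b.length >= 3 && (b[0] & 0xFF) == 0xEF
                && (b[1] & 0xFF) == 0xBB && (b[2] & 0xFF) == 0xBF) {
            return "UTF-8-BOM";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFE && (b[1] & 0xFF) == 0xFF) {
            return "UTF-16BE-BOM";
        }
        if (b.length >= 2 && (b[0] & 0xFF) == 0xFF && (b[1] & 0xFF) == 0xFE) {
            return "UTF-16LE-BOM";
        }
        // No BOM: Latin text in UTF-16 has a NUL in every other byte.
        int evenZeros = 0;
        int oddZeros = 0;
        for (int i = 0; i < b.length; i++) {
            if (b[i] == 0) {
                if (i % 2 == 0) evenZeros++; else oddZeros++;
            }
        }
        if (evenZeros > b.length / 4) return "UTF-16BE?";
        if (oddZeros > b.length / 4) return "UTF-16LE?";
        return "8-bit/UTF-8?";
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the output file to inspect.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("whole file: " + guessEncoding(data));
        // An even chunk size keeps the even/odd byte parity aligned per chunk.
        for (int off = 0; off < data.length; off += 512) {
            byte[] chunk = Arrays.copyOfRange(data, off, Math.min(off + 512, data.length));
            System.out.println(off + ": " + guessEncoding(chunk));
        }
    }
}
```

If the per-chunk guesses flip between the 8-bit and UTF-16 labels partway through the file, that would at least narrow down where the output goes wrong.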

I also looked into TestPDFParser and the problem was a missing input 
file.  I gather from an old mailing list post that it was removed 
because of copyright problems.

By this point I was getting the impression that these tests weren't 
intended for me to run, so I didn't bother trying to figure out what was 
going wrong in the other cases.  My third question: is it expected that 
the tests I listed above fail, or are there any that I should look into 
as potential indicators of bugs?

Thanks
-Aaron