Posted to users@pdfbox.apache.org by Aaron Kaplan <li...@aaronkaplan.info> on 2009/11/10 20:15:04 UTC
Unit tests
I checked out pdfbox 0.8.0, built it with ant, and ran the tests. Six
of them are failing:
Failed tests:
testExtract(org.apache.pdfbox.util.TestTextStripper)
testRenderImage(org.apache.pdfbox.util.TestPDFToImage)
Tests in error:
testProtectionError(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
testProtection(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
testMultipleRecipients(org.apache.pdfbox.encryption.TestPublicKeyEncryption)
testParsingTroublePDFs(org.apache.pdfbox.pdfparser.TestPDFParser)
I looked at the output of TestTextStripper, and most of the differences
involve the glyph names circlecopyrt, angbracketleft, and
angbracketright, which were removed from glyphlist.txt in this commit:
http://svn.apache.org/viewvc?view=revision&revision=793058
So my first question: how should these glyphs be resolved now that
they're no longer in glyphlist.txt? Or do the tests need to be updated?
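For anyone who wants to reproduce the check, here is a minimal sketch of how the missing names could be confirmed. It assumes glyphlist.txt uses the usual "name;codepoint" line format with "#" comment lines. The class GlyphListCheck and its methods are hypothetical helpers written for illustration, not part of PDFBox.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (not a PDFBox class): reports which glyph names
// are absent from a glyph list in "name;codepoint" format.
public class GlyphListCheck {

    // Collect the glyph names, skipping blank lines and '#' comments.
    public static Set<String> parseGlyphNames(List<String> lines) {
        Set<String> names = new HashSet<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            int semi = trimmed.indexOf(';');
            if (semi > 0) {
                names.add(trimmed.substring(0, semi));
            }
        }
        return names;
    }

    // Return the wanted names that the glyph list does not define.
    public static List<String> missing(Set<String> known, List<String> wanted) {
        List<String> result = new ArrayList<>();
        for (String name : wanted) {
            if (!known.contains(name)) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java GlyphListCheck <path-to-glyphlist.txt>");
            return;
        }
        List<String> lines =
            Files.readAllLines(Paths.get(args[0]), StandardCharsets.ISO_8859_1);
        List<String> wanted =
            Arrays.asList("circlecopyrt", "angbracketleft", "angbracketright");
        System.out.println("Missing: " + missing(parseGlyphNames(lines), wanted));
    }
}
```

Running it against the glyphlist.txt from before and after the commit should show whether those three names are the only ones that disappeared.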
The remaining errors in TestTextStripper are all in the file
solidconvertor.pdf. The expected output file appears to be in UTF-16,
but the actual output file is a strange mixture of UTF-8 and corrupt
UTF-16. Second question: any idea why a corrupt output file is being
generated?
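A rough way to compare the two files' encodings is to look at the byte-order mark and count NUL bytes, since UTF-16 text that is mostly Latin characters has a NUL in roughly every other byte while UTF-8 has none. This is only a heuristic sketch. The class BomSniffer is a hypothetical helper, not something in the PDFBox test suite.

```java
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper: inspects raw bytes for a BOM and NULs.
public class BomSniffer {

    // Identify a leading byte-order mark, or "none" if there isn't one.
    public static String detectBom(byte[] head) {
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return "UTF-8";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFE && (head[1] & 0xFF) == 0xFF) {
            return "UTF-16BE";
        }
        if (head.length >= 2 && (head[0] & 0xFF) == 0xFF && (head[1] & 0xFF) == 0xFE) {
            return "UTF-16LE";
        }
        return "none";
    }

    // Count NUL bytes; a high share hints at UTF-16 for Latin-heavy text.
    public static int countNulBytes(byte[] data) {
        int count = 0;
        for (byte b : data) {
            if (b == 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.out.println("usage: java BomSniffer <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("BOM: " + detectBom(data));
        System.out.println("NUL bytes: " + countNulBytes(data) + " of " + data.length);
    }
}
```

If the expected file reports a UTF-16 BOM with about half its bytes NUL, and the actual file shows a much lower NUL share, that would be consistent with part of the output having been written in a different encoding.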
I also looked into TestPDFParser and the problem was a missing input
file. I gather from an old mailing list post that it was removed
because of copyright problems.
By this point I was getting the impression that these tests weren't
intended for me to run, so I didn't bother trying to figure out what was
going wrong in the other cases. My third question: is it expected that
the tests I listed above fail, or are there any that I should look into
as potential indicators of bugs?
Thanks
-Aaron