Posted to dev@pdfbox.apache.org by "Brian Carrier (JIRA)" <ji...@apache.org> on 2008/11/18 15:37:44 UTC
[jira] Commented: (PDFBOX-388) Store expected test output as UTF-8 text files with native line endings
[ https://issues.apache.org/jira/browse/PDFBOX-388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648612#action_12648612 ]
Brian Carrier commented on PDFBOX-388:
--------------------------------------
I agree that the current setup makes it difficult to debug and review failures. Another approach that I was looking at (but have not yet tried) is to drop in something like TextDiff so that the regression tests have a more intelligent diffing process.
http://www.surfscranton.com/Architecture/TextDiff.htm
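To make the idea of a more readable comparison concrete, here is a minimal sketch. It is not TextDiff and does not use its API; the class name LineDiffSketch and the method diffLines are hypothetical, and a modern JDK (Java 8 or later, for the \R line-break matcher) is assumed. It only compares lines position by position, so an inserted line makes every later line show up as changed, which is exactly what a real diff algorithm would avoid.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox or TextDiff.
public class LineDiffSketch {

    /** Returns one message per differing line; an empty list means the texts match. */
    public static List<String> diffLines(String expected, String actual) {
        // Split on any line terminator so CRLF and LF content compare equal.
        String[] exp = expected.split("\\R", -1);
        String[] act = actual.split("\\R", -1);

        List<String> report = new ArrayList<>();
        int max = Math.max(exp.length, act.length);
        for (int i = 0; i < max; i++) {
            // A side that has run out of lines is reported as <missing>.
            String e = i < exp.length ? exp[i] : "<missing>";
            String a = i < act.length ? act[i] : "<missing>";
            if (!e.equals(a)) {
                report.add("line " + (i + 1) + ": expected [" + e + "] but was [" + a + "]");
            }
        }
        return report;
    }
}
```

A regression test could then fail with the contents of the returned list instead of a bare "files differ" message.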
> Store expected test output as UTF-8 text files with native line endings
> -----------------------------------------------------------------------
>
> Key: PDFBOX-388
> URL: https://issues.apache.org/jira/browse/PDFBOX-388
> Project: PDFBox
> Issue Type: Improvement
> Components: Text extraction
> Reporter: Jukka Zitting
> Priority: Minor
>
> Currently the expected test output files in test/input are stored as UTF-16 files marked as application/octet-stream. This makes it hard to report or review changes to text extraction output.
> We could improve this by modifying the test suite to produce UTF-8 with native line endings and by updating the expected output files accordingly. Then any changes could be easily reported in patch format.
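As a rough illustration of the conversion proposed in the issue, the sketch below re-encodes one expected-output file from UTF-16 to UTF-8 and rewrites its line terminators to the platform's native one. The class name ExpectedOutputConverter and the method convert are hypothetical, and the code assumes a modern JDK (Java 8 or later) and files small enough to read into memory; it is not the change that was actually made to the PDFBox test suite.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.regex.Matcher;

// Hypothetical one-off converter for illustration only; not part of PDFBox.
public class ExpectedOutputConverter {

    public static void convert(Path utf16Source, Path utf8Target) throws IOException {
        // StandardCharsets.UTF_16 honors a byte-order mark and assumes
        // big-endian when the file has none.
        String text = new String(Files.readAllBytes(utf16Source), StandardCharsets.UTF_16);

        // Replace every line terminator with the platform's native one.
        String nativeEol = Matcher.quoteReplacement(System.lineSeparator());
        String normalized = text.replaceAll("\\R", nativeEol);

        Files.write(utf8Target, normalized.getBytes(StandardCharsets.UTF_8));
    }
}
```

Once the expected files are plain UTF-8 text, changes to extraction output can be reviewed with ordinary patch tools.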
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.