Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (Jira)" <ji...@apache.org> on 2020/03/31 10:16:00 UTC

[jira] [Created] (PDFBOX-4805) Regression in 2.0.19

Andreas Lehmkühler created PDFBOX-4805:
------------------------------------------

             Summary: Regression in 2.0.19
                 Key: PDFBOX-4805
                 URL: https://issues.apache.org/jira/browse/PDFBOX-4805
             Project: PDFBox
          Issue Type: Improvement
          Components: Text extraction
    Affects Versions: 2.0.19
            Reporter: Andreas Lehmkühler
            Assignee: Andreas Lehmkühler
             Fix For: 2.0.20


Joel Hirsh reported a regression with PDFTextStripper which was introduced with 2.0.19, see his post on [users@|https://lists.apache.org/thread.html/r35b50f5b00a39dcf6e77637e2ff2e097f26c395628ae476ab37b344a%40%3Cusers.pdfbox.apache.org%3E] for details.

He can't share the PDF in question for privacy reasons, but he did some debugging and found that PDFBOX-4760 is the cause of the regression. I accidentally committed some [unrelated code|https://svn.apache.org/r1873653] which leads to bad text extraction results. As the code targets some corner cases, it didn't come up as an issue when running our pre-release tests. The issue is limited to the 2.0 trunk.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org