Posted to users@pdfbox.apache.org by Joel Hirsh <jo...@gmail.com> on 2017/01/08 02:56:46 UTC

Fix for (PDFBOX-3447) is causing file that worked previously to now fail

I have files from two different sources that used to work fine on 2.0.2 (or
at least they appear to work fine) and all the text could be extracted.

I just started testing with 2.0.4 and am getting an exception from:
    at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:320)
    at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:194)

Tracing it down, the cause appears to be the fix for issue 3447, and the comment
there says it needs a better idea.  Since the fix is causing a regression, and
there is no way for my code to work around it, can there be a better solution
that maintains the capability from 2.0.2?

Re: Fix for (PDFBOX-3447) is causing file that worked previously to now fail

Posted by Tilman Hausherr <TH...@t-online.de>.

The best approach would be to create a new issue and attach one of your
files.

Tilman

On 08.01.2017 at 15:58, Joel Hirsh wrote:
> Yes, I meant to type 3446.
>
> Sorry about introducing that confusion.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org

Re: Fix for (PDFBOX-3447) is causing file that worked previously to now fail

Posted by Joel Hirsh <jo...@gmail.com>.

Yes, I meant to type 3446.

Sorry about introducing that confusion.

On Sun, Jan 8, 2017 at 5:57 AM, Andreas Lehmkuehler <an...@lehmi.de>
wrote:

> Joel, did you mix up the numbers? It looks like you are referring to
> PDFBOX-3446, aren't you?

Re: Fix for (PDFBOX-3447) is causing file that worked previously to now fail

Posted by Andreas Lehmkuehler <an...@lehmi.de>.

On 08.01.2017 at 09:41, Tilman Hausherr wrote:
> PDFBOX-3447 is about pattern XStep/YStep, not text extraction.
Joel, did you mix up the numbers? It looks like you are referring to
PDFBOX-3446, aren't you?


BR
Andreas



Re: Fix for (PDFBOX-3447) is causing file that worked previously to now fail

Posted by Tilman Hausherr <TH...@t-online.de>.

On 08.01.2017 at 03:56, Joel Hirsh wrote:
> I have files from two different sources that used to work fine on 2.0.2 (or
> at least they appear to work fine) and all the text could be extracted.
>
> I just started testing with 2.0.4 and am getting an exception from:
>     at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:320)
>     at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:194)
>
> Tracing it down, the cause appears to be the fix for issue 3447, and the comment
> there says it needs a better idea.  Since the fix is causing a regression, and
> there is no way for my code to work around it, can there be a better solution
> that maintains the capability from 2.0.2?
>

PDFBOX-3447 is about pattern XStep/YStep, not text extraction.


Tilman

