Posted to dev@pdfbox.apache.org by Andreas Lehmkuehler <an...@lehmi.de> on 2021/05/25 06:20:00 UTC

2.0.24 Release?

Hi,

How about cutting a 2.0.24 release about 2 weeks from now?

A fair number of tickets have already been resolved, and our friends from Tika 
are interested in a new version as well, so that they can cut a new release 
that includes our latest "stuff".

Cheers
Andreas

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


Re: 2.0.24 Release?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

Just a friendly reminder: I'm going to cut the release in about 10-12 hours, 
if nobody objects ;-)

Andreas


Re: 2.0.24 Release?

Posted by Tim Allison <ta...@apache.org>.
+1 :)


Re: 2.0.24 Release?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 05.06.21 at 20:09, Tilman Hausherr wrote:
> Thanks!
> 
> I created one issue (PDFBOX-5207) but I don't consider this a blocker.
> 
> The other files where column T has text have problems related to matrix 
> multiplication. I suspect that some parser changes produce larger numbers than 
> before.
> 
> The file
> bug_trackers/poppler/poppler-84988-0.zip-3.pdf
> has a different problem but I suspect it is related:
> /MediaBox [0 170141183460469231731687303715884105728 612 792]
> 
> In 2.0.23 rendering worked (it seems the number was skipped and then the 
> rectangle ignored), but in 2.0.24 it doesn't.
This is related to PDFBOX-5176, which changes the behaviour of the parser when it 
comes to numerically valid but out-of-range values.

> 
> Tilman
> 
> On 03.06.2021 at 14:24, Tim Allison wrote:
>> Reports are here:
>> https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz
>>
>> No new exceptions. Content looks better by a tiny amount.  There are a
>> few files with some apparent regressions, but overall, the diffs are
>> negligible.


Re: 2.0.24 Release?

Posted by Tilman Hausherr <TH...@t-online.de>.
Thanks!

I created one issue (PDFBOX-5207) but I don't consider this a blocker.

The other files where column T has text have problems related to matrix 
multiplication. I suspect that some parser changes produce larger 
numbers than before.

The file
bug_trackers/poppler/poppler-84988-0.zip-3.pdf
has a different problem but I suspect it is related:
/MediaBox [0 170141183460469231731687303715884105728 612 792]

In 2.0.23 rendering worked (it seems the number was skipped and then the 
rectangle ignored), but in 2.0.24 it doesn't.
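
To make the size of that second MediaBox entry concrete, here is a minimal 
Java sketch. It is not PDFBox code and does not reproduce the parser's actual 
logic; the class and method names are made up for illustration. It only shows 
that the value equals 2^127, so it is far too large for a long yet still 
parses to a finite float.

```java
import java.math.BigInteger;

// Hypothetical demo class, not part of PDFBox.
class MediaBoxNumberDemo {
    // The second MediaBox entry from the poppler test file.
    static final String HUGE = "170141183460469231731687303715884105728";

    // True if the token can be represented as a Java long.
    static boolean fitsInLong(String token) {
        try {
            Long.parseLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // True if the token parses to a float that is neither infinite nor NaN.
    static boolean isFiniteFloat(String token) {
        float f = Float.parseFloat(token);
        return !Float.isInfinite(f) && !Float.isNaN(f);
    }

    // True if the token is exactly 2^127.
    static boolean isTwoToThe127(String token) {
        return new BigInteger(token).equals(BigInteger.ONE.shiftLeft(127));
    }

    public static void main(String[] args) {
        System.out.println("fits in long:  " + fitsInLong(HUGE));    // false
        System.out.println("finite float:  " + isFiniteFloat(HUGE)); // true
        System.out.println("equals 2^127:  " + isTwoToThe127(HUGE)); // true
    }
}
```

So a parser that tries an integer type first has to decide what to do with 
such a token, which may be where the difference between 2.0.23 and 2.0.24 
comes from.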

Tilman

On 03.06.2021 at 14:24, Tim Allison wrote:
> Reports are here:
> https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz
>
> No new exceptions. Content looks better by a tiny amount.  There are a
> few files with some apparent regressions, but overall, the diffs are
> negligible.


Re: 2.0.24 Release?

Posted by Tim Allison <ta...@apache.org>.
Reports are here:
https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz

No new exceptions. Content looks better by a tiny amount.  There are a
few files with some apparent regressions, but overall, the diffs are
negligible.

Let me know if you have questions.

Best,

           Tim


Re: 2.0.24 Release?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 30.05.21 at 20:13, Tim Allison wrote:
> Will kick off tests on Tuesday, June 1 unless there are other text
> extraction changes planned.
Cool, I'm currently working on some 3.0 tickets, so there will be no interference from my side.

Andreas


Re: 2.0.24 Release?

Posted by Tim Allison <ta...@apache.org>.
Will kick off tests on Tuesday, June 1 unless there are other text
extraction changes planned.


Re: 2.0.24 Release?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
I'm targeting the 7th or 8th of June.

@Tim, @Tilman, is there any chance of running a 2.0.23 vs. 2.0.24 comparison first?

Andreas

On 26.05.21 at 21:20, Tilman Hausherr wrote:
> +1
> 
> Tilman


Re: 2.0.24 Release?

Posted by Tilman Hausherr <TH...@t-online.de>.
+1

Tilman
