Posted to dev@pdfbox.apache.org by Andreas Lehmkuehler <an...@lehmi.de> on 2021/05/25 06:20:00 UTC
2.0.24 Release?
Hi,
how about cutting a 2.0.24 release about 2 weeks from now?
Quite a few tickets have been resolved already, and our friends from Tika are
interested in a new version as well, so that they can cut a new release of
their own that includes our latest "stuff".
Cheers
Andreas
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org
Re: 2.0.24 Release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,
just a friendly reminder: I'm going to cut the release about 10-12 hours from
now, if nobody objects ;-)
Andreas
On 25.05.21 at 08:20, Andreas Lehmkuehler wrote:
> Hi,
>
> how about cutting a 2.0.24 release in about 2 weeks from now?
>
> There is already an amount of solved tickets and our friends from Tika are
> interested in a new version as well to cut a new release too including our
> latest "stuff".
>
> Cheers
> Andreas
>
Re: 2.0.24 Release?
Posted by Tim Allison <ta...@apache.org>.
+1 :)
On Tue, May 25, 2021 at 2:20 AM Andreas Lehmkuehler <an...@lehmi.de>
wrote:
> Hi,
>
> how about cutting a 2.0.24 release in about 2 weeks from now?
>
> There is already an amount of solved tickets and our friends from Tika are
> interested in a new version as well to cut a new release too including our
> latest "stuff".
>
> Cheers
> Andreas
>
Re: 2.0.24 Release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 05.06.21 at 20:09, Tilman Hausherr wrote:
> Thanks!
>
> I created one issue (PDFBOX-5207) but I don't consider this a blocker.
>
> The other files where column T has text have troubles related to matrix
> multiplication. I suspect that some parser changes produce larger numbers than
> before.
>
> The file
> bug_trackers/poppler/poppler-84988-0.zip-3.pdf
> has a different problem but I suspect it is related:
> /MediaBox [0 170141183460469231731687303715884105728 612 792]
>
> in 2.0.23 rendering worked (it seems the number was skipped and then the
> rectangle ignored), but in 2.0.24 it doesn't.
This is related to PDFBOX-5176, which changes the behaviour of the parser when it
comes to numerically valid but out-of-range values.
>
> Tilman
>
Re: 2.0.24 Release?
Posted by Tilman Hausherr <TH...@t-online.de>.
Thanks!
I created one issue (PDFBOX-5207) but I don't consider this a blocker.
The other files that have text in column T have problems related to matrix
multiplication. I suspect that some parser changes produce larger
numbers than before.
The file
bug_trackers/poppler/poppler-84988-0.zip-3.pdf
has a different problem but I suspect it is related:
/MediaBox [0 170141183460469231731687303715884105728 612 792]
In 2.0.23 rendering worked (it seems the number was skipped and then the
rectangle ignored), but in 2.0.24 it doesn't.
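To make the size of that number concrete, here is a minimal sketch in plain
Java. It uses only the JDK and no PDFBox classes, so it says nothing about how
the PDFBox parser is actually implemented. The class and method names
(MediaBoxRangeSketch, isTwoToThe127, fitsInLong, boxHeight) are hypothetical,
and the sketch assumes the usual /MediaBox order of lower-left x, lower-left y,
upper-right x, upper-right y.

```java
import java.math.BigInteger;

class MediaBoxRangeSketch {
    // Second /MediaBox entry of the file mentioned above, copied verbatim.
    static final String HUGE = "170141183460469231731687303715884105728";

    // True if the token is exactly 2^127.
    static boolean isTwoToThe127(String token) {
        return new BigInteger(token).equals(BigInteger.ONE.shiftLeft(127));
    }

    // True if the token can be stored in a Java long without overflow.
    static boolean fitsInLong(String token) {
        try {
            Long.parseLong(token);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Height of a box given its lower-left y and upper-right y as raw tokens
    // (assumed order; both tokens are read as floats).
    static float boxHeight(String lowerLeftY, String upperRightY) {
        return Float.parseFloat(upperRightY) - Float.parseFloat(lowerLeftY);
    }

    public static void main(String[] args) {
        System.out.println("is 2^127:        " + isTwoToThe127(HUGE));
        System.out.println("fits in long:    " + fitsInLong(HUGE));
        System.out.println("finite as float: " + !Float.isInfinite(Float.parseFloat(HUGE)));
        System.out.println("box height:      " + boxHeight(HUGE, "792"));
    }
}
```

The value is too large for a long but is still a finite float, so a box built
from it ends up with a hugely negative height. Whether this is what trips up
rendering in 2.0.24 is only a guess on the basis of the numbers.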
Tilman
On 03.06.2021 at 14:24, Tim Allison wrote:
> Reports are here:
> https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz
>
> No new exceptions. Content looks better by a tiny amount. There are a
> few files with some apparent regressions, but overall, the diffs are
> negligible.
Re: 2.0.24 Release?
Posted by Tim Allison <ta...@apache.org>.
Reports are here:
https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz
No new exceptions. Content looks better by a tiny amount. There are a
few files with some apparent regressions, but overall, the diffs are
negligible.
Let me know if you have questions.
Best,
Tim
On Mon, May 31, 2021 at 2:20 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
>
> On 30.05.21 at 20:13, Tim Allison wrote:
> > Will kick off tests on Tuesday, June 1 unless there are other text
> > extraction changes planned.
> Cool, I'm currently working on some 3.0 tickets so no interference from my side.
>
> Andreas
Re: 2.0.24 Release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 30.05.21 at 20:13, Tim Allison wrote:
> Will kick off tests on Tuesday, June 1 unless there are other text
> extraction changes planned.
Cool, I'm currently working on some 3.0 tickets so no interference from my side.
Andreas
Re: 2.0.24 Release?
Posted by Tim Allison <ta...@apache.org>.
Will kick off tests on Tuesday, June 1 unless there are other text
extraction changes planned.
On Sun, May 30, 2021 at 12:07 PM Andreas Lehmkuehler <an...@lehmi.de>
wrote:
> I'm targeting the 7th or 8th of June.
>
> @Tim, @Tilman, is there any chance to run a 2.0.23 vs. 2.0.24 comparison
> first?
>
> Andreas
>
Re: 2.0.24 Release?
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
I'm targeting the 7th or 8th of June.
@Tim, @Tilman, is there any chance to run a 2.0.23 vs. 2.0.24 comparison first?
Andreas
On 26.05.21 at 21:20, Tilman Hausherr wrote:
> +1
>
> Tilman
>
Re: 2.0.24 Release?
Posted by Tilman Hausherr <TH...@t-online.de>.
+1
Tilman
On 25.05.2021 at 08:20, Andreas Lehmkuehler wrote:
> Hi,
>
> how about cutting a 2.0.24 release in about 2 weeks from now?
>
> There is already an amount of solved tickets and our friends from Tika
> are interested in a new version as well to cut a new release too
> including our latest "stuff".
>
> Cheers
> Andreas
>