Posted to dev@pdfbox.apache.org by Andreas Lehmkuehler <an...@lehmi.de> on 2022/04/07 05:41:31 UTC
2.0.26 release
Hi,
Sorry for the delay. I'm planning to cut the 2.0.26 release next Saturday, the
day after tomorrow, if nobody objects.
Andreas
P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is out.
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org
Re: 2.0.26 release
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 17.04.22 at 20:25, Tilman Hausherr wrote:
> New regression test results at
>
> https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.25_vs_2.0.26.tar.xz
>
> IMHO we're fine now!
Thanks for the fast re-test!
I'm going to cut the 2.0.26 release tomorrow.
Andreas
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
New regression test results at
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.25_vs_2.0.26.tar.xz
IMHO we're fine now!
Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
On 15.04.2022 at 10:42, Tilman Hausherr wrote:
> On 14.04.2022 at 08:59, Tilman Hausherr wrote:
>>
>> I will rerun the tests myself on the long weekend, if I have the time.
>
>
> https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.25_vs_2.0.26.tar.xz
This time I looked at content_diffs_with_exceptions.xlsx. I thought it
covered the differences for PDFs that fail with exceptions, but I
suspect it is something different.
TOP_10_MORE_IN_A has 7 relevant entries, but I believe they share one
underlying problem because the language looks similar, so I created only
one issue.
Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
On 14.04.2022 at 08:59, Tilman Hausherr wrote:
>
> I will rerun the tests myself on the long weekend, if I have the time.
https://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.25_vs_2.0.26.tar.xz
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
On 14.04.2022 at 08:13, Andreas Lehmkuehler wrote:
> Cool, thanks for the feedback. I've set the ticket to resolved.
>
> Do we need to re-run the tests?
>
> BTW, what about PDFBOX-5394? Is there anything left to do? Do we have
> to wait for the feedback of the user?
I've set that one to resolved.
I will rerun the tests myself on the long weekend, if I have the time.
Tilman
Re: 2.0.26 release
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Cool, thanks for the feedback. I've set the ticket to resolved.
Do we need to re-run the tests?
BTW, what about PDFBOX-5394? Is there anything left to do? Do we have to wait
for the feedback of the user?
Andreas
On 13.04.22 at 08:29, Tilman Hausherr wrote:
> Yeah, PDFBOX-5413 fixes that one as well. 👍
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
Yeah, PDFBOX-5413 fixes that one as well. 👍
Tilman
On 12.04.2022 at 19:26, Tilman Hausherr wrote:
> Only one left: 7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M.pdf.
>
> There is some sort of problem with an incremental save: a part of the
> multi-content stream is missing / has a new object number. Let's wait
> and see whether it is related to PDFBOX-5413.
>
> (The other one, HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5.pdf, is an
> improvement; I'll add it to my own tests.)
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
Only one left: 7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M.pdf.
There is some sort of problem with an incremental save: a part of the
multi-content stream is missing / has a new object number. Let's wait
and see whether it is related to PDFBOX-5413.
(The other one, HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5.pdf, is an improvement;
I'll add it to my own tests.)
Tilman
On 12.04.2022 at 18:25, Tilman Hausherr wrote:
> Only
> commoncrawl3/7L/7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M
> commoncrawl3/HO/HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5
> have a different text extraction
>
> With the other two it's attachment file names or doc info.
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
Only
commoncrawl3/7L/7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M
commoncrawl3/HO/HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5
have a different text extraction.
With the other two, it's attachment file names or doc info.
Tilman
On 12.04.2022 at 08:16, Tilman Hausherr wrote:
> After having looked at the content differences and trying to rule out
> the /Names differences, there are 4 files with content in
> TOP_10_MORE_IN_A that feel suspicious and IMHO need investigation.
>
> commoncrawl3/7L/7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M
> govdocs1/365/365260.pdf
> commoncrawl3/HO/HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5
> govdocs1/150/150282.pdf
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
After having looked at the content differences and trying to rule out
the /Names differences, there are 4 files with content in
TOP_10_MORE_IN_A that feel suspicious and IMHO need investigation.
commoncrawl3/7L/7LRS5U6CAFMN2P6JPTZVNBUW6XOFYH4M
govdocs1/365/365260.pdf
commoncrawl3/HO/HOAZTST4E26NPA7HL72WCIVMNRQ3E4M5
govdocs1/150/150282.pdf
Tilman
On 12.04.2022 at 08:09, Andreas Lehmkuehler wrote:
> Thanks Tim!
>
> Looks like there are 5 new exceptions left.
>
> I'm going to check the first two:
>
> commoncrawl3/ZC/ZCY5MCL7KI6QXVMXUZ2AJKXICQIT4TL4
> commoncrawl3/WY/WYPJNTD5KQNODSXWK4GABURXRTTD5P4H
>
> The others are thrown within Jempbox ....
>
>
> Andreas
Re: 2.0.26 release
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Thanks Tim!
Looks like there are 5 new exceptions left.
I'm going to check the first two:
commoncrawl3/ZC/ZCY5MCL7KI6QXVMXUZ2AJKXICQIT4TL4
commoncrawl3/WY/WYPJNTD5KQNODSXWK4GABURXRTTD5P4H
The others are thrown within Jempbox ....
Andreas
On 11.04.22 at 12:40, Tim Allison wrote:
> https://corpora.tika.apache.org/base/reports/tika-2.4-20220410.tgz
>
> Haven't had a chance to review. Hot off the vm.
>
> On Sun, Apr 10, 2022 at 9:58 AM Tim Allison <ta...@apache.org> wrote:
>>
>> Will try to kick off today…first thing Monday morning (EDT) at the latest.
>>
>> On Sun, Apr 10, 2022 at 9:05 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
>>>
>>> On 09.04.22 at 19:00, Tilman Hausherr wrote:
>>>> testFlattenPDFBOX2469Filled also fails in 2.0 (it is disabled by default).
>>> I've fixed all new tickets. PDFBOX-5413 fixes the issue with the disabled
>>> flatten test.
>>>
>>> @Tim Is there any chance to re-run the tests?
>>>
>>> Andreas
>>>
>>>>
>>>> testFlattenPDFBOX2469Filled(org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest)
>>>> Time elapsed: 1.083 s <<< ERROR!
>>>> java.io.IOException: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>>>> Caused by: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>>>>     at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>>>>
>>>> I'm not creating an issue this time in case this is also related to another
>>>> known problem.
>>>>
>>>> Tilman
Re: 2.0.26 release
Posted by Tim Allison <ta...@apache.org>.
Yes. Sorry. That's my fault. I did something stupid in 2.3.0 and then
fixed it in 2.4.0-SNAPSHOT (TIKA-3711).
On Mon, Apr 11, 2022 at 1:35 PM Tilman Hausherr <TH...@t-online.de> wrote:
> On 11.04.2022 at 12:40, Tim Allison wrote:
>
> https://corpora.tika.apache.org/base/reports/tika-2.4-20220410.tgz
>
> Haven't had a chance to review. Hot off the vm.
>
>
> Thanks Tim!
>
> I looked at govdocs1/198/198191.pdf , this one has "high: 1 |
> quality.joboptions: 1" more in A. This is a name of a file attachment at
> Root/Names/EmbeddedFiles/Kids/[0]/Names/[0] but I can see it both in
> 2.0.25 and 2.0.26. Could it be that the test ran with different tika
> versions?
>
> Tilman
>
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
On 11.04.2022 at 12:40, Tim Allison wrote:
> https://corpora.tika.apache.org/base/reports/tika-2.4-20220410.tgz
>
> Haven't had a chance to review. Hot off the vm.
Thanks Tim!
I looked at govdocs1/198/198191.pdf; this one has "high: 1 |
quality.joboptions: 1" more in A. This is the name of a file attachment
at Root/Names/EmbeddedFiles/Kids/[0]/Names/[0], but I can see it in both
2.0.25 and 2.0.26. Could it be that the tests ran with different Tika
versions?
Tilman
Re: 2.0.26 release
Posted by Tim Allison <ta...@apache.org>.
https://corpora.tika.apache.org/base/reports/tika-2.4-20220410.tgz
Haven't had a chance to review. Hot off the vm.
On Sun, Apr 10, 2022 at 9:58 AM Tim Allison <ta...@apache.org> wrote:
>
> Will try to kick off today…first thing Monday morning (EDT) at the latest.
>
> On Sun, Apr 10, 2022 at 9:05 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
>>
>> On 09.04.22 at 19:00, Tilman Hausherr wrote:
>> > testFlattenPDFBOX2469Filled also fails in 2.0 (it is disabled by default).
>> I've fixed all new tickets. PDFBOX-5413 fixes the issue with the disabled
>> flatten test.
>>
>> @Tim Is there any chance to re-run the tests?
>>
>> Andreas
>>
>> >
>> > testFlattenPDFBOX2469Filled(org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest)
>> > Time elapsed: 1.083 s <<< ERROR!
>> > java.io.IOException: javax.crypto.BadPaddingException: Given final block not
>> > properly padded. Such issues can arise if a bad key is used during decryption.
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>> >
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>> >
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>> >
>> > Caused by: javax.crypto.BadPaddingException: Given final block not properly
>> > padded. Such issues can arise if a bad key is used during decryption.
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>> >
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>> >
>> > at
>> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>> >
>> >
>> > I'm not creating an issue this time in case this is also related to another
>> > known problem.
>> >
>> > Tilman
Re: 2.0.26 release
Posted by Tim Allison <ta...@apache.org>.
Will try to kick off today…first thing Monday morning (EDT) at the latest.
On Sun, Apr 10, 2022 at 9:05 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
> On 09.04.22 at 19:00, Tilman Hausherr wrote:
> > testFlattenPDFBOX2469Filled also fails in 2.0 (it is disabled by default).
> I've fixed all new tickets. PDFBOX-5413 fixes the issue with the disabled
> flatten test.
>
> @Tim Is there any chance to re-run the tests?
>
> Andreas
>
> >
> > testFlattenPDFBOX2469Filled(org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest)
> > Time elapsed: 1.083 s <<< ERROR!
> > java.io.IOException: javax.crypto.BadPaddingException: Given final block not
> > properly padded. Such issues can arise if a bad key is used during decryption.
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
> >
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
> >
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
> >
> > Caused by: javax.crypto.BadPaddingException: Given final block not properly
> > padded. Such issues can arise if a bad key is used during decryption.
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
> >
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
> >
> > at
> > org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
> >
> >
> > I'm not creating an issue this time in case this is also related to another
> > known problem.
> >
> > Tilman
Re: 2.0.26 release
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 09.04.22 at 19:00, Tilman Hausherr wrote:
> testFlattenPDFBOX2469Filled also fails in 2.0 (it is disabled by default).
I've fixed all new tickets. PDFBOX-5413 fixes the issue with the disabled
flatten test.
@Tim Is there any chance to re-run the tests?
Andreas
>
> testFlattenPDFBOX2469Filled(org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest)
> Time elapsed: 1.083 s <<< ERROR!
> java.io.IOException: javax.crypto.BadPaddingException: Given final block not
> properly padded. Such issues can arise if a bad key is used during decryption.
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>
> Caused by: javax.crypto.BadPaddingException: Given final block not properly
> padded. Such issues can arise if a bad key is used during decryption.
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
>
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
>
> at
> org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
>
>
> I'm not creating an issue this time in case this is also related to another
> known problem.
>
> Tilman
Re: 2.0.26 release
Posted by Tilman Hausherr <TH...@t-online.de>.
testFlattenPDFBOX2469Filled also fails in 2.0 (it is disabled by default).
testFlattenPDFBOX2469Filled(org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest)
Time elapsed: 1.083 s <<< ERROR!
java.io.IOException: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
Caused by: javax.crypto.BadPaddingException: Given final block not properly padded. Such issues can arise if a bad key is used during decryption.
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.generateSamples(PDAcroFormFlattenTest.java:345)
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.flattenAndCompare(PDAcroFormFlattenTest.java:309)
    at org.apache.pdfbox.pdmodel.interactive.form.PDAcroFormFlattenTest.testFlattenPDFBOX2469Filled(PDAcroFormFlattenTest.java:105)
I'm not creating an issue this time in case this is also related to
another known problem.
Tilman
Re: 2.0.26 release
Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Thanks Tim!
I've checked the first files of the new exceptions, and there seems to be at
least one new regression:
commoncrawl3/ZC/ZCY5MCL7KI6QXVMXUZ2AJKXICQIT4TL4
commoncrawl3/WY/WYPJNTD5KQNODSXWK4GABURXRTTD5P4H
commoncrawl3/YI/YIEMGIQYGXCQ5AZOE35ESXYCZHWR3V57
commoncrawl3_refetched/5C/5CWAUHFCZMK42IHSMSKNIR3MHXHR4IRN
All of them render fine using 2.0.25 but throw an exception using 2.0.26.
I'm going to take a closer look later.
On 07.04.22 at 20:27, Tim Allison wrote:
> https://corpora.tika.apache.org/base/reports/pdfbox-2.0.26-snapshot-reports.tgz
>
> I haven't had a chance to look at them yet.
>
> On Thu, Apr 7, 2022 at 9:07 AM Andreas Lehmkühler <an...@lehmi.de> wrote:
>>
>> Yes, please
>>
>> Thanks in advance
>> Andreas
>>
>> 07.04.2022 11:44:38 Tim Allison <ta...@apache.org>:
>>
>>> Sounds great! Should I rerun the regression tests today?
>>>
>>> On Thu, Apr 7, 2022 at 1:41 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
>>>
>>>> Hi,
>>>>
>>>> sorry for the delay. I'm planning to cut the 2.0.26 release next
>>>> Saturday, the
>>>> day after tomorrow, if nobody objects.
>>>>
>>>> Andreas
>>>>
>>>> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is
>>>> out
Re: 2.0.26 release
Posted by Tim Allison <ta...@apache.org>.
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.26-snapshot-reports.tgz
I haven't had a chance to look at them yet.
On Thu, Apr 7, 2022 at 9:07 AM Andreas Lehmkühler <an...@lehmi.de> wrote:
>
> Yes, please
>
> Thanks in advance
> Andreas
>
> 07.04.2022 11:44:38 Tim Allison <ta...@apache.org>:
>
> > Sounds great! Should I rerun the regression tests today?
> >
> > On Thu, Apr 7, 2022 at 1:41 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
> >
> >> Hi,
> >>
> >> sorry for the delay. I'm planning to cut the 2.0.26 release next
> >> Saturday, the
> >> day after tomorrow, if nobody objects.
> >>
> >> Andreas
> >>
> >> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is
> >> out
Re: 2.0.26 release
Posted by Andreas Lehmkühler <an...@lehmi.de>.
Yes, please
Thanks in advance
Andreas
07.04.2022 11:44:38 Tim Allison <ta...@apache.org>:
> Sounds great! Should I rerun the regression tests today?
>
> On Thu, Apr 7, 2022 at 1:41 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
>
>> Hi,
>>
>> sorry for the delay. I'm planning to cut the 2.0.26 release next
>> Saturday, the
>> day after tomorrow, if nobody objects.
>>
>> Andreas
>>
>> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is
>> out
Re: 2.0.26 release
Posted by Tim Allison <ta...@apache.org>.
Sounds great! Should I rerun the regression tests today?
On Thu, Apr 7, 2022 at 1:41 AM Andreas Lehmkuehler <an...@lehmi.de> wrote:
> Hi,
>
> sorry for the delay. I'm planning to cut the 2.0.26 release next
> Saturday, the
> day after tomorrow, if nobody objects.
>
> Andreas
>
> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is
> out
Re: Next releases WAS: Re: 2.4.0 release?
Posted by Tim Allison <ta...@apache.org>.
https://repository.apache.org is having a bad day. Requests are
timing out left and right. I'll try to perform the release of
2.4.0-rc1 later today or tomorrow when the repo is happier.
On Thu, Apr 28, 2022 at 9:47 AM Tim Allison <ta...@apache.org> wrote:
>
> I've upgraded junrar in both branches, and the regression results look good.
>
> I'll start 1.28.2-rc2 shortly, and then follow up with 2.4.0-rc1 if
> there aren't any objections.
>
> On Tue, Apr 26, 2022 at 9:10 AM Tim Allison <ta...@apache.org> wrote:
> >
> > All,
> >
> > I'm prepping rc1 for 1.28.2 now.
> >
> > I'm running the regression tests for 2.4.0, and I hope to have results
> > today with possibly an rc later today or early tomorrow if there are
> > no surprises.
> >
> > Please let me know if there are any blockers.
> >
> > Best,
> >
> > Tim
> >
> > On Thu, Apr 7, 2022 at 9:50 AM Tim Allison <ta...@apache.org> wrote:
> > >
> > > All,
> > > Once the new PDFBox is out, we should probably kick off the 2.4.0
> > > release. If I'm release manager, given my schedule, that'll probably
> > > be the week of April 18th.
> > > I want to fix TIKA-3711 (embedded file names), but other than that,
> > > I don't think there are any blockers.
> > >
> > > WDYT?
> > >
> > > Best,
> > >
> > > Tim
> > >
> > > ---------- Forwarded message ---------
> > > From: Andreas Lehmkuehler <an...@lehmi.de>
> > > Date: Thu, Apr 7, 2022 at 1:41 AM
> > > Subject: 2.0.26 release
> > > To: <de...@pdfbox.apache.org>
> > >
> > >
> > > Hi,
> > >
> > > sorry for the delay. I'm planning to cut the 2.0.26 release next Saturday, the
> > > day after tomorrow, if nobody objects.
> > >
> > > Andreas
> > >
> > > P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is out
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
> > > For additional commands, e-mail: dev-help@pdfbox.apache.org
Re: Next releases WAS: Re: 2.4.0 release?
Posted by Tim Allison <ta...@apache.org>.
I've upgraded junrar in both branches, and the regression results look good.
I'll start 1.28.2-rc2 shortly, and then follow up with 2.4.0-rc1 if
there aren't any objections.
On Tue, Apr 26, 2022 at 9:10 AM Tim Allison <ta...@apache.org> wrote:
>
> All,
>
> I'm prepping rc1 for 1.28.2 now.
>
> I'm running the regression tests for 2.4.0, and I hope to have results
> today with possibly an rc later today or early tomorrow if there are
> no surprises.
>
> Please let me know if there are any blockers.
>
> Best,
>
> Tim
>
> On Thu, Apr 7, 2022 at 9:50 AM Tim Allison <ta...@apache.org> wrote:
> >
> > All,
> > Once the new PDFBox is out, we should probably kick off the 2.4.0
> > release. If I'm release manager, given my schedule, that'll probably
> > be the week of April 18th.
> > I want to fix TIKA-3711 (embedded file names), but other than that,
> > I don't think there are any blockers.
> >
> > WDYT?
> >
> > Best,
> >
> > Tim
> >
> > ---------- Forwarded message ---------
> > From: Andreas Lehmkuehler <an...@lehmi.de>
> > Date: Thu, Apr 7, 2022 at 1:41 AM
> > Subject: 2.0.26 release
> > To: <de...@pdfbox.apache.org>
> >
> >
> > Hi,
> >
> > sorry for the delay. I'm planning to cut the 2.0.26 release next Saturday, the
> > day after tomorrow, if nobody objects.
> >
> > Andreas
> >
> > P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is out
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
> > For additional commands, e-mail: dev-help@pdfbox.apache.org
Next releases WAS: Re: 2.4.0 release?
Posted by Tim Allison <ta...@apache.org>.
All,
I'm prepping rc1 for 1.28.2 now.
I'm running the regression tests for 2.4.0, and I hope to have results
today with possibly an rc later today or early tomorrow if there are
no surprises.
Please let me know if there are any blockers.
Best,
Tim
On Thu, Apr 7, 2022 at 9:50 AM Tim Allison <ta...@apache.org> wrote:
>
> All,
> Once the new PDFBox is out, we should probably kick off the 2.4.0
> release. If I'm release manager, given my schedule, that'll probably
> be the week of April 18th.
> I want to fix TIKA-3711 (embedded file names), but other than that,
> I don't think there are any blockers.
>
> WDYT?
>
> Best,
>
> Tim
>
> ---------- Forwarded message ---------
> From: Andreas Lehmkuehler <an...@lehmi.de>
> Date: Thu, Apr 7, 2022 at 1:41 AM
> Subject: 2.0.26 release
> To: <de...@pdfbox.apache.org>
>
>
> Hi,
>
> sorry for the delay. I'm planning to cut the 2.0.26 release next Saturday, the
> day after tomorrow, if nobody objects.
>
> Andreas
>
> P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is out
2.4.0 release?
Posted by Tim Allison <ta...@apache.org>.
All,
Once the new PDFBox is out, we should probably kick off the 2.4.0
release. If I'm release manager, given my schedule, that'll probably
be the week of April 18th.
I want to fix TIKA-3711 (embedded file names), but other than that,
I don't think there are any blockers.
WDYT?
Best,
Tim
---------- Forwarded message ---------
From: Andreas Lehmkuehler <an...@lehmi.de>
Date: Thu, Apr 7, 2022 at 1:41 AM
Subject: 2.0.26 release
To: <de...@pdfbox.apache.org>
Hi,
sorry for the delay. I'm planning to cut the 2.0.26 release next Saturday, the
day after tomorrow, if nobody objects.
Andreas
P.S.: I'm targeting a new 3.0.0 alpha release once the 2.0.26 release is out