Posted to dev@pdfbox.apache.org by "Allison, Timothy B." <ta...@mitre.org> on 2017/09/11 21:24:16 UTC

2.0.8?

>I hope there aren't any new regressions.

Happy to help find them!  :)

On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  I'd like to include it in Tika's last Java 7 release...end of Sept, middle of Oct., or whenever 2.0.8 is out. :)


-----Original Message-----
From: Andreas Lehmkühler (JIRA) [mailto:jira@apache.org] 
Sent: Monday, September 11, 2017 4:52 PM
To: dev@pdfbox.apache.org
Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: root cannot be null with truncated file


    [ https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161965#comment-16161965 ] 

Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
---------------------------------------------------------------------

Both cases are tricky (PDFBOX-3798 is truncated within an object, and the attached PDF has a truncated xref table), so I had to improve the brute-force search one more time.
[~tilman] thanks for finding this. I hope there aren't any new regressions.



> IllegalArgumentException: root cannot be null with truncated file
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-3928
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3928
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.7
>            Reporter: Tilman Hausherr
>            Assignee: Andreas Lehmkühler
>              Labels: regression
>             Fix For: 2.0.8, 3.0.0
>
>         Attachments: 023505.pdf
>
>
> {code}
> java.lang.IllegalArgumentException: root cannot be null
>     org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
>     org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
>     org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
>     org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
>     org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
>     java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
>     java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
>     java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
>     org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
>     org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
>     org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1209)
>     org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
> {code}
> This worked in 2.0.6 but no longer works in 2.0.7. It has been failing since [ https://svn.apache.org/r1795705 ] of PDFBOX-3798.
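
For readers unfamiliar with the recovery step mentioned in the comment: when the xref table is truncated, a parser can no longer rely on the recorded object offsets and has to rediscover objects by scanning the raw bytes. The sketch below illustrates that general idea only. It is not PDFBox's implementation; the class name BruteForceObjectScan, the method scan, and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not taken from the PDFBox code base.
final class BruteForceObjectScan {
    // Matches an object header such as "12 0 obj" (object number, generation, keyword).
    private static final Pattern OBJ_HEADER =
            Pattern.compile("(?<![0-9])(\\d+)\\s+(\\d+)\\s+obj\\b");

    /** Maps "objNum genNum" to the offset of its header; a later definition replaces an earlier one. */
    static Map<String, Integer> scan(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char, so char index equals byte offset.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        Map<String, Integer> offsets = new LinkedHashMap<>();
        Matcher m = OBJ_HEADER.matcher(text);
        while (m.find()) {
            offsets.put(m.group(1) + " " + m.group(2), m.start());
        }
        return offsets;
    }

    public static void main(String[] args) {
        // Made-up sample: two objects followed by an xref table that is cut off.
        String truncated = "%PDF-1.4\n"
                + "1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
                + "2 0 obj\n<< /Type /Pages /Kids [] /Count 0 >>\nendobj\n"
                + "xref\n0 3\n00000000";
        System.out.println(scan(truncated.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

If a scan of this kind does not turn up the catalog or its /Pages entry, a null page tree root would be consistent with the "root cannot be null" exception in the trace above.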



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
> because I'm ill but I expect to be my old self later this week.

I'm sorry to hear it!  I hope that you are feeling better soon!

> I'd also like to have a test of version 2.0.4 compared to trunk, because 2.0.5 was the version where the tests weren't done; the problems were fixed in 2.0.6, but at that time we tested only 2.0.5 against 2.0.6.

I was just thinking the same thing, but without the specific versions in mind.  :) Great idea.  Will do over the next week...





Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 12.09.2017 at 06:43, Andreas Lehmkuehler wrote:
> Good idea, there are already a lot of solved tickets for 2.0.8
>
> @all Is there anything pending which should be included?
>
> How about cutting the release in a week or two from now?
>
> @Tim please run a test 2.0.7 vs. 2.0.8 if possible

Rather two, because I'm ill, but I expect to be my old self later this week.

I'd also like to have a test of version 2.0.4 compared to trunk,
because 2.0.5 was the version where the tests weren't done; the problems
were fixed in 2.0.6, but at that time we tested only 2.0.5 against 2.0.6.

Tilman


>
> Andreas
>


Re: 2.0.8?

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi,

in two weeks would be fine for me.

BR Maruan 

> On 12.09.2017 at 06:43, Andreas Lehmkuehler <an...@lehmi.de> wrote:
> 
> Good idea, there are already a lot of solved tickets for 2.0.8
> 
> @all Is there anything pending which should be included?
> 
> How about cutting the release in a week or two from now?
> 
> @Tim please run a test 2.0.7 vs. 2.0.8 if possible
> 
> Andreas
> 


Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 18.09.2017 at 20:16, Allison, Timothy B. wrote:
> http://162.242.228.174/reports/pdfbox_2_0_7_Vs_2_0_8-SNAPSHOT_reports.tar.gz
>
> is now available.  I haven't yet had a chance to look at either...

Thanks again... the good news is that there is only one regression 
(022616.pdf) which I reported earlier.

There seems to be some loss of metadata, but I can't investigate that. I
remember that at some point in the past this turned out to be a
false alarm.


Tilman




RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
http://162.242.228.174/reports/pdfbox_2_0_7_Vs_2_0_8-SNAPSHOT_reports.tar.gz

is now available.  I haven't yet had a chance to look at either...

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org] 
Sent: Monday, September 18, 2017 12:51 PM
To: dev@pdfbox.apache.org
Subject: RE: 2.0.8?

Reports for 2.0.4 vs 2.0.8-SNAPSHOT (r1808067) are available:

http://162.242.228.174/reports/pdfbox_2_0_4_Vs_2_0_8-SNAPSHOT_reports.tar.gz

I'll post 2.0.7 vs 2.0.8-SNAPSHOT in the next few hours.



-----Original Message-----
From: Andreas Lehmkuehler [mailto:andreas@lehmi.de]
Sent: Wednesday, September 13, 2017 2:33 PM
To: dev@pdfbox.apache.org
Subject: Re: 2.0.8?

Due to the responses I'm planning to cut the release on Monday the 25th

Andreas



Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 18.09.2017 at 18:51, Allison, Timothy B. wrote:
> Reports for 2.0.4 vs 2.0.8-SNAPSHOT (r1808067) are available:
>
> http://162.242.228.174/reports/pdfbox_2_0_4_Vs_2_0_8-SNAPSHOT_reports.tar.gz


Thanks! The good news is that in content_diffs_with_exceptions.xlsx, the 
sum of the column NUM_COMMON_TOKENS_DIFF_IN_B is positive :-)

Tilman
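
A note on how that sum can be read, as a minimal sketch. It assumes each row of the report describes one file pair and that NUM_COMMON_TOKENS_DIFF_IN_B is the common-token count in B (here 2.0.8-SNAPSHOT) minus the count in A (here 2.0.4). That reading of the column, the class and method names, and the sample rows are assumptions for illustration, not values from the report.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; the report itself is a spreadsheet, not Java objects.
final class CommonTokenDiff {
    /** One row per file pair; all values used below are invented. */
    static final class Row {
        final String file;
        final int commonTokensA;
        final int commonTokensB;

        Row(String file, int commonTokensA, int commonTokensB) {
            this.file = file;
            this.commonTokensA = commonTokensA;
            this.commonTokensB = commonTokensB;
        }

        /** Assumed meaning of NUM_COMMON_TOKENS_DIFF_IN_B: B minus A. */
        int diffInB() {
            return commonTokensB - commonTokensA;
        }
    }

    /** Adds up the per-file differences; a positive total means B found more common tokens overall. */
    static long sumDiffInB(List<Row> rows) {
        long sum = 0;
        for (Row row : rows) {
            sum += row.diffInB();
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a.pdf", 100, 120),  // B gains 20
                new Row("b.pdf", 50, 40),    // B loses 10
                new Row("c.pdf", 10, 10));   // unchanged
        System.out.println(sumDiffInB(rows)); // prints 10
    }
}
```

A positive total can still hide individual files with losses, so the per-file rows remain worth checking.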





RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Reports for 2.0.4 vs 2.0.8-SNAPSHOT (r1808067) are available:

http://162.242.228.174/reports/pdfbox_2_0_4_Vs_2_0_8-SNAPSHOT_reports.tar.gz

I'll post 2.0.7 vs 2.0.8-SNAPSHOT in the next few hours.





Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 25.09.2017 at 12:25, Andreas Lehmkühler wrote:
>> On 13 September 2017 at 20:33, Andreas Lehmkuehler <an...@lehmi.de> wrote:
>>
>>
>> Due to the responses I'm planning to cut the release on Monday the 25th
> I'm still working on a solution for PDFBOX-3934 to avoid the regression with PDFBOX-3318. Should we postpone the release for a couple of days or a week max? Or should I simply revert my changes?
>
> WDYT?

I am neutral on this one.

Tilman

>
> Andreas
>
>> Andreas
>>


RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Sounds good.  

I kicked off the eval process yesterday, but because of a bug in our config-file reader and/or user error in modifying the config file, I wound up with 500k PDFs parsed by our EmptyParser, so there were no results.

I restarted the eval process just now. I should have results in 6 hours.



-----Original Message-----
From: Andreas Lehmkuehler [mailto:andreas@lehmi.de] 
Sent: Sunday, October 1, 2017 6:31 AM
To: dev@pdfbox.apache.org
Subject: Re: 2.0.8?

On 25.09.2017 at 18:39, Andreas Lehmkuehler wrote:
> On 25.09.2017 at 12:30, Maruan Sahyoun wrote:
>> Hi,
>>>> On 13 September 2017 at 20:33, Andreas Lehmkuehler <an...@lehmi.de> wrote:
>>>>
>>>>
>>>> Due to the responses I'm planning to cut the release on Monday the 
>>>> 25th
>>>
>>> I'm still working on a solution for PDFBOX-3934 to avoid the 
>>> regression with PDFBOX-3318. Should we postpone the release for a 
>>> couple of days or a week max? Or should I simply revert my changes?
>>
>> I'd go for postponing in order to fix that regression - what about 
>> setting the date to next Monday?
> OK, let's postpone, I'm targeting next Monday. Thanks for your 
> patience ;-)
Just a friendly reminder, I'm going to cut the release in about 30 hours from now.

Andreas

> 
> Andreas
>>
>> BR
>> Maruan
>>
>>>
>>> WDYT?
>>>
> 



Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 02.10.2017 at 23:48, Allison, Timothy B. wrote:
>> Re 308576.pdf: the text extraction has a huge loss, but a manual check shows it is identical. However that file has the NPE from PDActionURI.getURI(), could it be that this results in an abort of text extraction?
> Same for 569017.pdf.
>
> Likely.  There are two "per file pair contents" files.  The one ending with "_ignore_exceptions.xlsx" means that results are not reported if there was an exception caught for one of the files (308576.pdf and 569017.pdf aren't in that file).  The other one "*_with_exceptions" includes both.  Based on your feedback, I should add 2 boolean cols to "*_with_exceptions.xlsx" for exceptionInA and exceptionInB?

Sorry, I had forgotten that. Yes, the two columns would be useful.

Tilman



RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
> Re 308576.pdf: the text extraction has a huge loss, but a manual check shows it is identical. However that file has the NPE from PDActionURI.getURI(), could it be that this results in an abort of text extraction?
Same for 569017.pdf.

Likely.  There are two "per file pair contents" files.  The one ending with "_ignore_exceptions.xlsx" means that results are not reported if there was an exception caught for one of the files (308576.pdf and 569017.pdf aren't in that file).  The other one "*_with_exceptions" includes both.  Based on your feedback, I should add 2 boolean cols to "*_with_exceptions.xlsx" for exceptionInA and exceptionInB?
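
To make the proposed columns concrete, here is a minimal sketch of how the two reports could be derived from per-file results. Everything in it is an assumption for illustration: the class PairReportDemo, the FilePair record, the tab-separated rows, and the file names are hypothetical and are not taken from the actual eval code, which writes .xlsx files.

```java
import java.util.ArrayList;
import java.util.List;

public class PairReportDemo {

    /** Hypothetical result of parsing one file with version A and with version B. */
    public static final class FilePair {
        final String fileName;
        final String exceptionA; // null when version A parsed without an exception
        final String exceptionB; // null when version B parsed without an exception

        public FilePair(String fileName, String exceptionA, String exceptionB) {
            this.fileName = fileName;
            this.exceptionA = exceptionA;
            this.exceptionB = exceptionB;
        }

        public boolean exceptionInA() {
            return exceptionA != null;
        }

        public boolean exceptionInB() {
            return exceptionB != null;
        }
    }

    /** "With exceptions" report: every pair, plus the two boolean columns. */
    public static List<String> withExceptionsRows(List<FilePair> pairs) {
        List<String> rows = new ArrayList<>();
        rows.add("file\texceptionInA\texceptionInB");
        for (FilePair p : pairs) {
            rows.add(p.fileName + "\t" + p.exceptionInA() + "\t" + p.exceptionInB());
        }
        return rows;
    }

    /** "Ignore exceptions" report: only pairs where neither version threw. */
    public static List<String> ignoreExceptionsRows(List<FilePair> pairs) {
        List<String> rows = new ArrayList<>();
        rows.add("file");
        for (FilePair p : pairs) {
            if (!p.exceptionInA() && !p.exceptionInB()) {
                rows.add(p.fileName);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<FilePair> pairs = new ArrayList<>();
        pairs.add(new FilePair("a.pdf", null, null));
        pairs.add(new FilePair("b.pdf", null, "java.lang.NullPointerException"));
        for (String row : withExceptionsRows(pairs)) {
            System.out.println(row);
        }
    }
}
```

With flags like these, a large content loss in a row where exceptionInB is true can be read as a probable aborted extraction rather than a real text-extraction regression.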

Re: 2.0.8?

Posted by Tilman Hausherr <TH...@t-online.de>.
On 02.10.2017 at 21:58, Allison, Timothy B. wrote:
> Reports are here:
> http://162.242.228.174/reports/pdfbox_2_0_7_Vs_2_0_8_take2.tar.gz
>
> Looks like some new NPEs.  I'll take a look at the metadata diffs.


Re 308576.pdf: the text extraction has a huge loss, but a manual check 
shows it is identical. However that file has the NPE from 
PDActionURI.getURI(), could it be that this results in an abort of text 
extraction?
Same for 569017.pdf.

Some meta diffs are because of a bug fix. In 074031.pdf, some fields had 
"þÿ". This has been fixed and it's now empty.
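
As an illustration of where a value like "þÿ" can come from: the bytes FE FF are the UTF-16BE byte order mark, and read one byte per character they show up as "þÿ". The sketch below assumes that this is what happened with those fields, which the thread does not confirm. It is not the PDFBox code; the class and method names are hypothetical, and ISO-8859-1 stands in for PDFDocEncoding.

```java
import java.nio.charset.StandardCharsets;

public class PdfTextStringDemo {

    /**
     * BOM-aware decoding: a string starting with FE FF is UTF-16BE and the
     * mark itself is dropped; anything else is read one byte per character.
     */
    public static String decode(byte[] raw) {
        if (raw.length >= 2 && (raw[0] & 0xFF) == 0xFE && (raw[1] & 0xFF) == 0xFF) {
            return new String(raw, 2, raw.length - 2, StandardCharsets.UTF_16BE);
        }
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    /** Naive decoding that ignores the byte order mark. */
    public static String naiveDecode(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // A field that holds nothing but the byte order mark.
        byte[] bomOnly = {(byte) 0xFE, (byte) 0xFF};
        System.out.println("naive:     [" + naiveDecode(bomOnly) + "]");
        System.out.println("BOM-aware: [" + decode(bomOnly) + "]");
    }
}
```

Under that assumption, a field holding only the mark decodes to an empty string once the mark is handled, which matches the empty values now reported.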


Tilman




RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Sorry all for taking longer than expected!  File under "this information would have been useful..." ☹

-----Original Message-----
From: Allison, Timothy B. 
Sent: Monday, October 2, 2017 3:59 PM
To: dev@pdfbox.apache.org
Subject: RE: 2.0.8?

Reports are here:
http://162.242.228.174/reports/pdfbox_2_0_7_Vs_2_0_8_take2.tar.gz

Looks like some new NPEs.  I'll take a look at the metadata diffs.

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org] 
Sent: Monday, October 2, 2017 9:24 AM
To: dev@pdfbox.apache.org
Subject: RE: 2.0.8?

Sounds good.  

I kicked off the eval process yesterday, but because of a bug in our config-file reader and/or user error in modifying the config file, I wound up with 500k pdfs parsed by our EmptyParser....no results.

I restarted the eval process just now. I should have results in 6 hours.



-----Original Message-----
From: Andreas Lehmkuehler [mailto:andreas@lehmi.de]
Sent: Sunday, October 1, 2017 6:31 AM
To: dev@pdfbox.apache.org
Subject: Re: 2.0.8?

On 25.09.2017 at 18:39, Andreas Lehmkuehler wrote:
> On 25.09.2017 at 12:30, Maruan Sahyoun wrote:
>> Hi,
>>>> Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
>>>>
>>>>
>>>> Due to the responses I'm planning to cut the release on Monday the 
>>>> 25th
>>>
>>> I'm still working on a solution for PDFBOX-3934 to avoid the 
>>> regression with PDFBOX-3318. Should we postpone the release for a 
>>> couple of days or a week max? Or should I simply revert my changes?
>>
>> I'd go for postponing in order to fix that regression - what about 
>> setting the date to next Monday?
> OK, let's postpone, I'm targeting next Monday. Thanks for your 
> patience ;-)
Just a friendly reminder, I'm going to cut the release in about 30 hours from now.

Andreas

> 
> Andreas
>>
>> BR
>> Maruan
>>
>>>
>>> WDYT?
>>>
> 



RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
Reports are here:
http://162.242.228.174/reports/pdfbox_2_0_7_Vs_2_0_8_take2.tar.gz

Looks like some new NPEs.  I'll take a look at the metadata diffs.

-----Original Message-----
From: Allison, Timothy B. [mailto:tallison@mitre.org] 
Sent: Monday, October 2, 2017 9:24 AM
To: dev@pdfbox.apache.org
Subject: RE: 2.0.8?

Sounds good.  

I kicked off the eval process yesterday, but because of a bug in our config-file reader and/or user error in modifying the config file, I wound up with 500k pdfs parsed by our EmptyParser....no results.

I restarted the eval process just now. I should have results in 6 hours.



-----Original Message-----
From: Andreas Lehmkuehler [mailto:andreas@lehmi.de]
Sent: Sunday, October 1, 2017 6:31 AM
To: dev@pdfbox.apache.org
Subject: Re: 2.0.8?

On 25.09.2017 at 18:39, Andreas Lehmkuehler wrote:
> On 25.09.2017 at 12:30, Maruan Sahyoun wrote:
>> Hi,
>>>> Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
>>>>
>>>>
>>>> Due to the responses I'm planning to cut the release on Monday the 
>>>> 25th
>>>
>>> I'm still working on a solution for PDFBOX-3934 to avoid the 
>>> regression with PDFBOX-3318. Should we postpone the release for a 
>>> couple of days or a week max? Or should I simply revert my changes?
>>
>> I'd go for postponing in order to fix that regression - what about 
>> setting the date to next Monday?
> OK, let's postpone, I'm targeting next Monday. Thanks for your 
> patience ;-)
Just a friendly reminder, I'm going to cut the release in about 30 hours from now.

Andreas

> 
> Andreas
>>
>> BR
>> Maruan
>>
>>>
>>> WDYT?
>>>
> 



Re: 2.0.8?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 25.09.2017 at 18:39, Andreas Lehmkuehler wrote:
> On 25.09.2017 at 12:30, Maruan Sahyoun wrote:
>> Hi,
>>>> Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
>>>>
>>>>
>>>> Due to the responses I'm planning to cut the release on Monday the 25th
>>>
>>> I'm still working on a solution for PDFBOX-3934 to avoid the regression with 
>>> PDFBOX-3318. Should we postpone the release for a couple of days or a week 
>>> max? Or should I simply revert my changes?
>>
>> I'd go for postponing in order to fix that regression - what about
>> setting the date to next Monday?
> OK, let's postpone, I'm targeting next Monday. Thanks for your patience ;-)
Just a friendly reminder, I'm going to cut the release in about 30 hours from now.

Andreas

> 
> Andreas
>>
>> BR
>> Maruan
>>
>>>
>>> WDYT?
>>>
> 


Re: 2.0.8?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
On 25.09.2017 at 12:30, Maruan Sahyoun wrote:
> Hi,
>>> Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
>>>
>>>
>>> Due to the responses I'm planning to cut the release on Monday the 25th
>>
>> I'm still working on a solution for PDFBOX-3934 to avoid the regression with PDFBOX-3318. Should we postpone the release for a couple of days or a week max? Or should I simply revert my changes?
> 
> I'd go for postponing in order to fix that regression - what about
> setting the date to next Monday?
OK, let's postpone, I'm targeting next Monday. Thanks for your patience ;-)

Andreas
> 
> BR
> Maruan
> 
>>
>> WDYT?
>>



RE: 2.0.8?

Posted by "Allison, Timothy B." <ta...@mitre.org>.
> I'd go for postponing in order to fix that regression - what about setting the date to next Monday?

+1 I’m happy pushing it out later if the fix happens >= Friday and we want to run the full regression tests again.

Thank you, Andreas!



Re: 2.0.8?

Posted by Maruan Sahyoun <sa...@fileaffairs.de>.
Hi, 
> > Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
> > 
> > 
> > Due to the responses I'm planning to cut the release on Monday the 25th
> 
> I'm still working on a solution for PDFBOX-3934 to avoid the regression with PDFBOX-3318. Should we postpone the release for a couple of days or a week max? Or should I simply revert my changes?

I'd go for postponing in order to fix that regression - what about
setting the date to next Monday?

BR
Maruan

> 
> WDYT?
> 
> Andreas
> 
> > 
> > Andreas
> > 
> > On 12.09.2017 at 06:43, Andreas Lehmkuehler wrote:
> > > Good idea, there are already a lot of solved tickets for 2.0.8
> > > 
> > > @all Is there anything pending which should be included?
> > > 
> > > How about cutting the release in a week or two from now?
> > > 
> > > @Tim please run a test 2.0.7 vs. 2.0.8 if possible
> > > 
> > > Andreas
> > > 
> > > On 11.09.2017 at 23:24, Allison, Timothy B. wrote:
> > > > > I hope there aren't any new regressions.
> > > > 
> > > > Happy to help find them!  :)
> > > > 
> > > > On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  I'd 
> > > > like to include it in Tika's last Java 7 release...end of Sept, middle of 
> > > > Oct., or whenever 2.0.8 is out. :)
> > > > 
> > > > 
> > > > -----Original Message-----
> > > > From: Andreas Lehmkühler (JIRA) [mailto:jira@apache.org]
> > > > Sent: Monday, September 11, 2017 4:52 PM
> > > > To: dev@pdfbox.apache.org
> > > > Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: root 
> > > > cannot be null with truncated file
> > > > 
> > > > 
> > > >      [ 
> > > > https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161965#comment-16161965 
> > > > ]
> > > > 
> > > > Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
> > > > ---------------------------------------------------------------------
> > > > 
> > > > Both cases are tricky (PDFBOX-3798 is truncated within an object and the 
> > > > attached pdf has a truncated xref table), so I had to improve the brute 
> > > > force search one more time.
> > > > [~tilman] thanks for the finding. I hope there aren't any new regressions.
> > > > 
> > > > 
> > > > was (Author: lehmi):
> > > > Both cases are tricky, so I had to improve the brute force search one more 
> > > > time.
> > > > [~tilman] thanks for the finding. I hope there aren't any new regressions.
> > > > 
> > > > > IllegalArgumentException: root cannot be null with truncated file
> > > > > -----------------------------------------------------------------
> > > > > 
> > > > >                  Key: PDFBOX-3928
> > > > >                  URL: https://issues.apache.org/jira/browse/PDFBOX-3928
> > > > >              Project: PDFBox
> > > > >           Issue Type: Bug
> > > > >           Components: Parsing
> > > > >     Affects Versions: 2.0.7
> > > > >             Reporter: Tilman Hausherr
> > > > >             Assignee: Andreas Lehmkühler
> > > > >               Labels: regression
> > > > >              Fix For: 2.0.8, 3.0.0
> > > > > 
> > > > >          Attachments: 023505.pdf
> > > > > 
> > > > > 
> > > > > {code}
> > > > > java.lang.IllegalArgumentException: root cannot be null
> > > > >      org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
> > > > >      
> > > > > org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
> > > > >      org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
> > > > >      
> > > > > org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
> > > > >      
> > > > > org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
> > > > >      java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
> > > > >      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
> > > > > Source)
> > > > >      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
> > > > > Source)
> > > > >      org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
> > > > >      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
> > > > >      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
> > > > >      org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1209)
> > > > >      org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
> > > > > {code}
> > > > > This worked in 2.0.6, but no longer in 2.0.7. It happens since [ 
> > > > > https://svn.apache.org/r1795705 ] of PDFBOX-3798.
> > > > 
> > > > 
> > > > 
> > > > -- 
> > > > This message was sent by Atlassian JIRA
> > > > (v6.4.14#64029)
> > > > 
-- 
Maruan Sahyoun

FileAffairs GmbH
Josef-Schappe-Straße 21
40882 Ratingen

Tel: +49 (2102) 89497 88
Fax: +49 (2102) 89497 91
sahyoun@fileaffairs.de
www.fileaffairs.de

Geschäftsführer: Maruan Sahyoun
Handelsregister: AG Düsseldorf, HRB 53837
UST.-ID: DE248275827



Re: 2.0.8?

Posted by Andreas Lehmkühler <an...@lehmi.de>.
> Andreas Lehmkuehler <an...@lehmi.de> wrote on 13 September 2017 at 20:33:
> 
> 
> Due to the responses I'm planning to cut the release on Monday the 25th
I'm still working on a solution for PDFBOX-3934 to avoid the regression with PDFBOX-3318. Should we postpone the release for a couple of days or a week max? Or should I simply revert my changes?

WDYT?

Andreas

> 
> Andreas
> 
> On 12.09.2017 at 06:43, Andreas Lehmkuehler wrote:
> > Good idea, there are already a lot of solved tickets for 2.0.8
> > 
> > @all Is there anything pending which should be included?
> > 
> > How about cutting the release in a week or two from now?
> > 
> > @Tim please run a test 2.0.7 vs. 2.0.8 if possible
> > 
> > Andreas
> > 
> > On 11.09.2017 at 23:24, Allison, Timothy B. wrote:
> >>> I hope there aren't any new regressions.
> >>
> >> Happy to help find them!  :)
> >>
> >> On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  I'd 
> >> like to include it in Tika's last Java 7 release...end of Sept, middle of 
> >> Oct., or whenever 2.0.8 is out. :)
> >>
> >>
> >> -----Original Message-----
> >> From: Andreas Lehmkühler (JIRA) [mailto:jira@apache.org]
> >> Sent: Monday, September 11, 2017 4:52 PM
> >> To: dev@pdfbox.apache.org
> >> Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: root 
> >> cannot be null with truncated file
> >>
> >>
> >>      [ 
> >> https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161965#comment-16161965 
> >> ]
> >>
> >> Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
> >> ---------------------------------------------------------------------
> >>
> >> Both cases are tricky (PDFBOX-3798 is truncated within an object and the 
> >> attached pdf has a truncated xref table), so I had to improve the brute 
> >> force search one more time.
> >> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> >>
> >>
> >> was (Author: lehmi):
> >> Both cases are tricky, so I had to improve the brute force search one more 
> >> time.
> >> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> >>
> >>> IllegalArgumentException: root cannot be null with truncated file
> >>> -----------------------------------------------------------------
> >>>
> >>>                  Key: PDFBOX-3928
> >>>                  URL: https://issues.apache.org/jira/browse/PDFBOX-3928
> >>>              Project: PDFBox
> >>>           Issue Type: Bug
> >>>           Components: Parsing
> >>>     Affects Versions: 2.0.7
> >>>             Reporter: Tilman Hausherr
> >>>             Assignee: Andreas Lehmkühler
> >>>               Labels: regression
> >>>              Fix For: 2.0.8, 3.0.0
> >>>
> >>>          Attachments: 023505.pdf
> >>>
> >>>
> >>> {code}
> >>> java.lang.IllegalArgumentException: root cannot be null
> >>>      org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
> >>>      
> >>> org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
> >>>      org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
> >>>      
> >>> org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
> >>>      
> >>> org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
> >>>      java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
> >>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
> >>> Source)
> >>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
> >>> Source)
> >>>      org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
> >>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
> >>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
> >>>      org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1209)
> >>>      org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
> >>> {code}
> >>> This worked in 2.0.6, but no longer in 2.0.7. It happens since [ 
> >>> https://svn.apache.org/r1795705 ] of PDFBOX-3798.
> >>
> >>
> >>
> >> -- 
> >> This message was sent by Atlassian JIRA
> >> (v6.4.14#64029)
> >>


Re: 2.0.8?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Due to the responses I'm planning to cut the release on Monday the 25th

Andreas

On 12.09.2017 at 06:43, Andreas Lehmkuehler wrote:
> Good idea, there are already a lot of solved tickets for 2.0.8
> 
> @all Is there anything pending which should be included?
> 
> How about cutting the release in a week or two from now?
> 
> @Tim please run a test 2.0.7 vs. 2.0.8 if possible
> 
> Andreas
> 
> On 11.09.2017 at 23:24, Allison, Timothy B. wrote:
>>> I hope there aren't any new regressions.
>>
>> Happy to help find them!  :)
>>
>> On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  I'd 
>> like to include it in Tika's last Java 7 release...end of Sept, middle of 
>> Oct., or whenever 2.0.8 is out. :)
>>
>>
>> -----Original Message-----
>> From: Andreas Lehmkühler (JIRA) [mailto:jira@apache.org]
>> Sent: Monday, September 11, 2017 4:52 PM
>> To: dev@pdfbox.apache.org
>> Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: root 
>> cannot be null with truncated file
>>
>>
>>      [ 
>> https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161965#comment-16161965 
>> ]
>>
>> Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
>> ---------------------------------------------------------------------
>>
>> Both cases are tricky (PDFBOX-3798 is truncated within an object and the 
>> attached pdf has a truncated xref table), so I had to improve the brute 
>> force search one more time.
>> [~tilman] thanks for the finding. I hope there aren't any new regressions.
>>
>>
>> was (Author: lehmi):
>> Both cases are tricky, so I had to improve the brute force search one more 
>> time.
>> [~tilman] thanks for the finding. I hope there aren't any new regressions.
>>
>>> IllegalArgumentException: root cannot be null with truncated file
>>> -----------------------------------------------------------------
>>>
>>>                  Key: PDFBOX-3928
>>>                  URL: https://issues.apache.org/jira/browse/PDFBOX-3928
>>>              Project: PDFBox
>>>           Issue Type: Bug
>>>           Components: Parsing
>>>     Affects Versions: 2.0.7
>>>             Reporter: Tilman Hausherr
>>>             Assignee: Andreas Lehmkühler
>>>               Labels: regression
>>>              Fix For: 2.0.8, 3.0.0
>>>
>>>          Attachments: 023505.pdf
>>>
>>>
>>> {code}
>>> java.lang.IllegalArgumentException: root cannot be null
>>>      org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
>>>      
>>> org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
>>>      org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
>>>      
>>> org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
>>>      
>>> org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
>>>      java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
>>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
>>> Source)
>>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown 
>>> Source)
>>>      org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
>>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
>>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
>>>      org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1209)
>>>      org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
>>> {code}
>>> This worked in 2.0.6, but no longer in 2.0.7. It happens since [ 
>>> https://svn.apache.org/r1795705 ] of PDFBOX-3798.
>>
>>
>>
>> -- 
>> This message was sent by Atlassian JIRA
>> (v6.4.14#64029)
>>


Re: 2.0.8?

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Good idea, there are already a lot of solved tickets for 2.0.8

@all Is there anything pending which should be included?

How about cutting the release in a week or two from now?

@Tim please run a test 2.0.7 vs. 2.0.8 if possible

Andreas

On 11.09.2017 at 23:24, Allison, Timothy B. wrote:
>> I hope there aren't any new regressions.
> 
> Happy to help find them!  :)
> 
> On a related note, do we have a sense of the schedule for PDFBox 2.0.8?  I'd like to include it in Tika's last Java 7 release...end of Sept, middle of Oct., or whenever 2.0.8 is out. :)
> 
> 
> -----Original Message-----
> From: Andreas Lehmkühler (JIRA) [mailto:jira@apache.org]
> Sent: Monday, September 11, 2017 4:52 PM
> To: dev@pdfbox.apache.org
> Subject: [jira] [Comment Edited] (PDFBOX-3928) IllegalArgumentException: root cannot be null with truncated file
> 
> 
>      [ https://issues.apache.org/jira/browse/PDFBOX-3928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16161965#comment-16161965 ]
> 
> Andreas Lehmkühler edited comment on PDFBOX-3928 at 9/11/17 8:51 PM:
> ---------------------------------------------------------------------
> 
> Both cases are tricky (PDFBOX-3798 is truncated within an object and the attached pdf has a truncated xref table), so I had to improve the brute force search one more time.
> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> 
> 
> was (Author: lehmi):
> Both cases are tricky, so I had to improve the brute force search one more time.
> [~tilman] thanks for the finding. I hope there aren't any new regressions.
> 
>> IllegalArgumentException: root cannot be null with truncated file
>> -----------------------------------------------------------------
>>
>>                  Key: PDFBOX-3928
>>                  URL: https://issues.apache.org/jira/browse/PDFBOX-3928
>>              Project: PDFBox
>>           Issue Type: Bug
>>           Components: Parsing
>>     Affects Versions: 2.0.7
>>             Reporter: Tilman Hausherr
>>             Assignee: Andreas Lehmkühler
>>               Labels: regression
>>              Fix For: 2.0.8, 3.0.0
>>
>>          Attachments: 023505.pdf
>>
>>
>> {code}
>> java.lang.IllegalArgumentException: root cannot be null
>>      org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
>>      org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
>>      org.apache.pdfbox.pdmodel.PDDocument.getPages(PDDocument.java:1388)
>>      org.apache.pdfbox.debugger.ui.DocumentEntry.getPageCount(DocumentEntry.java:42)
>>      org.apache.pdfbox.debugger.ui.PDFTreeModel.getChildCount(PDFTreeModel.java:195)
>>      java.desktop/java.beans.PropertyChangeSupport.fire(Unknown Source)
>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
>>      java.desktop/java.beans.PropertyChangeSupport.firePropertyChange(Unknown Source)
>>      org.apache.pdfbox.debugger.PDFDebugger.initTree(PDFDebugger.java:1288)
>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1235)
>>      org.apache.pdfbox.debugger.PDFDebugger.readPDFFile(PDFDebugger.java:1218)
>>      org.apache.pdfbox.debugger.PDFDebugger.main(PDFDebugger.java:1209)
>>      org.apache.pdfbox.tools.PDFBox.main(PDFBox.java:85)
>> {code}
>> This worked in 2.0.6, but no longer in 2.0.7. It happens since [ https://svn.apache.org/r1795705 ] of PDFBOX-3798.
> 
> 
> 
> --
> This message was sent by Atlassian JIRA
> (v6.4.14#64029)
> 