Posted to dev@pdfbox.apache.org by "Timo Boehme (JIRA)" <ji...@apache.org> on 2012/06/06 15:24:22 UTC
[jira] [Created] (PDFBOX-1333) Stream parsing of BaseParser should
fall back to scanning if length value is wrong
Timo Boehme created PDFBOX-1333:
-----------------------------------
Summary: Stream parsing of BaseParser should fall back to scanning if length value is wrong
Key: PDFBOX-1333
URL: https://issues.apache.org/jira/browse/PDFBOX-1333
Project: PDFBox
Issue Type: Improvement
Components: Parsing
Affects Versions: 1.7.0
Reporter: Timo Boehme
Assignee: Timo Boehme
Fix For: 1.8.0
In 1.7.0, stream parsing in BaseParser was optimized to use the length value if available. The advantages are faster parsing and independence from 'endstream' byte sequences inside the stream. The disadvantage, however, is that streams with wrong length values can no longer be parsed (see PDFBOX-1331).
To solve this, we should check whether 'endstream' is really reached when using the length value and, if not, fall back to the 'old' behavior of reading the stream until 'endstream' is found.
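The proposed check-then-fall-back logic can be sketched roughly as follows. This is a minimal illustration, not the BaseParser code: the class StreamFallbackSketch and all of its methods are hypothetical names, the input is assumed to be positioned directly after the 'stream' keyword and its line break, and the push back buffer is assumed to be large enough to hold the declared length plus a small probe.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch; names and structure are assumptions, not PDFBox API.
class StreamFallbackSketch {
    private static final byte[] ENDSTREAM = "endstream".getBytes(StandardCharsets.US_ASCII);

    static byte[] readStreamData(PushbackInputStream in, int declaredLength) throws IOException {
        if (declaredLength >= 0) {
            // Fast path: trust the length value and read exactly that many bytes.
            byte[] data = new byte[declaredLength];
            int read = readFully(in, data, declaredLength);
            // Probe the following bytes: an optional line break, then 'endstream'.
            byte[] probe = new byte[ENDSTREAM.length + 2];
            int probed = (read == declaredLength) ? readFully(in, probe, probe.length) : 0;
            boolean ok = read == declaredLength && startsWithEndstream(probe, probed);
            in.unread(probe, 0, probed); // the caller still has to consume 'endstream'
            if (ok) {
                return data;
            }
            // Length was wrong: push the data back (throws if the buffer is too small).
            in.unread(data, 0, read);
        }
        // Slow path: the 'old' behavior of scanning for 'endstream'.
        return scanUntilEndstream(in);
    }

    private static int readFully(PushbackInputStream in, byte[] buf, int len) throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, total, len - total);
            if (n < 0) {
                break; // end of input reached before len bytes were read
            }
            total += n;
        }
        return total;
    }

    private static boolean startsWithEndstream(byte[] probe, int len) {
        int i = 0;
        while (i < len && i < 2 && (probe[i] == '\r' || probe[i] == '\n')) {
            i++;
        }
        return len - i >= ENDSTREAM.length
                && Arrays.equals(Arrays.copyOfRange(probe, i, i + ENDSTREAM.length), ENDSTREAM);
    }

    private static byte[] scanUntilEndstream(PushbackInputStream in) throws IOException {
        byte[] buf = new byte[256];
        int size = 0;
        int b;
        while ((b = in.read()) != -1) {
            if (size == buf.length) {
                buf = Arrays.copyOf(buf, size * 2);
            }
            buf[size++] = (byte) b;
            int start = size - ENDSTREAM.length;
            if (start >= 0 && Arrays.equals(Arrays.copyOfRange(buf, start, size), ENDSTREAM)) {
                in.unread(ENDSTREAM); // leave the keyword for the caller
                // Heuristic: drop one line break directly in front of the keyword.
                if (start > 0 && buf[start - 1] == '\n') start--;
                if (start > 0 && buf[start - 1] == '\r') start--;
                return Arrays.copyOf(buf, start);
            }
        }
        throw new IOException("'endstream' not found");
    }

    // Convenience wrapper for trying the sketch on an ASCII string.
    static String parse(String afterStreamKeyword, int declaredLength) {
        byte[] raw = afterStreamKeyword.getBytes(StandardCharsets.US_ASCII);
        try (PushbackInputStream in = new PushbackInputStream(new ByteArrayInputStream(raw), 64 * 1024)) {
            return new String(readStreamData(in, declaredLength), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a correct length the fast path is taken even if the data itself contains 'endstream'; with a length that is too small or too large, the sketch pushes the bytes back and rescans. The scanning path necessarily stops at the first 'endstream' it sees, which is the trade-off the issue description mentions.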
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (PDFBOX-1333) Stream parsing of BaseParser should
fall back to scanning if length value is wrong
Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Timo Boehme updated PDFBOX-1333:
--------------------------------
Attachment: 2012-06-06_BaseParser_streamFallBack.patch
Patch for BaseParser which tests that stream parsing using the length value reaches 'endstream'; if not, the parsed data are pushed back and the stream is parsed again by scanning for 'endstream'.
This patch also increases the push back buffer to 64kB in order to be able to hold larger streams; the size can be modified using the system property org.apache.pdfbox.baseParser.pushBackSize.
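As an illustration of how such a configurable buffer size could be resolved, here is a small sketch. Only the property name is taken from the message above; the class PushBackSizeSketch, its methods, the interpretation of 64kB as 65536 bytes, and the handling of invalid values are assumptions and may differ from what the patch actually does.

```java
import java.io.InputStream;
import java.io.PushbackInputStream;

// Hypothetical sketch; not the code from the attached patch.
class PushBackSizeSketch {
    // Property name as given in the issue comment.
    static final String PUSHBACK_SIZE_PROPERTY = "org.apache.pdfbox.baseParser.pushBackSize";
    // Assumed default: 64kB taken as 64 * 1024 bytes.
    static final int DEFAULT_PUSHBACK_SIZE = 64 * 1024;

    static int resolvePushBackSize() {
        // Integer.getInteger returns the default if the property is unset or not a number.
        int size = Integer.getInteger(PUSHBACK_SIZE_PROPERTY, DEFAULT_PUSHBACK_SIZE);
        // PushbackInputStream rejects sizes <= 0, so fall back to the default (assumption).
        return size > 0 ? size : DEFAULT_PUSHBACK_SIZE;
    }

    static PushbackInputStream wrap(InputStream raw) {
        return new PushbackInputStream(raw, resolvePushBackSize());
    }
}
```

Under these assumptions, the size would be raised at JVM start with a standard system property argument such as -Dorg.apache.pdfbox.baseParser.pushBackSize=131072.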
[jira] [Closed] (PDFBOX-1333) Stream parsing of BaseParser should
fall back to scanning if length value is wrong
Posted by "Timo Boehme (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/PDFBOX-1333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Timo Boehme closed PDFBOX-1333.
-------------------------------
Resolution: Fixed
Fixed by applying the patch in rev. 1346891.