Posted to dev@pdfbox.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/04/10 11:40:00 UTC
[jira] [Comment Edited] (PDFBOX-5161) Content stream parse error that doesn't happen when content stream is parsed alone
[ https://issues.apache.org/jira/browse/PDFBOX-5161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17318477#comment-17318477 ]
Tilman Hausherr edited comment on PDFBOX-5161 at 4/10/21, 11:39 AM:
--------------------------------------------------------------------
There is one difference: when it works, the input is a RandomAccessReadBuffer, and when it doesn't, it is a SequenceRandomAccessRead.
In SequenceRandomAccessRead.read() there is this code:
{code}
int maxAvailBytes = Math.min(available(), length);
if (maxAvailBytes == 0)
{
    return -1;
}
{code}
That branch is reached long before EOF.
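To make the symptom concrete, here is a toy sketch (not PDFBox code) of one way a check like the one above could report end-of-data early in a reader that chains several sources. It assumes, purely for illustration, that the available count reflects only the current source rather than the whole sequence; whether that is what happens in SequenceRandomAccessRead is not established here. ToySequenceReader, its currentSegmentOnly flag and countBytes are hypothetical names.

```java
final class SequenceReadSketch
{
    /** Toy reader over several byte arrays; a hypothetical stand-in, not the PDFBox class. */
    static final class ToySequenceReader
    {
        private final byte[][] segments;
        private final boolean currentSegmentOnly; // hypothetical switch to show the failure mode
        private int seg = 0; // index of the current segment
        private int pos = 0; // read position inside the current segment

        ToySequenceReader(boolean currentSegmentOnly, byte[]... segments)
        {
            this.currentSegmentOnly = currentSegmentOnly;
            this.segments = segments;
        }

        private int availableInCurrent()
        {
            return seg < segments.length ? segments[seg].length - pos : 0;
        }

        private int availableTotal()
        {
            int total = availableInCurrent();
            for (int i = seg + 1; i < segments.length; i++)
            {
                total += segments[i].length;
            }
            return total;
        }

        int read(byte[] b, int offset, int length)
        {
            int available = currentSegmentOnly ? availableInCurrent() : availableTotal();
            int maxAvailBytes = Math.min(available, length);
            if (maxAvailBytes == 0)
            {
                // Same shape as the check quoted above: zero available means "no more data".
                return -1;
            }
            // Move past exhausted segments; a non-empty one exists because available > 0.
            while (availableInCurrent() == 0)
            {
                seg++;
                pos = 0;
            }
            int n = Math.min(availableInCurrent(), maxAvailBytes);
            System.arraycopy(segments[seg], pos, b, offset, n);
            pos += n;
            return n;
        }
    }

    /** Reads until -1 is returned and reports how many bytes were delivered. */
    static int countBytes(ToySequenceReader reader)
    {
        byte[] buffer = new byte[8];
        int total = 0;
        int n;
        while ((n = reader.read(buffer, 0, buffer.length)) != -1)
        {
            total += n;
        }
        return total;
    }

    public static void main(String[] args)
    {
        byte[] first = { 1, 2 };
        byte[] second = { 3, 4 };
        System.out.println("current segment only: "
                + countBytes(new ToySequenceReader(true, first, second)));
        System.out.println("whole sequence: "
                + countBytes(new ToySequenceReader(false, first, second)));
    }
}
```

In the toy, counting only the current segment makes read() return -1 at the first segment boundary, so a caller that treats -1 as EOF stops with data still unread. A parser fed such a truncated view could plausibly fail partway through a token, which would be consistent with the error only appearing when the input is a sequence of streams.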
> Content stream parse error that doesn't happen when content stream is parsed alone
> ----------------------------------------------------------------------------------
>
> Key: PDFBOX-5161
> URL: https://issues.apache.org/jira/browse/PDFBOX-5161
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 3.0.0 PDFBox
> Reporter: Tilman Hausherr
> Priority: Major
> Labels: regression
> Fix For: 3.0.0 PDFBox
>
> Attachments: 179212.pdf, cs.txt
>
>
> {noformat}
> java.io.IOException: Unknown dir object c=')' cInt=41 peek=')' peekInt=41 at offset 12287
> org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:865)
> org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:634)
> org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:130)
> {noformat}
> This code doesn't reproduce the problem:
> {code}
> byte[] bytes = Files.readAllBytes(Paths.get("cs.txt"));
> PDFStreamParser parser = new PDFStreamParser(bytes);
> parser.parse();
> {code}