Posted to dev@pdfbox.apache.org by "lebouvier (Jira)" <ji...@apache.org> on 2022/07/08 16:06:00 UTC
[jira] [Created] (PDFBOX-5476) "Error: Expected operator 'ID' actual='In' at stream offset 142897 []" error occurs in some PDFs
lebouvier created PDFBOX-5476:
---------------------------------
Summary: "Error: Expected operator 'ID' actual='In' at stream offset 142897 []" error occurs in some PDFs
Key: PDFBOX-5476
URL: https://issues.apache.org/jira/browse/PDFBOX-5476
Project: PDFBox
Issue Type: Bug
Reporter: lebouvier
Hi,
When we process some uploaded PDFs, we get the following error: "Error: Expected operator 'ID' actual='In' at stream offset 142897 []"
We use PDFBox *2.0.25* and also tried *2.0.26*. With both versions, rendering works for some PDFs but fails for others.
+Code :+
public static boolean extractFirstPdfPageAsImageJPEG(final File sourcePdf, final File resultImg,
        final Integer maxWidth, final Integer maxHeight) {
    try (final PDDocument document = PDDocument.load(sourcePdf)) {
        // render the first page (index 0) at 100 DPI; this is the call that fails
        final PDFRenderer pdfRenderer = new PDFRenderer(document);
        final BufferedImage extractedImage = pdfRenderer.renderImageWithDPI(0, 100, ImageType.RGB);
        final int originalHeight = extractedImage.getHeight();
        final int originalWidth = extractedImage.getWidth();
        // shrink to fit within maxWidth x maxHeight, keeping the aspect ratio
        int scaledHeight = originalHeight;
        int scaledWidth = originalWidth;
        if (originalWidth > maxWidth) {
            scaledWidth = maxWidth;
            scaledHeight = scaledWidth * originalHeight / originalWidth;
            if (scaledHeight > maxHeight) {
                scaledHeight = maxHeight;
                scaledWidth = scaledHeight * originalWidth / originalHeight;
            }
        } else if (originalHeight > maxHeight) {
            scaledHeight = maxHeight;
            scaledWidth = scaledHeight * originalWidth / originalHeight;
        }
        // creates output image
        final BufferedImage resizedImage = new BufferedImage(scaledWidth, scaledHeight, extractedImage.getType());
        final Graphics2D g2d = resizedImage.createGraphics();
        g2d.drawImage(extractedImage, 0, 0, scaledWidth, scaledHeight, null);
        g2d.dispose();
        ImageIO.write(resizedImage, "JPEG", resultImg);
        return true;
    } catch (final IOException e) {
        LOG.error(e.getMessage(), e);
        return false;
    }
}
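For reference, the scaling arithmetic in the method above can be isolated from PDFBox as a small pure-Java sketch. The class name ScaleToFit and the method fit are hypothetical names used only for illustration. The clamp to a minimum of 1 pixel is an addition that the method above does not have; it is there because integer division can produce 0 for very elongated pages, and BufferedImage rejects a zero width or height.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of PDFBox or of the method above.
final class ScaleToFit {
    private ScaleToFit() {}

    /**
     * Returns {width, height} shrunk to fit inside maxWidth x maxHeight,
     * keeping the aspect ratio (integer arithmetic). Never upscales.
     */
    static int[] fit(int width, int height, int maxWidth, int maxHeight) {
        int scaledWidth = width;
        int scaledHeight = height;
        if (width > maxWidth) {
            // too wide: bound the width first, then re-check the height
            scaledWidth = maxWidth;
            scaledHeight = scaledWidth * height / width;
            if (scaledHeight > maxHeight) {
                scaledHeight = maxHeight;
                scaledWidth = scaledHeight * width / height;
            }
        } else if (height > maxHeight) {
            // only too tall: bound the height
            scaledHeight = maxHeight;
            scaledWidth = scaledHeight * width / height;
        }
        // Assumption added here: clamp so that neither dimension becomes 0.
        return new int[] {Math.max(1, scaledWidth), Math.max(1, scaledHeight)};
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(fit(2000, 1000, 800, 600))); // [800, 400]
        System.out.println(Arrays.toString(fit(1000, 2000, 800, 600))); // [300, 600]
        System.out.println(Arrays.toString(fit(400, 300, 800, 600)));   // [400, 300]
    }
}
```

This part runs only after rendering has succeeded, so it is not involved in the error reported here.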
The call to *pdfRenderer.renderImageWithDPI(0, 100, ImageType.RGB)* throws the following error:
Error: Expected operator 'ID' actual='In' at stream offset 142897 []
java.io.IOException: Error: Expected operator 'ID' actual='In' at stream offset 142897
at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:280) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:521) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:492) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:155) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:282) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:355) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:272) ~[pdfbox-2.0.25.jar:2.0.25]
at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:258) ~[pdfbox-2.0.25.jar:2.0.25]
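To help narrow this down, the bytes around the reported offset could be inspected. The following is a minimal sketch, assuming the decoded content stream of the page is already available as a byte array (for example, read from the InputStream returned by PDPage.getContents(); that step is not shown) and assuming the offset in the message is relative to that decoded stream. StreamExcerpt and excerptAround are hypothetical names used only for illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper; not part of PDFBox.
final class StreamExcerpt {
    private StreamExcerpt() {}

    /**
     * Returns the bytes in [offset - radius, offset + radius) as text,
     * replacing anything outside printable ASCII with '.'.
     * Out-of-range bounds are clipped to the array.
     */
    static String excerptAround(byte[] data, int offset, int radius) {
        int from = Math.max(0, offset - radius);
        int to = Math.min(data.length, offset + radius);
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            int b = data[i] & 0xFF;
            sb.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Made-up sample content stream, only to show the call.
        byte[] sample = "q 1 0 0 1 0 0 cm\nBI /W 1 /H 1 ID x EI\nQ"
                .getBytes(StandardCharsets.US_ASCII);
        System.out.println(excerptAround(sample, 20, 8));
    }
}
```

For the failing file, a call such as excerptAround(bytes, 142897, 40) would show which token starting with "In" the parser ran into.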
Is this a known bug? Do you know when it will be fixed?
Thanks a lot,
Regards.