Posted to dev@pdfbox.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2022/05/14 02:31:00 UTC

[jira] [Updated] (PDFBOX-5433) PDFStreamEngine creating new operators that do not exist in document

     [ https://issues.apache.org/jira/browse/PDFBOX-5433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tilman Hausherr updated PDFBOX-5433:
------------------------------------
    Attachment: screenshot-1.png

> PDFStreamEngine creating new operators that do not exist in document
> --------------------------------------------------------------------
>
>                 Key: PDFBOX-5433
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5433
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Mike Cantrell
>            Priority: Major
>         Attachments: pdfbox-stream-engine-operators.zip, screenshot-1.png
>
>
> We're using PDFStreamEngine to analyze and filter (optimize) the document's content streams. I've found an odd case where a form gives us extra (unwanted) operators that don't exist in the original stream.
> According to the PDFDebugger, the form's stream has the following contents:
>  
> {code:java}
> 0 TL
> q
>   BT
>     1 0 0 rg
>     0 i
>     /TT0 20 Tf
>     0 Tc
>     0 Tw
>     0 Ts
>     100 Tz
>     0 Tr
>     0 -15.791 TD
>     (HOODHD035236) Tj
>   ET
> Q{code}
> I created a debug utility to print the operators that the PDFStreamEngine passes to processOperator:
> {code:java}
> @Getter
> static class StreamDebugger extends PDFStreamEngine {
>     String formName;
>     Operator operator;
>     List<COSBase> operands;
>     int operatorCount;
>     public StreamDebugger() {
>         addOperator(new BeginText());
>         addOperator(new Concatenate());
>         addOperator(new DrawObject()); // special text version
>         addOperator(new EndText());
>         addOperator(new SetGraphicsStateParameters());
>         addOperator(new Save());
>         addOperator(new Restore());
>         addOperator(new NextLine());
>         addOperator(new SetCharSpacing());
>         addOperator(new MoveText());
>         addOperator(new MoveTextSetLeading());
>         addOperator(new SetFontAndSize());
>         addOperator(new ShowText());
>         addOperator(new ShowTextAdjusted());
>         addOperator(new SetTextLeading());
>         addOperator(new SetMatrix());
>         addOperator(new SetTextRenderingMode());
>         addOperator(new SetTextRise());
>         addOperator(new SetWordSpacing());
>         addOperator(new SetTextHorizontalScaling());
>         addOperator(new ShowTextLine());
>         addOperator(new ShowTextLineAndSpace());
>     }
>     @Override
>     public void showForm(PDFormXObject form) throws IOException {
>         this.formName = ((COSName) operands.get(0)).getName();
>         super.showForm(form);
>         this.formName = null;
>     }
>     @Override
>     protected void processOperator(Operator operator, List<COSBase> operands) throws IOException {
>         this.operator = operator;
>         this.operands = operands;
>         if (Objects.equals(this.formName, "Fm0")) {
>             this.operatorCount++;
>             System.out.printf("%s:%s%n", operator.getName(), operands.toString());
>         }
>         super.processOperator(operator, operands);
>     }
> } {code}
> The resulting output:
> {code:java}
> TL:[COSInt{0}]
> q:[]
> BT:[]
> rg:[COSInt{1}, COSInt{0}, COSInt{0}]
> i:[COSInt{0}]
> Tf:[COSName{TT0}, COSInt{20}]
> Tc:[COSInt{0}]
> Tw:[COSInt{0}]
> Ts:[COSInt{0}]
> Tz:[COSInt{100}]
> Tr:[COSInt{0}]
> TD:[COSInt{0}, COSFloat{-15.791}]
> TL:[COSFloat{15.791}]
> Td:[COSInt{0}, COSFloat{-15.791}]
> Tj:[COSString{HOODHD035236}]
> ET:[]
> Q:[] {code}
> These operators do not exist in the original stream:
> {code:java}
> TL:[COSFloat{15.791}]
> Td:[COSInt{0}, COSFloat{-15.791}]{code}
> If you rewrite the stream using the operators reported by the engine, the resulting PDF has display issues.
> I'm attaching a test case which demonstrates the issue. 
>  
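A note on the two extra operators: the PDF specification defines "tx ty TD" as having the same effect as "-ty TL" followed by "tx ty Td", which matches the reported TL:[15.791] and Td:[0, -15.791] exactly. The output above is therefore consistent with the TD handler re-dispatching those two operators through processOperator, although that is an inference from the log and not something verified here against the attached test case. The sketch below is a toy model, not PDFBox code: ToyEngine, Op, depth, runOriginal and replay are all hypothetical names. It shows why writing back every reported operator would move the text line twice, and how tracking the dispatch depth can separate stream operators from derived ones.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only: this is not PDFBox code, and all names here are hypothetical. */
final class TdExpansionSketch {

    /** One dispatched operator plus the nesting depth at which it was dispatched. */
    static final class Op {
        final String name;
        final double[] operands;
        final int depth; // 0 = read from the stream, >0 = dispatched by another handler

        Op(String name, double[] operands, int depth) {
            this.name = name;
            this.operands = operands;
            this.depth = depth;
        }
    }

    /** Minimal engine whose TD handler re-dispatches TL and Td through processOperator. */
    static final class ToyEngine {
        final List<Op> seen = new ArrayList<>();
        double leading = 0.0; // current text leading
        double lineY = 0.0;   // accumulated vertical offset of the text line
        private int depth = 0;

        void processOperator(String name, double... operands) {
            seen.add(new Op(name, operands, depth));
            depth++;
            try {
                switch (name) {
                    case "TL":
                        leading = operands[0];
                        break;
                    case "Td":
                        lineY += operands[1];
                        break;
                    case "TD":
                        // PDF spec: "tx ty TD" has the same effect as "-ty TL" then "tx ty Td".
                        processOperator("TL", -operands[1]);
                        processOperator("Td", operands[0], operands[1]);
                        break;
                    default:
                        break; // every other operator is ignored in this sketch
                }
            } finally {
                depth--;
            }
        }
    }

    /** Feeds the two positioning-related operators from the form's stream to a fresh engine. */
    static ToyEngine runOriginal() {
        ToyEngine engine = new ToyEngine();
        engine.processOperator("TL", 0.0);
        engine.processOperator("TD", 0.0, -15.791);
        return engine;
    }

    /** Replays recorded operators into a fresh engine, skipping those deeper than maxDepth. */
    static ToyEngine replay(List<Op> ops, int maxDepth) {
        ToyEngine engine = new ToyEngine();
        for (Op op : ops) {
            if (op.depth <= maxDepth) {
                engine.processOperator(op.name, op.operands);
            }
        }
        return engine;
    }

    public static void main(String[] args) {
        ToyEngine original = runOriginal();
        for (Op op : original.seen) {
            System.out.println("depth " + op.depth + ": " + op.name);
        }
        // Replaying everything applies the Td offset twice; depth 0 only matches the original.
        System.out.println("original lineY = " + original.lineY);
        System.out.println("naive replay   = " + replay(original.seen, Integer.MAX_VALUE).lineY);
        System.out.println("depth-0 replay = " + replay(original.seen, 0).lineY);
    }
}
```

If the real engine behaves like this model, a similar depth counter kept inside the processOperator override (incremented before super.processOperator and decremented in a finally block) might be enough to drop the derived operators when rewriting the stream. That is an untested suggestion.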



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
