Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/07/12 20:47:04 UTC

[jira] [Commented] (PDFBOX-2205) (Graphics) Operator Refactoring

    [ https://issues.apache.org/jira/browse/PDFBOX-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14059880#comment-14059880 ] 

John Hewson commented on PDFBOX-2205:
-------------------------------------

Ok, I committed this in [r1610021|http://svn.apache.org/r1610021]. I introduced a new class {{PDFGraphicsStreamEngine}} which the old "operators.pagedrawer" operators now direct all their callbacks to. The existing {{PageDrawer}} is now a subclass of this class. The "operators.pagedrawer" package is now "operators.graphics".

{{PDFGraphicsStreamEngine}} contains many new graphics callbacks which previously were routed directly to {{PageDrawer}}, such as {{moveTo}}, {{lineTo}}, {{curveTo}}, etc. I've also encapsulated the GeneralPath and TransparencyGroup objects which PageDrawer was making use of. Overall I find that the code is much easier to read and to reason about: a nice side-effect.
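
As a rough illustration of the routing described above, here is a minimal, self-contained sketch. Every type in it ({{Engine}}, {{Operator}}, {{RecordingEngine}}) is a hypothetical stand-in and not the actual PDFBox API; only the operator names "m", "l" and "c" (the PDF moveto, lineto and curveto path operators) come from the PDF format itself. The operators only unpack their operands and forward them to callbacks on the engine, so any subclass can decide what to do with them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-ins: none of these types are PDFBox classes.
class OperatorRoutingSketch {

    /** An operator only unpacks operands and forwards them to the engine. */
    interface Operator {
        void process(Engine engine, float[] operands);
    }

    /** Declares the graphics callbacks; knows nothing about how they are handled. */
    static abstract class Engine {
        private final Map<String, Operator> operators = new HashMap<>();

        Engine() {
            // "m", "l" and "c" are the PDF path operators moveto, lineto and curveto.
            operators.put("m", (e, a) -> e.moveTo(a[0], a[1]));
            operators.put("l", (e, a) -> e.lineTo(a[0], a[1]));
            operators.put("c", (e, a) -> e.curveTo(a[0], a[1], a[2], a[3], a[4], a[5]));
        }

        /** Looks up the operator by name and lets it call back into this engine. */
        void processOperator(String name, float... operands) {
            Operator op = operators.get(name);
            if (op != null) {
                op.process(this, operands);
            }
        }

        abstract void moveTo(float x, float y);
        abstract void lineTo(float x, float y);
        abstract void curveTo(float x1, float y1, float x2, float y2, float x3, float y3);
    }

    /** One possible handler: records the calls instead of rendering them. */
    static class RecordingEngine extends Engine {
        final List<String> calls = new ArrayList<>();

        @Override void moveTo(float x, float y) { calls.add("moveTo " + x + " " + y); }
        @Override void lineTo(float x, float y) { calls.add("lineTo " + x + " " + y); }
        @Override void curveTo(float x1, float y1, float x2, float y2, float x3, float y3) {
            calls.add("curveTo " + x3 + " " + y3); // records only the end point
        }
    }

    public static void main(String[] args) {
        RecordingEngine engine = new RecordingEngine();
        engine.processOperator("m", 10f, 20f);
        engine.processOperator("l", 30f, 40f);
        System.out.println(engine.calls);
    }
}
```

In this sketch a renderer would be just another {{Engine}} subclass, which mirrors the relationship between {{PageDrawer}} and {{PDFGraphicsStreamEngine}} described above.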

I took the opportunity to do some general cleanup of the operator classes, including fixing the JavaDoc, along with some repackaging.

---

It's probably worth emphasising that now that we have PDFGraphicsStreamEngine and PDFTextStreamEngine, subclassing of operators is basically deprecated. It isn't formally, but most users should subclass one of these two classes in most cases where they previously overrode operators. One reason why overriding operators is so undesirable is that it effectively messes with PDFBox's internals: it breaks encapsulation and forces users to copy and paste code from PDFBox into their own code when super.process() isn't flexible enough for them.
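
To make the recommendation concrete, here is a minimal sketch of hooking a callback by subclassing an engine rather than an operator. All class names are hypothetical stand-ins and not PDFBox types: {{LoggingRenderer}} plays the role of a renderer such as PageDrawer, and {{SegmentCounter}} plays the role of user code. The user overrides a single callback and calls super to keep the inherited behaviour, without touching any operator class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins: none of these types are PDFBox classes.
class EngineSubclassSketch {

    /** Base engine: the default callbacks do nothing. */
    static class BaseEngine {
        void moveTo(float x, float y) { }
        void lineTo(float x, float y) { }
    }

    /** Stand-in for a renderer: it "draws" by appending to a log. */
    static class LoggingRenderer extends BaseEngine {
        final List<String> drawn = new ArrayList<>();

        @Override void moveTo(float x, float y) { drawn.add("start " + x + "," + y); }
        @Override void lineTo(float x, float y) { drawn.add("line " + x + "," + y); }
    }

    /** User code: hooks one callback and keeps the inherited drawing via super. */
    static class SegmentCounter extends LoggingRenderer {
        int segments;

        @Override void lineTo(float x, float y) {
            segments++;          // the user's own bookkeeping
            super.lineTo(x, y);  // the renderer's behaviour is preserved
        }
    }

    public static void main(String[] args) {
        SegmentCounter counter = new SegmentCounter();
        counter.moveTo(0f, 0f);
        counter.lineTo(5f, 0f);
        counter.lineTo(5f, 5f);
        System.out.println(counter.segments + " segments, " + counter.drawn.size() + " drawing calls");
    }
}
```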

---

I've noticed that I'm getting very slightly different renderings due to some ints no longer being truncated inside the operator classes. Apart from that, we shouldn't be seeing any rendering changes, so please let me know about any regressions. Likewise, any feedback is more than welcome.

> (Graphics) Operator Refactoring
> -------------------------------
>
>                 Key: PDFBOX-2205
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2205
>             Project: PDFBox
>          Issue Type: Improvement
>          Components: PDModel, Rendering
>    Affects Versions: 2.0.0
>            Reporter: John Hewson
>             Fix For: 2.0.0
>
>
> I'm in the process of porting a fairly complex program which uses the 1.8 API over to 2.0, as a way of finding out where the rough edges in 2.0 are. The app which I'm porting hooks into many of the graphics operators and subclasses PageDrawer to get access to the PDF's graphics state.
> It turns out that this doesn't work very well, especially in 2.0 where more of the PageDrawer's state is private and we have the additional complexity of transparency groups.
> The main issue is that the graphics operators are coupled to PageDrawer, but I'm not interested in the AWT rendering, I just need a way to hook into the graphics operations - subclassing the operators has proven to be a poor solution as there are cases where calling super.process() doesn't provide enough flexibility.
> So here's my solution: in the same way that text processing was recently factored-out into PDFTextStreamEngine for end-users to subclass, I'd like to do the same with graphics operations. Instead of the graphics operators being coupled to PageDrawer, which is only one possible implementation of graphics handling, we can move the methods which the operators call up into a new subclass of PDFStreamEngine, let's call it PDFGraphicsStreamEngine. This class can then be subclassed by anyone interested in hooking into the graphics operations, including PageDrawer.
> With the new callbacks for text handling already in PDFTextStreamEngine and the addition of new graphics callbacks in PDFGraphicsStreamEngine, most of the time end-users shouldn't need to override the operator classes to get access to the information they need, which would be a huge benefit :)
> This will involve a bunch of changes to operators, so I'll take the chance to do some general cleaning up while I'm at it: the operator classes haven't received much attention for a while. With more callbacks in PDFStreamEngine et al, we're moving towards a point where the operator classes are becoming almost an internal part of the PDFBox API: might be something to think about for the future.


