Posted to fop-dev@xmlgraphics.apache.org by Jeremias Maerki <de...@greenmail.ch> on 2004/08/15 16:55:52 UTC

PDF Transcoder: Revisiting transcoder setup

Thomas (and all),

I'm currently tracking down differences between the PDF and PS
transcoders. The following thread triggered that:
http://nagoya.apache.org/eyebrowse/BrowseList?listName=fop-user@xml.apache.org&by=thread&from=825221

There's an SVG attached to the first post. It's an SVG with width and
height specified but without a viewBox. In the EPS transcoder the file
showed correctly at first, at least until I changed the resolution from
96dpi to 300dpi. The PDF transcoder got the whole thing wrong because
it's now working at a fixed 72dpi as per your patch:
http://marc.theaimsgroup.com/?l=fop-dev&m=106795227230411&w=2

I didn't question that patch back then, but I'm now curious why you
think it is necessary to use a fixed resolution. You say it's the
default user space of PDFGraphics2D. I'd say there is no resolution in
PDF, only the resolution at which some SVG constructs will be rendered
as bitmaps when they cannot (yet) be expressed natively. In the EPS
transcoder I've managed to make it work at every resolution in the
meantime (fixes not committed yet) by generating the right initial
transforms. I believe this can be applied to PDF, too.

The limitation to 72dpi in the PDF transcoder has the undesired side
effect of outputting embedded images at very low quality. Removing it
improved quality a lot here.

But I still have one problem with the above SVG file without viewBox. In
the PDF transcoder the image is too big. If I manually put a
viewBox="0 0 533 266" into the SVG file then it comes out correctly in PDF.
Obviously, the SVG file is made for a 96dpi environment, which explains
the behaviour. A solution to this problem would be if I could ask Batik
to give me the effective (outermost) viewBox even if none is specified.
Does something like that exist? I didn't find anything.

Sorry for the long post. I'm not sure I understood the whole thing
completely and I'm hoping you (or someone else) might have some
additional ideas. Thanks!

Jeremias Maerki


Re: PDF Transcoder: Revisiting transcoder setup

Posted by Thomas DeWeese <Th...@Kodak.com>.
Hi Jeremias,

    Thanks!

    I have a new small patch for FOP that tells Batik not
to rasterize clipped components (I'll submit it over on fop-dev).
This is of course one of those tricky 'bootstrapping' issues, since
you can't apply the patch until after you integrate a new Batik... :/

Jeremias Maerki wrote:

> PDFObject.formatDateTime() is now compatible with JDK 1.3 again.
> 
> On 16.08.2004 22:26:47 Thomas DeWeese wrote:
> 
>>>>   1) It looks like a JDK 1.4 dependency has slipped in:
>>>>java.lang.IllegalArgumentException: Illegal pattern character 'Z'
>>>
>>>Yes. I will see to that. FOP officially still targets JDK 1.3.x, too,
>>>although some would like that requirement dropped. In that case I will
>>>see to it that the Batik-relevant parts remain 1.3-compatible for the
>>>time being.


---------------------------------------------------------------------
To unsubscribe, e-mail: batik-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: batik-dev-help@xml.apache.org


Re: PDF Transcoder: Revisiting transcoder setup

Posted by Jeremias Maerki <de...@greenmail.ch>.
PDFObject.formatDateTime() is now compatible with JDK 1.3 again.
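
For illustration only, here is a minimal sketch of a date formatter that
avoids the 'Z' pattern letter, which SimpleDateFormat only understands
from JDK 1.4 on. This is not the actual change made to
PDFObject.formatDateTime(); the class and method names are hypothetical,
and the D:YYYYMMDDHHmmSS+HH'mm' layout is assumed from the usual PDF date
string convention. The time zone offset is read from a Calendar, which
should be available on JDK 1.3.

```java
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not part of FOP.
class PdfDateSketch {

    /** Formats a date as a PDF-style date string without using 'Z'. */
    static String formatPdfDate(Date date, TimeZone tz) {
        // Only pattern letters that pre-1.4 JDKs understand.
        SimpleDateFormat df =
                new SimpleDateFormat("'D:'yyyyMMddHHmmss", Locale.ENGLISH);
        df.setTimeZone(tz);

        // Compute the UTC offset (including DST) by hand.
        Calendar cal = Calendar.getInstance(tz);
        cal.setTime(date);
        int offsetMinutes = (cal.get(Calendar.ZONE_OFFSET)
                + cal.get(Calendar.DST_OFFSET)) / 60000;

        StringBuffer sb = new StringBuffer(df.format(date));
        if (offsetMinutes == 0) {
            sb.append('Z'); // literal 'Z' means UTC in a PDF date
        } else {
            sb.append(offsetMinutes < 0 ? '-' : '+');
            int abs = Math.abs(offsetMinutes);
            appendTwoDigits(sb, abs / 60);
            sb.append('\'');
            appendTwoDigits(sb, abs % 60);
            sb.append('\'');
        }
        return sb.toString();
    }

    private static void appendTwoDigits(StringBuffer sb, int value) {
        if (value < 10) {
            sb.append('0');
        }
        sb.append(value);
    }
}
```

StringBuffer and manual zero padding are used on purpose, since
StringBuilder and String.format() arrived in later JDKs.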

On 16.08.2004 22:26:47 Thomas DeWeese wrote:
> >>    1) It looks like a JDK 1.4 dependency has slipped in:
> >>java.lang.IllegalArgumentException: Illegal pattern character 'Z'
> > 
> > Yes. I will see to that. FOP officially still targets JDK 1.3.x, too,
> > although some would like that requirement dropped. In that case I will
> > see to it that the Batik-relevant parts remain 1.3-compatible for the
> > time being.
> 
>     Thanks!
> 


Jeremias Maerki




Re: PDF Transcoder: Revisiting transcoder setup

Posted by Thomas DeWeese <Th...@Kodak.com>.
Jeremias Maerki wrote:

>>>The limitation to 72dpi in the PDF transcoder has the undesired side
>>>effect of outputting embedded images in a very low quality. Removing it
>>>improved quality a lot here.
>>
>>     Yah, my original thought was to have the PDFGraphics2D rasterize
>>at something like 150-300dpi (configurable of course) but provide an
>>initial transform of 1 unit = 72 dpi.
> 
> see above.

   I actually believe that this would help other user communities.
Unlike Batik/SVG, where we are happy to work with a Graphics2D where
the default mapping is 150 or 300dpi, most applications will just
wig out (they assume the Graphics is ~72-96dpi).  For them to generate
high resolution images the PDF Graphics needs to internalize this
transform.

   So the idea is that one unit maps to 1/72 or 1/96th of an inch,
but if you ask the Graphics2D for its transform it would return
something like a scale by 4 (i.e. the device has 4 pixels for each
user space unit).  The tricky bit is that it then internally needs to
remap objects drawn in this artificial 300dpi coordinate system to the
PDF 72dpi coordinate system (essentially it contains a second
'hidden' transform that maps the 'device' space to PDF space).

   This system will "trick" well-written applications into rasterizing
content at 300dpi without having to be set up to scale all other
drawing operations.  This is what the current Print Graphics instances
do.  The scale factor should of course be configurable.
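
The two-transform idea described above can be sketched with plain
java.awt.geom classes. This is only an illustration under stated
assumptions: the class and method names are hypothetical, nothing here
is actual PDFGraphics2D code, and the user space is assumed to be 1/72
inch per unit.

```java
import java.awt.geom.AffineTransform;
import java.awt.geom.Point2D;

// Hypothetical illustration, not part of FOP or Batik.
class HiddenTransformSketch {

    // Units per inch in default PDF user space.
    static final double PDF_UNITS_PER_INCH = 72.0;

    /** Transform reported to callers: device pixels per user-space unit. */
    static AffineTransform reportedTransform(double deviceDpi) {
        double scale = deviceDpi / PDF_UNITS_PER_INCH;
        return AffineTransform.getScaleInstance(scale, scale);
    }

    /** Hidden transform mapping the artificial device space back to PDF space. */
    static AffineTransform hiddenTransform(double deviceDpi) {
        double scale = PDF_UNITS_PER_INCH / deviceDpi;
        return AffineTransform.getScaleInstance(scale, scale);
    }

    /** Follows a user-space point through both transforms. */
    static Point2D toPdfSpace(Point2D userPoint, double deviceDpi) {
        Point2D devicePoint =
                reportedTransform(deviceDpi).transform(userPoint, null);
        return hiddenTransform(deviceDpi).transform(devicePoint, null);
    }
}
```

With a configured resolution of 288dpi the reported scale is exactly 4;
at 300dpi it is about 4.17. In both cases a point ends up at the same
coordinates in PDF space, while an application that inspects the
transform sees a high-resolution device.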

>>    To answer your question in this case you can ask the root GVT node
>>for its bounds (getBounds()).  In the case of a dynamic document you
>>can use getBBox() on the DOM nodes (but that generally doesn't apply 
>>here). This is the true geometric bounds of the document but often
>>this isn't really what people want - which is why it isn't a good
>>idea to just use it anyways.
>>
>>    I agree that the document is in error.  It is assuming that all
>>user agents will use 96 pixels to the inch for real world conversion.
>>This is a bad assumption; in SVG you must include a viewBox if you
>>want to ensure your content will show properly in the user agent's
>>window.
> 
> Ok, but that means we could (!) provide an optional convenience mechanism
> for the user to have his faulty SVG display correctly if he can't
> (technically) fix his SVG. I'll try that.

    You could add a 'fit to page' type functionality.  Personally, I
would rather encourage people to always include a viewBox attribute;
for almost all documents it is likely an error/oversight that no
viewBox is included.

>>    1) It looks like a JDK 1.4 dependency has slipped in:
>>java.lang.IllegalArgumentException: Illegal pattern character 'Z'
> 
> Yes. I will see to that. FOP officially still targets JDK 1.3.x, too,
> although some would like that requirement dropped. In that case I will
> see to it that the Batik-relevant parts remain 1.3-compatible for the
> time being.

    Thanks!

>>    2) I was going to commit a patch that adds support for anti-aliased
>>       clipping (for shape-rendering="geometricPrecision").  However
>>       this may result in a lot more content being rasterized.  This
>>       isn't a big deal within Batik but for transcoding this loses a
>>       lot of semantic value.  So I will likely add a method to the
>>       UserAgent to check if the user agent wants it to use
>>       anti-aliased clipping or not.
>>
>>       Does this sound like an acceptable solution?
> 
> 
> Sounds good to me.

    Ok, I'll work on this. The more I consider it, the more I suspect that a
custom RenderingHint may work better.
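
As a rough illustration of what a custom rendering hint can look like
with the standard java.awt.RenderingHints API: the key name, its integer
id and the helper method below are all hypothetical and do not describe
any existing Batik or FOP hint.

```java
import java.awt.RenderingHints;

// Hypothetical illustration, not an actual Batik or FOP API.
class ClipHintSketch {

    /** A hint key that only accepts Boolean values. */
    static final class BooleanKey extends RenderingHints.Key {
        BooleanKey(int privateKey) {
            super(privateKey);
        }

        public boolean isCompatibleValue(Object value) {
            return value instanceof Boolean;
        }
    }

    // Hypothetical key: TRUE asks the renderer not to rasterize clips.
    static final RenderingHints.Key KEY_AVOID_RASTERIZED_CLIP =
            new BooleanKey(1);

    /** Returns true unless the hint explicitly forbids rasterized clipping. */
    static boolean mayRasterizeClip(RenderingHints hints) {
        return !Boolean.TRUE.equals(hints.get(KEY_AVOID_RASTERIZED_CLIP));
    }
}
```

A Graphics2D implementation aimed at transcoding could set such a hint
on itself, and the renderer would check it before choosing anti-aliased
clipping. Compared to a new UserAgent method, the hint travels with the
Graphics2D and needs no interface change.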





Re: PDF Transcoder: Revisiting transcoder setup

Posted by Jeremias Maerki <de...@greenmail.ch>.
(comments inline)

On 15.08.2004 22:53:16 Thomas DeWeese wrote:
> Jeremias Maerki wrote:
> > Thomas (and all),
> > 
> > I'm currently tracking down differences between the PDF and PS
> > transcoders. The following thread triggered that:
> > http://nagoya.apache.org/eyebrowse/BrowseList?listName=fop-user@xml.apache.org&by=thread&from=825221
> > 
> > There's an SVG attached to the first post. It's an SVG with width and
> > height specified but without a viewBox. In the EPS transcoder the file
> > showed correctly, at first. At least until I changed the resolution from
> > 96dpi to 300dpi. The PDF transcoder did the whole thing wrong because
> > it's now working at a fixed 72dpi as per your patch:
> > http://marc.theaimsgroup.com/?l=fop-dev&m=106795227230411&w=2
> > 
> > I didn't question that patch back then, but I'm now curious why you
> > think it is necessary to use a fixed resolution. You say it's the
> > default user space of PDFGraphics2D. I'd say there is no resolution in
> > PDF, only the resolution at which some SVG constructs will be rendered
> > as bitmaps when they cannot (yet) be expressed natively. 
> 
>     This is almost right.  The problem is that the PDF transcoder
> starts with one userspace unit equal to one 'pt' (1/72nd of an inch).
> Without my patch the generated PDF files for documents that specified
> a size using a real world unit were off by 96/72.  I suspect that your
> EPS files will have similar issues.

Right. Default userspace unit in PostScript is 1pt. PDF and PostScript
are similar in many aspects.

> > In the EPS transcoder I've managed to make it work at every resolution 
> > in the meantime (fixes not committed, yet) by generating the right initial
> > transforms. I believe this can be applied to PDF, too.
> 
>     Yes, I suspect that we could fix this by adding a scale by
> 72/getPixelUnitToMillimeter() before rendering the document (My
> original thought was to have the PDFGraphics do much of this work,
> which might have some advantages but would be a lot more work).

I'm not sure what you mean here, but if it made the behaviour of
PDFGraphics too SVG/Batik-specific I wouldn't be too happy. I'd
like to use it for different purposes, too: Java Printing System, custom
printables, a FOP renderer that uses Java2D to render to PDF.... I'm
pretty sure the problems can be handled using the right setup code.

> > The limitation to 72dpi in the PDF transcoder has the undesired side
> > effect of outputting embedded images in a very low quality. Removing it
> > improved quality a lot here.
> 
>      Yah, my original thought was to have the PDFGraphics2D rasterize
> at something like 150-300dpi (configurable of course) but provide an
> initial transform of 1 unit = 72 dpi.

see above.

> > But I still have one problem with the above SVG file without viewBox. In
> > the PDF transcoder the image is too big. If I manually put a
> > viewBox="0 0 533 266" into the SVG file then it comes out correctly in PDF.
> > Obviously, the SVG file is made for a 96dpi environment which explains
> > the behaviour. A solution to this problem would be if I could ask Batik
> > to give me the effective (outermost) viewBox even if none is available.
> > Does something like that exist? I didn't find anything.
> 
>     This would be a violation of the SVG specification.  You should not
> 'create' a transform where the user has not requested one.  The one 
> place you might use something like this is if the user does not
> provide a width and height - in this case the user agent is free to
> select one.

<fx>bulb over my head blinking on</fx>

>     To answer your question in this case you can ask the root GVT node
> for its bounds (getBounds()).  In the case of a dynamic document you
> can use getBBox() on the DOM nodes (but that generally doesn't apply 
> here). This is the true geometric bounds of the document but often
> this isn't really what people want - which is why it isn't a good
> idea to just use it anyways.
> 
>     I agree that the document is in error.  It is assuming that all
> user agents will use 96 pixels to the inch for real world conversion.
> This is a bad assumption; in SVG you must include a viewBox if you
> want to ensure your content will show properly in the user agent's
> window.

Ok, but that means we could (!) provide an optional convenience mechanism
for the user to have his faulty SVG display correctly if he can't
(technically) fix his SVG. I'll try that.

> > Sorry for the long post. I'm not sure I understood the whole thing
> > completely and I'm hoping you (or someone else) might have some
> > additional ideas. Thanks!
> 
>     Looking at this raised two new issues for me.
>     1) It looks like a JDK 1.4 dependency has slipped in:
> java.lang.IllegalArgumentException: Illegal pattern character 'Z'
>          at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:472)
>          at java.text.SimpleDateFormat.format(SimpleDateFormat.java:432)
>          at java.text.DateFormat.format(DateFormat.java:300)
>          at org.apache.fop.pdf.PDFObject.formatDateTime(PDFObject.java:231)
>          at org.apache.fop.pdf.PDFInfo.toPDF(PDFInfo.java:159)
>          at org.apache.fop.pdf.PDFObject.output(PDFObject.java:150)
>          at org.apache.fop.pdf.PDFDocument.output(PDFDocument.java:794)
>          at org.apache.fop.svg.PDFGraphics2D.drawImage(PDFGraphics2D.java:505)
> 
>        Can this be fixed?  Batik still targets JDK 1.3.x.

Yes. I will see to that. FOP officially still targets JDK 1.3.x, too,
although some would like that requirement dropped. In that case I will
see to it that the Batik-relevant parts remain 1.3-compatible for the
time being.

Worst of all, the issue in PDFObject may even be my fault.

>     2) I was going to commit a patch that adds support for anti-aliased
>        clipping (for shape-rendering="geometricPrecision").  However
>        this may result in a lot more content being rasterized.  This
>        isn't a big deal within Batik but for transcoding this loses a
>        lot of semantic value.  So I will likely add a method to the
>        UserAgent to check if the user agent wants it to use
>        anti-aliased clipping or not.
> 
>        Does this sound like an acceptable solution?

Sounds good to me.


Jeremias Maerki




Re: PDF Transcoder: Revisiting transcoder setup

Posted by Thomas DeWeese <Th...@Kodak.com>.
Jeremias Maerki wrote:
> Thomas (and all),
> 
> I'm currently tracking down differences between the PDF and PS
> transcoders. The following thread triggered that:
> http://nagoya.apache.org/eyebrowse/BrowseList?listName=fop-user@xml.apache.org&by=thread&from=825221
> 
> There's an SVG attached to the first post. It's an SVG with width and
> height specified but without a viewBox. In the EPS transcoder the file
> showed correctly, at first. At least until I changed the resolution from
> 96dpi to 300dpi. The PDF transcoder did the whole thing wrong because
> it's now working at a fixed 72dpi as per your patch:
> http://marc.theaimsgroup.com/?l=fop-dev&m=106795227230411&w=2
> 
> I didn't question that patch back then, but I'm now curious why you
> think it is necessary to use a fixed resolution. You say it's the
> default user space of PDFGraphics2D. I'd say there is no resolution in
> PDF, only the resolution at which some SVG constructs will be rendered
> as bitmaps when they cannot (yet) be expressed natively. 

    This is almost right.  The problem is that the PDF transcoder
starts with one userspace unit equal to one 'pt' (1/72nd of an inch).
Without my patch the generated PDF files for documents that specified
a size using a real world unit were off by 96/72.  I suspect that your
EPS files will have similar issues.
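
To make the 96/72 factor concrete, here is a small arithmetic sketch. It
assumes a hypothetical document that declares a width of 4in and a user
agent that resolves real world units at 96 pixels per inch; the class
and method names are made up for the example.

```java
// Hypothetical illustration of the 96/72 size error.
class UnitMismatchSketch {

    static final double POINTS_PER_INCH = 72.0;

    /** Pixels obtained when a real world length is resolved at the given dpi. */
    static double inchesToPixels(double inches, double dpi) {
        return inches * dpi;
    }

    /** Physical size in inches if every pixel is written out as one PDF point. */
    static double printedInchesIfPixelIsPoint(double pixels) {
        return pixels / POINTS_PER_INCH;
    }
}
```

For the assumed 4in document, inchesToPixels(4, 96) gives 384 pixels. If
those 384 units are written into a PDF whose user space unit is 1pt, the
result is 384/72, roughly 5.33in wide, i.e. too large by 96/72.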

> In the EPS transcoder I've managed to make it work at every resolution 
> in the meantime (fixes not committed, yet) by generating the right initial
> transforms. I believe this can be applied to PDF, too.

    Yes, I suspect that we could fix this by adding a scale by
72/getPixelUnitToMillimeter() before rendering the document (My
original thought was to have the PDFGraphics do much of this work,
which might have some advantages but would be a lot more work).
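
A minimal sketch of such an initial scale, assuming the intended factor
is "points per pixel" and that the pixel size is given in millimeters
(as the name getPixelUnitToMillimeter() suggests). The expression in the
text above is treated as shorthand for that; the class and method names
below are hypothetical, and only java.awt.geom.AffineTransform is a real
API.

```java
import java.awt.geom.AffineTransform;

// Hypothetical illustration, not actual transcoder code.
class InitialScaleSketch {

    static final double MM_PER_INCH = 25.4;
    static final double POINTS_PER_INCH = 72.0;

    /** Size of one pixel in millimeters at the given resolution. */
    static double pixelUnitToMillimeter(double dpi) {
        return MM_PER_INCH / dpi;
    }

    /** Scale that maps pixel units to points (1/72 inch). */
    static AffineTransform pixelsToPoints(double pixelUnitToMillimeter) {
        double pointsPerPixel =
                POINTS_PER_INCH * pixelUnitToMillimeter / MM_PER_INCH;
        return AffineTransform.getScaleInstance(pointsPerPixel, pointsPerPixel);
    }
}
```

At 96dpi this yields a scale of 0.75, which cancels the 96/72 error; at
72dpi it is the identity, and at 300dpi it is 0.24.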

> The limitation to 72dpi in the PDF transcoder has the undesired side
> effect of outputting embedded images in a very low quality. Removing it
> improved quality a lot here.

     Yah, my original thought was to have the PDFGraphics2D rasterize
at something like 150-300dpi (configurable of course) but provide an
initial transform of 1 unit = 1/72 inch (i.e. 72 dpi).

> But I still have one problem with the above SVG file without viewBox. In
> the PDF transcoder the image is too big. If I manually put a
> viewBox="0 0 533 266" into the SVG file then it comes out correctly in PDF.
> Obviously, the SVG file is made for a 96dpi environment which explains
> the behaviour. A solution to this problem would be if I could ask Batik
> to give me the effective (outermost) viewBox even if none is available.
> Does something like that exist? I didn't find anything.

    This would be a violation of the SVG specification.  You should not
'create' a transform where the user has not requested one.  The one 
place you might use something like this is if the user does not
provide a width and height - in this case the user agent is free to
select one.

    To answer your question in this case you can ask the root GVT node
for its bounds (getBounds()).  In the case of a dynamic document you
can use getBBox() on the DOM nodes (but that generally doesn't apply
here). This is the true geometric bounds of the document but often
this isn't really what people want - which is why it isn't a good
idea to just use it anyway.

    I agree that the document is in error.  It is assuming that all
user agents will use 96 pixels to the inch for real world conversion.
This is a bad assumption; in SVG you must include a viewBox if you
want to ensure your content will show properly in the user agent's
window.

> Sorry for the long post. I'm not sure I understood the whole thing
> completely and I'm hoping you (or someone else) might have some
> additional ideas. Thanks!

    Looking at this raised two new issues for me.
    1) It looks like a JDK 1.4 dependency has slipped in:
java.lang.IllegalArgumentException: Illegal pattern character 'Z'
         at java.text.SimpleDateFormat.subFormat(SimpleDateFormat.java:472)
         at java.text.SimpleDateFormat.format(SimpleDateFormat.java:432)
         at java.text.DateFormat.format(DateFormat.java:300)
         at org.apache.fop.pdf.PDFObject.formatDateTime(PDFObject.java:231)
         at org.apache.fop.pdf.PDFInfo.toPDF(PDFInfo.java:159)
         at org.apache.fop.pdf.PDFObject.output(PDFObject.java:150)
         at org.apache.fop.pdf.PDFDocument.output(PDFDocument.java:794)
         at org.apache.fop.svg.PDFGraphics2D.drawImage(PDFGraphics2D.java:505)

       Can this be fixed?  Batik still targets JDK 1.3.x.

    2) I was going to commit a patch that adds support for anti-aliased
       clipping (for shape-rendering="geometricPrecision").  However
       this may result in a lot more content being rasterized.  This
       isn't a big deal within Batik but for transcoding this loses a
       lot of semantic value.  So I will likely add a method to the
       UserAgent to check if the user agent wants it to use
       anti-aliased clipping or not.

       Does this sound like an acceptable solution?
