Posted to fop-users@xmlgraphics.apache.org by Ju...@Piater.name on 2008/06/28 13:25:32 UTC

Getting Batik to read SVG DTDs via catalog resolver

Hi,

Processing my documents via command-line fop is enormously slowed down
by DTDs fetched over the 'net for SVG graphics included via
fo:external-graphic.

Is this a fop or a batik issue?

What can I do to get it to use an xml-commons-resolver?

Thanks,
Justus

---------------------------------------------------------------------
To unsubscribe, e-mail: fop-users-unsubscribe@xmlgraphics.apache.org
For additional commands, e-mail: fop-users-help@xmlgraphics.apache.org


Re: [Jeuclid-devel] Getting Batik to read SVG DTDs via catalog resolver

Posted by Ju...@Piater.name.
Max Berger <ma...@berger.name> wrote on Wed, 09 Jul 2008 10:34:45 +0200:

> I assume you have an external graphic? The JEuclid plugin tries to parse
> it - since it is valid XML it has to follow all external entities for
> correct parsing.

Why does JEuclid need to parse external graphics? Is JEuclid able to
substitute MathML contained in XML content included via
fo:external-graphic?

> I could also convince JEuclid to use JEuclid's URI Resolver, but I am
> not sure if that would solve the problem, or just move it.

You mean Fop's? Otherwise I don't understand this remark.

The modularity of XML processing is great, but as a user, I normally
would not want to configure each module separately. I would like to be
able to simply throw any supported resolver into the classpath, and
all components use it (i.e., Fop and any plugins).

> You may want to play around with some Xerces parameters, especially the
> feature http://apache.org/xml/features/nonvalidating/load-external-dtd
>
> further info: http://xerces.apache.org/xerces-j/features.html

Just to avoid unnecessary work: This involves building the JEuclid Fop
plugin from sources, right? Or is there a way to configure this at run
time?

Thanks,
Justus



Re: [Jeuclid-devel] Getting Batik to read SVG DTDs via catalog resolver

Posted by Max Berger <ma...@berger.name>.

Dear Justus,


Justus-bulk@Piater.name wrote:
> - Batik appears to preferentially read DTDs stored inside the jar. Not
>   a very clean solution, but it works out of the box.

JEuclid does the same, for the exact problem you mentioned: The default
parser tries to load the external entities (e.g. the DTD), which causes
network traffic and sometimes long delays. All MathML-related DTDs are
therefore redirected to load from internal sources.

> - The culprit is - surprise! - the JEuclid Fop plugin, even though my
>   test document does not contain any MathML, and the problem goes away
>   if I remove the DOCTYPE declaration from the included SVG document:

I assume you have an external graphic? The JEuclid plugin tries to parse
it - since it is valid XML it has to follow all external entities for
correct parsing.

I could disable this (by returning invalid entities), but that would
disable entity handling altogether, which would violate the XML specs.

>   If I sever my Internet connection,
>   net.sourceforge.jeuclid.xmlgraphics.PreloaderMathML.parseSource()
>   hangs for 40 seconds. Or more precisely, it appears to be the
>   org.apache.xerces.impl.XMLEntityManager, called indirectly by that
>   method, that attempts to open an HTTP connection.
>   If I remove JEuclid from the classpath, the problem goes away.

As I said, this is a Xerces problem. The same thing may happen the other
way round (when an external MathML file with a DTD is parsed by the SVG
plugin).

> It seems that the Xerces parser used by JEuclid should somehow be
> convinced to use the EntityResolver available on the classpath. Or
> perhaps, one might abuse FOP's URIResolver, similarly to what I
> described above.

I could also convince JEuclid to use JEuclid's URI Resolver, but I am
not sure if that would solve the problem, or just move it.

You may want to play around with some Xerces parameters, especially the
feature http://apache.org/xml/features/nonvalidating/load-external-dtd

further info: http://xerces.apache.org/xerces-j/features.html
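
As a rough illustration of the feature mentioned above, here is a minimal
sketch using plain JAXP. It assumes the JAXP implementation in use is
Xerces-based (as the JDK's built-in one is) and therefore accepts this
feature URI. The class name NoExternalDtd is made up for the example, and
this is not how JEuclid or FOP actually configure their parsers.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper class, named only for this sketch.
class NoExternalDtd {
    // Xerces feature URI quoted in the message above.
    static final String LOAD_EXTERNAL_DTD =
        "http://apache.org/xml/features/nonvalidating/load-external-dtd";

    // Parses XML text without fetching the external DTD subset.
    static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setValidating(false);
            // Throws ParserConfigurationException if the parser
            // does not recognise the feature.
            factory.setFeature(LOAD_EXTERNAL_DTD, false);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String svg =
            "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\" "
            + "\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">"
            + "<svg xmlns=\"http://www.w3.org/2000/svg\"/>";
        Document doc = parse(svg);
        System.out.println(doc.getDocumentElement().getLocalName());
    }
}
```

Note the trade-off: with the external subset skipped, entities and default
attribute values declared only in the DTD are not available to the parser.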

> Any comments or suggestions (Max)?
> Justus

hth

Max



Re: Getting Batik to read SVG DTDs via catalog resolver

Posted by Ju...@Piater.name.
Hi -

Here's what I've come up with so far:

- Batik appears to preferentially read DTDs stored inside the jar. Not
  a very clean solution, but it works out of the box.

- I nevertheless played around with your suggestion below. Such a
  subclass exists already
  (org.apache.fop.svg.FOPSAXSVGDocumentFactory); I modified that and
  PreloaderSVG.java to use Fop's URIResolver. However, this solution
  is not very clean as it abuses a URIResolver as an EntityResolver,
  and it only works if the catalog declares <uri> substitutions; the
  URIResolver does not treat them as system ids. So, I don't think
  this is a clean solution either.

- The culprit is - surprise! - the JEuclid Fop plugin, even though my
  test document does not contain any MathML, and the problem goes away
  if I remove the DOCTYPE declaration from the included SVG document:

  If I sever my Internet connection,
  net.sourceforge.jeuclid.xmlgraphics.PreloaderMathML.parseSource()
  hangs for 40 seconds. Or more precisely, it appears to be the
  org.apache.xerces.impl.XMLEntityManager, called indirectly by that
  method, that attempts to open an HTTP connection.

  If I remove JEuclid from the classpath, the problem goes away.

It seems that the Xerces parser used by JEuclid should somehow be
convinced to use the EntityResolver available on the classpath. Or
perhaps, one might abuse FOP's URIResolver, similarly to what I
described above.
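
To make the EntityResolver idea concrete, here is a minimal sketch with
plain JAXP. It assumes the code that creates the parser can call
setEntityResolver() on it. The class LocalDtdResolver and its map of
public ids to local DTD text are hypothetical and exist only for this
illustration; they are not part of JEuclid, FOP or Batik.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical resolver: serves DTDs from local copies keyed by public id.
class LocalDtdResolver implements EntityResolver {
    // public id -> DTD text (a real setup would read local files instead)
    private final Map<String, String> localDtds;

    LocalDtdResolver(Map<String, String> localDtds) {
        this.localDtds = new HashMap<>(localDtds);
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId) {
        String dtd = (publicId == null) ? null : localDtds.get(publicId);
        if (dtd == null) {
            // Unknown id: null tells the parser to resolve the system id
            // itself, which may still go to the network.
            return null;
        }
        InputSource source = new InputSource(new StringReader(dtd));
        source.setPublicId(publicId);
        source.setSystemId(systemId);
        return source;
    }

    // Parses XML text, routing external entity lookups through the resolver.
    static Document parse(String xml, EntityResolver resolver) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setEntityResolver(resolver);
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> dtds = new HashMap<>();
        // Stand-in DTD text for the demo, NOT the real SVG 1.1 DTD.
        dtds.put("-//W3C//DTD SVG 1.1//EN", "<!ENTITY label \"offline\">");
        String svg =
            "<!DOCTYPE svg PUBLIC \"-//W3C//DTD SVG 1.1//EN\" "
            + "\"http://www.w3.org/Graphics/SVG/1.1/DTD/svg11.dtd\">"
            + "<svg xmlns=\"http://www.w3.org/2000/svg\"><desc>&label;</desc></svg>";
        Document doc = parse(svg, new LocalDtdResolver(dtds));
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```

A catalog-backed resolver could be passed to parse() in place of this
class, assuming it implements org.xml.sax.EntityResolver.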

Any comments or suggestions (Max)?

Justus


Jeremias Maerki <de...@jeremias-maerki.ch> wrote on Mon, 30 Jun 2008
10:40:13 +0200:

> Generally, Batik doesn't support URI resolution using EntityResolver or
> URIResolver as FOP does. Looking at your case I think this could be
> improved by subclassing Batik's SAXSVGDocumentFactory (used in
> PreloaderSVG.java) and overriding the resolveEntity() method to consult
> FOP's URIResolver. Want to give it a try?
>
> On 28.06.2008 13:25:32 Justus-bulk wrote:
>> Hi,
>> 
>> Processing my documents via command-line fop is enormously slowed down
>> by DTDs fetched over the 'net for SVG graphics included via
>> fo:external-graphic.
>> 
>> Is this a fop or a batik issue?
>> 
>> What can I do to get it to use an xml-commons-resolver?
>> 
>> Thanks,
>> Justus
>
>
>
>
> Jeremias Maerki



Re: Getting Batik to read SVG DTDs via catalog resolver

Posted by Jeremias Maerki <de...@jeremias-maerki.ch>.
Generally, Batik doesn't support URI resolution using EntityResolver or
URIResolver as FOP does. Looking at your case I think this could be
improved by subclassing Batik's SAXSVGDocumentFactory (used in
PreloaderSVG.java) and overriding the resolveEntity() method to consult
FOP's URIResolver. Want to give it a try?

On 28.06.2008 13:25:32 Justus-bulk wrote:
> Hi,
> 
> Processing my documents via command-line fop is enormously slowed down
> by DTDs fetched over the 'net for SVG graphics included via
> fo:external-graphic.
> 
> Is this a fop or a batik issue?
> 
> What can I do to get it to use an xml-commons-resolver?
> 
> Thanks,
> Justus




Jeremias Maerki

