Posted to dev@cocoon.apache.org by David Crossley <cr...@indexgeo.com.au> on 2001/10/01 15:22:39 UTC

Re: Catalogs don't work on XSLT stylesheets

Hi Stephano, thanks for your Bugzilla entry. Would you please
add a test case stylesheet to demonstrate how you declare an
external entity. I wonder if this is actually a bug or a usage issue.

The entity resolver is attached to the parser, so resolution
happens at that level rather than at the XSLT level. By the time
the XML stream gets to the stylesheet, all entities have already
been resolved. I would expect that you would need to declare all
necessary entities in the document type declaration of your XML
instance document.
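
To illustrate what "resolution happens at the parser level" means,
here is a minimal sketch using only the JAXP/SAX classes bundled with
a current JDK. It is not Cocoon code: the class name, the in-memory
instance document and the one-line resolver (a stand-in for a real
catalog lookup) are all assumptions made for the example. The content
handler only ever sees the expanded character, never the entity
reference.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class ParserLevelResolution {

    // Hypothetical instance document that pulls in ISOnum from its own DOCTYPE.
    static final String INSTANCE =
        "<!DOCTYPE doc [\n"
      + " <!ENTITY % ISOnum PUBLIC\n"
      + "   \"ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML\"\n"
      + "   \"ISOnum.pen\">\n"
      + " %ISOnum;\n"
      + "]>\n"
      + "<doc>a&nbsp;b</doc>";

    public static String parse() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Stand-in for a catalog: map the public identifier to an in-memory
        // entity set instead of looking for ISOnum.pen on disk.
        reader.setEntityResolver((publicId, systemId) -> {
            if (publicId != null && publicId.contains("Numeric and Special Graphic")) {
                return new InputSource(new StringReader("<!ENTITY nbsp \"&#160;\">"));
            }
            return null; // fall back to the parser's default behaviour
        });

        // Collect the character data exactly as the parser delivers it.
        StringBuilder text = new StringBuilder();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length);
            }
        });

        InputSource source = new InputSource(new StringReader(INSTANCE));
        source.setSystemId("file:///sketch/instance.xml"); // assumed base URI
        reader.parse(source);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String text = parse();
        // The middle character is U+00A0, already expanded by the parser.
        System.out.println((int) text.charAt(1));
    }
}
```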

regards, David Crossley

> Date: 1 Oct 2001 10:55:06 -0000
> From: bugzilla@apache.org
>
> http://nagoya.apache.org/bugzilla/show_bug.cgi?id=3895
> 
> Catalogs don't work on XSLT stylesheets
> 
>            Summary: Catalogs don't work on XSLT stylesheets
>            Product: Cocoon 2
>            Version: 2.0rc1
>           Platform: All
>         OS/Version: Other
>             Status: NEW
>           Severity: Normal
>           Priority: Other
>          Component: core
>         AssignedTo: cocoon-dev@xml.apache.org
>         ReportedBy: stefano@apache.org
> 
> Since it's pretty common that users require entities such as &nbsp; or
> equivalent when processing XML to get to XHTML, it would be desirable
> to have entity discovery capabilities through catalogs in all XML
> documents, stylesheets included.
> 
> It is probably just a matter of modifying the TraxTransformer to include
> the entity resolver, but I don't have time to figure this out myself.

---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


Re: Catalogs don't work on XSLT stylesheets

Posted by Stefano Mazzocchi <st...@apache.org>.
David Crossley wrote:
> 
> OK Stefano, i have reproduced the issue with the following
> steps. In the quoted email below, i build upon your potential
> solution.
> 
> 1) referenced an entity in stylesheet test.xsl - i used J&ouml;rg
> 2) received processing exception ... as expected because
> i have not yet declared the ISOlat1 entity set
> ----
> ERROR   (2001-10-04) 01:12.23:056   [cocoon  ] (/cocoon/catalog-demo)
> Thread-16/TraxTransformer: Problem in getTransformer:
> org.apache.cocoon.ProcessingException: Error in creating Transform Handler:
> org.xml.sax.SAXParseException: The entity "ouml" was referenced, but not
> declared.
> ----
> ... that was an extract from the cocoon.log whereas the error page
> was very cryptic.
> 3) declared the entity set in my test.xsl ... similar to Stefano's
> example below, but for ISOlat1
> 4) now we get a different exception ... file not found ISOlat1.pen
> ... this indicates that the entity resolver is not working.
> There were also no resolver messages going to stdout.
> 5) as a workaround, i copied the ISOlat1.pen file to the same
> directory as the stylesheet
> ... now the parser finds ISOlat1 via the default system identifier
> 6) it works, i see my output with the proper entity well presented

Ok, cool.

Interestingly enough, I found something about entities in Cocoon that
appeared magic until I looked deeper underneath: if you do something like

<!DOCTYPE xsl:stylesheet [
 <!ENTITY nbsp "&#160;" > 
]>

in your stylesheet, and then serialize using the HTML serializer, you
get the good old entity &nbsp; but if you do

<!DOCTYPE xsl:stylesheet [
 <!ENTITY stefano "&#160;" > 
]>

you *STILL* get &nbsp; encoded. :)

This is because the flow of entity expansion goes like this:

 1) the entity is referenced
 2) the entity is found by the parser and expanded
 3) the expanded entity (now normal Unicode chars) is passed on
 4) the HTML serializer recreates the encoding based on the HTML DTD
(which references the same ISO entity lists of our catalog).
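
Steps 1) to 3) can be checked with a small sketch like the one below.
It assumes nothing beyond the DOM parser that ships with the JDK; the
class and method names are made up for the example. Whatever name the
entity is declared under, the parser hands on the same single
character, which is why a serializer downstream cannot tell the two
declarations apart.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityExpansionDemo {

    // Declares an internal entity under the given name, references it once,
    // and returns the text content that the parser delivers.
    public static String expand(String entityName) throws Exception {
        String xml =
            "<!DOCTYPE doc [\n"
          + " <!ENTITY " + entityName + " \"&#160;\">\n"
          + "]>\n"
          + "<doc>&" + entityName + ";</doc>";

        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String viaNbsp = expand("nbsp");
        String viaStefano = expand("stefano");
        // Both names expand to the same Unicode character before any
        // transformer or serializer gets involved.
        System.out.println(viaNbsp.equals(viaStefano));
        System.out.println((int) viaNbsp.charAt(0));
    }
}
```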

> 7) summary - entity catalogs are not working at stylesheet level

Ok, this is what I thought.

> see more comments below ...
> 
> --------------------------------
> Stefano Mazzocchi wrote:
> > David Crossley wrote:
> > > Hi Stephano,
> > ehmm, Stefano, please :)
> > > thanks for your Bugzilla entry.
> > You are welcome :)
> >
> > > Would you please
> > > add a test case stylesheet to demonstrate how you declare an
> > > external entity. I wonder if this is actually a bug or a usage issue.
> >
> > Good point. What I did was placing the following
> >
> > <!DOCTYPE xsl:stylesheet [
> >  <!ENTITY % ISOnum PUBLIC
> >    "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML"
> >    "ISOnum.pen">
> >  %ISOnum;
> > ]>
> >
> > above the <xsl:stylesheet> element and it didn't work.
> >
> > > The entity resolver is attached to the parser, so resolution
> > > happens at that level rather than at the XSLT level.
> >
> > Ah, ok. I thought that maybe the Trax handlers required another instance
> > of the entity resolver.
> 
> I think that you are right. Every time that a parser is used, it needs
> to have an entity resolver set for it. Cocoon does this already for
> org/apache/cocoon/components/parser/*Parser
> That parser is used for processing the original XML stream. At that
> level the entity catalog resolver works beautifully.

Ok.
 
> However, it now appears that other parsers are called from
> elsewhere in Cocoon (e.g. TraxTransformer).

Well, to be honest, I already knew that, I'm not discovering this right
now :)

> So yes, the entity
> resolver needs to be set for whatever parser is used.
> 
> If our analysis is correct would someone please follow the
> example code in JaxpParser.java to set entity resolvers for any
> other parsers.

I'll do that right away.
 
> > > By the time
> > > that the XML stream gets to the stylesheet, then all entities
> > > have already been resolved. I would expect that you would need
> > > to declare all necessary entities from the document type
> > > declaration of your XML instance document.
> 
> I tried doing this to no avail. I imagine that is because there were
> two separate parser instances. The XML stream has already had
> all of its entities resolved and now the stylesheet is adding more.
> 
> > Am I doing something wrong in the above declaration?
> 
> I am not really sure Stefano - my XSLT book does not
> discuss entity declarations very much, or even using character
> entities at all. Anyway the declaration that you have provided
> does work, as evidenced by 6) above.

Ok, let's ask those who know:

Scott, is it possible for a TrAX processor to use a specific JAXP
parser instance instead of looking up its own?

This is mainly because Cocoon uses avalon-based component management to
share the same parser that provides entity resolution via catalogs
(using Norman Walsh's code provided by Sun) while Xalan is asking JAXP
for its own instance that doesn't have the entity resolution set.

So either we set the entity resolver before Xalan asks for the
JAXP parser, or we force the TrAX processor to use our own instance
(where the entity resolver is already set).
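
As a rough sketch of the second option, the standard TrAX API accepts
a SAXSource that carries its own XMLReader, so the stylesheet can be
parsed by a reader that already has an entity resolver set. The code
below uses the JAXP implementation bundled with a current JDK rather
than Cocoon's Avalon-managed parser, and the class name, the in-memory
stylesheet and the one-entity resolver (standing in for the catalog
resolver) are assumptions for illustration only.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

public class StylesheetResolverSketch {

    // Hypothetical stylesheet that declares ISOlat1 and uses &ouml;.
    static final String STYLESHEET =
        "<!DOCTYPE xsl:stylesheet [\n"
      + " <!ENTITY % ISOlat1 PUBLIC\n"
      + "   \"ISO 8879:1986//ENTITIES Added Latin 1//EN//XML\"\n"
      + "   \"ISOlat1.pen\">\n"
      + " %ISOlat1;\n"
      + "]>\n"
      + "<xsl:stylesheet version=\"1.0\"\n"
      + "    xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">\n"
      + "  <xsl:output method=\"text\"/>\n"
      + "  <xsl:template match=\"/\">J&ouml;rg</xsl:template>\n"
      + "</xsl:stylesheet>";

    public static String transform() throws Exception {
        // Our own reader, namespace-aware as XSLT requires.
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader reader = factory.newSAXParser().getXMLReader();

        // Stand-in for the catalog resolver: answer the public identifier
        // from memory so ISOlat1.pen need not sit next to the stylesheet.
        reader.setEntityResolver((publicId, systemId) -> {
            if (publicId != null && publicId.contains("Added Latin 1")) {
                return new InputSource(new StringReader("<!ENTITY ouml \"&#246;\">"));
            }
            return null;
        });

        InputSource xslInput = new InputSource(new StringReader(STYLESHEET));
        xslInput.setSystemId("file:///sketch/test.xsl"); // assumed base URI

        // Hand the pre-configured reader to the TrAX processor.
        SAXSource xslSource = new SAXSource(reader, xslInput);
        Templates templates = TransformerFactory.newInstance().newTemplates(xslSource);

        StringWriter out = new StringWriter();
        templates.newTransformer().transform(
            new StreamSource(new StringReader("<doc/>")),
            new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(transform());
    }
}
```

Whether Xalan as used inside Cocoon honours the supplied reader in the
same way is exactly the open question above; the sketch only shows
that the API has a place to pass one in.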

Thanks very much for any answer.

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------





Re: Catalogs don't work on XSLT stylesheets

Posted by David Crossley <cr...@indexgeo.com.au>.
OK Stefano, i have reproduced the issue with the following
steps. In the quoted email below, i build upon your potential
solution.

1) referenced an entity in stylesheet test.xsl - i used J&ouml;rg
2) received processing exception ... as expected because
i have not yet declared the ISOlat1 entity set
----
ERROR   (2001-10-04) 01:12.23:056   [cocoon  ] (/cocoon/catalog-demo) 
Thread-16/TraxTransformer: Problem in getTransformer:
org.apache.cocoon.ProcessingException: Error in creating Transform Handler: 
org.xml.sax.SAXParseException: The entity "ouml" was referenced, but not 
declared.
----
... that was an extract from the cocoon.log whereas the error page
was very cryptic.
3) declared the entity set in my test.xsl ... similar to Stefano's
example below, but for ISOlat1
4) now we get a different exception ... file not found ISOlat1.pen
... this indicates that the entity resolver is not working.
There were also no resolver messages going to stdout.
5) as a workaround, i copied the ISOlat1.pen file to the same
directory as the stylesheet
... now the parser finds ISOlat1 via the default system identifier
6) it works, i see my output with the proper entity well presented
7) summary - entity catalogs are not working at stylesheet level

see more comments below ...

--------------------------------
Stefano Mazzocchi wrote:
> David Crossley wrote:
> > Hi Stephano, 
> ehmm, Stefano, please :)
> > thanks for your Bugzilla entry. 
> You are welcome :)
> 
> > Would you please
> > add a test case stylesheet to demonstrate how you declare an
> > external entity. I wonder if this is actually a bug or a usage issue.
> 
> Good point. What I did was placing the following
> 
> <!DOCTYPE xsl:stylesheet [
>  <!ENTITY % ISOnum PUBLIC
>    "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML"
>    "ISOnum.pen">
>  %ISOnum;
> ]>
>  
> above the <xsl:stylesheet> element and it didn't work.
> 
> > The entity resolver is attached to the parser, so resolution
> > happens at that level rather than at the XSLT level.
> 
> Ah, ok. I thought that maybe the Trax handlers required another instance
> of the entity resolver.

I think that you are right. Every time that a parser is used, it needs
to have an entity resolver set for it. Cocoon does this already for 
org/apache/cocoon/components/parser/*Parser
That parser is used for processing the original XML stream. At that
level the entity catalog resolver works beautifully.

However, it now appears that other parsers are called from
elsewhere in Cocoon (e.g. TraxTransformer). So yes, the entity
resolver needs to be set for whatever parser is used.

If our analysis is correct would someone please follow the
example code in JaxpParser.java to set entity resolvers for any
other parsers.

> > By the time
> > that the XML stream gets to the stylesheet, then all entities
> > have already been resolved. I would expect that you would need
> > to declare all necessary entities from the document type
> > declaration of your XML instance document.

I tried doing this to no avail. I imagine that is because there were
two separate parser instances. The XML stream has already had
all of its entities resolved and now the stylesheet is adding more. 

> Am I doing something wrong in the above declaration?

I am not really sure Stefano - my XSLT book does not
discuss entity declarations very much, or even using character
entities at all. Anyway the declaration that you have provided
does work, as evidenced by 6) above.
regards, David Crossley

> -- 
> Stefano Mazzocchi      One must still have chaos in oneself to be
>                           able to give birth to a dancing star.
> <st...@apache.org>                             Friedrich Nietzsche
> --------------------------------------------------------------------



Re: Catalogs don't work on XSLT stylesheets

Posted by Stefano Mazzocchi <st...@apache.org>.
David Crossley wrote:
> 
> Hi Stephano, 

ehmm, Stefano, please :)

> thanks for your Bugzilla entry. 

You are welcome :)

> Would you please
> add a test case stylesheet to demonstrate how you declare an
> external entity. I wonder if this is actually a bug or a usage issue.

Good point. What I did was placing the following

<!DOCTYPE xsl:stylesheet [
 <!ENTITY % ISOnum PUBLIC
   "ISO 8879:1986//ENTITIES Numeric and Special Graphic//EN//XML"
   "ISOnum.pen">
 %ISOnum;
]>
 
above the <xsl:stylesheet> element and it didn't work.

> The entity resolver is attached to the parser, so resolution
> happens at that level rather than at the XSLT level.

Ah, ok. I thought that maybe the Trax handlers required another instance
of the entity resolver.

> By the time
> that the XML stream gets to the stylesheet, then all entities
> have already been resolved. I would expect that you would need
> to declare all necessary entities from the document type
> declaration of your XML instance document.

Am I doing something wrong in the above declaration?

-- 
Stefano Mazzocchi      One must still have chaos in oneself to be
                          able to give birth to a dancing star.
<st...@apache.org>                             Friedrich Nietzsche
--------------------------------------------------------------------

