Posted to dev@forrest.apache.org by Thorsten Scherler <th...@apache.org> on 2006/02/10 22:47:17 UTC

[Proposal] remove @disable-output-escaping from rssissues-to-document.xsl

Hi all,

I would like to remove @disable-output-escaping from
forrest-trunk/main/webapp/resources/stylesheets/rssissues-to-document.xsl since it produces XML that is not well-formed, and leaving it out makes no difference to the rendered output.

See http://forrest.apache.org/forrest-issues.html#%5BFOR-546%5D+Sitemap+reference+doc+should+be+updated+to+reflect+plugin+architecture
and find:
"...
completeness. <br> <br> &nbsp;&nbsp;&lt;map:components&gt; <br>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;map:serializers&gt; <br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&lt;map:serializer
name=&quot;fo2pdf&quot; <br>
..."
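
To make the trade-off concrete, here is a minimal Python sketch (not part of Forrest; the sample description and the wrapping <p> element are assumptions for illustration). It compares copying the CDATA text through unescaped, which is roughly what disable-output-escaping="yes" asks the serializer to do, with the default escaped output:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical sample of an issue description as it arrives inside CDATA.
description = "allows H1, H2 etc. elements to pass through.\n<br>\n"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Text copied through as-is: the unclosed <br> becomes real markup.
raw_output = "<p>" + description + "</p>"

# Text escaped on output (the default): <br> becomes &lt;br&gt;.
escaped_output = "<p>" + escape(description) + "</p>"

print(is_well_formed(raw_output))
print(is_well_formed(escaped_output))
```

The unescaped variant fails to parse because the <br> is never closed, which is the same complaint as the SAXParseException quoted further down. The escaped variant parses, but the line break survives only as literal text.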

See below for more information.

Lazy consensus applies.

salu2

On Wed, 08-02-2006 at 08:21 +0100, Thorsten Scherler wrote:
> On Wed, 08-02-2006 at 15:08 +1100, David Crossley wrote:
> > Thorsten Scherler wrote:
> > > 
> > > is there a reason why 
> > > http://localhost:8888/forrest-issues.xml on site-author
> > > produces markup that is not well-formed?
> > 
> > It is Jira RSS providing this. See the project.issues-rss-url
> > in site-author/forrest.properties file. 
> 
> See further down ;-)
> 
>     <map:match pattern="forrest-issues.xml">
> <!--getting  jira rss (valid xml) -->
> <!-- e.g.
> http://issues.apache.org/jira/secure/IssueNavigator.jspa?view=rss&pid=12310000&fixfor=12310040&resolutionIds=-1&sorter/field=priority&sorter/order=DESC&tempMax=25&reset=true&decorator=none
> -->
>       <map:generate type="file" src="{lm:forrest.issues-rss-url}" />
> <!--transform it-->
>       <map:transform src="{lm:transform.rssissues.document}" />
>       <map:serialize type="xml-document"/>
>     </map:match>
> 
> 
> > Each Issue Description
> > is wrapped in a CDATA section. Perhaps our stylesheet does
> > not handle that properly.
> 
> That depends on how you define "properly" (well-formed vs.
> well-presented).
> 
> Forrest gets something like:
> <description><![CDATA[html-to-document.xsl no longer converts content to
> an XDoc. Instead it renders converts documents to XDoc, instead it
> allows H1, H2 etc. elements to pass through.
> <br>
> 
> <br>
> The result is a page that seems to render correctly and in the single test case I have used it still renders correctly in PDF and Text format. However, this is a backward incompatible change that will break sites that use includes with XPath statements such as /section[@id=&quot;foo&quot;] (sections are no longer created)
> <br>
> 
> <br>
> ]]></description>
> 
> Then
> forrest-trunk/main/webapp/resources/stylesheets/rssissues-to-document.xsl
> ...
> <xsl:value-of select="description" disable-output-escaping="yes" />
> ...
> will emit that as markup when @disable-output-escaping="yes". If
> you remove @disable-output-escaping then it will be transformed to
> &lt;br&gt;
> 
> That loses the markup information but is well-formed markup. I prefer
> well-formed over well-presented, but both would be best. ;-)
> 
> I am unsure how to fix that. Does anybody have an idea?
> 
> salu2
> 
> > 
> > -David
> > 
> > > ...
> > > <br>
> > > 
> > > <br>
> > > Does this indicate a memory leak?
> > > <br>
> > > 
> > > ...
> > > 
> > > *************
> > > This markup, which is not well-formed, produces the following in the dispatcher for http://localhost:8888/forrest-issues.html
> > > 
> > > dispatcherError: 500 - Internal server error
> > > The contract "content-main" has thrown thrown an exception by resolving raw data from "cocoon://forrest-issues.body.xml".
> > > 
> > > dispatcherErrorStack:
> > >  org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
> > > 
> > > 
> > > Thanks to the error handling in the dispatcher it did not take me long to find forrest-issues.xml 
> > > in site-author//sitemap.xmap and not on the file system.
> > > 
> > >     <map:match pattern="forrest-issues.xml">
> > >       <map:generate type="file" src="{lm:forrest.issues-rss-url}" />
> > >       <map:transform src="{lm:transform.rssissues.document}" />
> > >       <map:serialize type="xml-document"/>
> > >     </map:match>
> > > 
> > > The dispatcher needs well-formed input data as raw data.
> > > 
> > > Does somebody have an idea?
> > > 
> > > salu2
> > > -- 
> > > thorsten
> > > 
> > > "Together we stand, divided we fall!" 
> > > Hey you (Pink Floyd)
-- 
thorsten

"Together we stand, divided we fall!" 
Hey you (Pink Floyd)


[OT] Re: [Proposal] remove @disable-output-escaping from rssissues-to-document.xsl

Posted by "Gav...." <br...@brightontown.com.au>.
Apologies for the long paste here; it is from a thread in a members-only forum.
First, I thought its contents might help to provide a clue. Second, if anyone has
time for any hints for the poster, I will pass them on with credits. Thanks.

----------------------------------------------------------------------------------
I'm playing around with the XSLTProcessor class to transform an XML document 
in to HTML. The source XML document employs a CDATA section to contain some 
HTML content that I literally want to dump as-is in to the resulting HTML 
document. In case you're wondering, it's for a very basic XML CMS I'm 
tinkering with - just for practice really.

Here's the input XML:


      <?xml version="1.0"?>
      <webcopy>
      <title>Home</title>
      <description>Our homepage</description>
      <content><![CDATA[
      <h1>Welcome to our XML CMS</h1>
      <p>Easy to use XML based CMS</p>
      ]]></content>
      </webcopy>




The relevant portion of the XSL style-sheet is:


      <xsl:value-of select="content" disable-output-escaping="yes"/>




The problem is with the output escaping. From a command line, msxsl and 
XMLStarlet (which I believe is based on libxslt) work as intended, and 
honour the disable-output-escaping attribute. When applying the same 
style-sheet in PHP using the transformToXML() method of the XSLTProcessor 
class, the contents of the CDATA section are escaped no matter what. What I 
want is this:


      <div id="content">
      <h1>Welcome to our XML CMS</h1>
      <p>Easy to use XML based CMS</p>
      </div>




What I get is this:


      <div id="content">
      &lt;h1&gt;Welcome to our XML CMS&lt;/h1&gt;
      &lt;p&gt;Easy to use XML based CMS&lt;/p&gt;
      </div>




If I'm reading the XSL specs correctly, and there's every chance that I'm 
not, an XSLT Processor is not required to honour the disable-output-escaping 
attribute. If that's true, then I need to find an alternative solution 
anyway, regardless of the apparent shortcomings of PHP/libxslt. So the 
question is, has anybody got any suggestions for another way of achieving 
the desired result?

In case it helps, I'm using xampp with:

PHP 5.1.1
libxslt 1.1.15
libxml 2.6.22
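
One possible direction for the question above, sketched in Python using only the standard library: instead of relying on disable-output-escaping, parse the embedded HTML and re-emit it as well-formed XML before it enters the pipeline. The class name FragmentToXml, the helper html_fragment_to_xml and the small VOID set are hypothetical names chosen for this illustration, and the handling of stray tags is deliberately simplistic.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr
import xml.etree.ElementTree as ET

# Elements that never have content in HTML (assumed subset for this sketch).
VOID = {"br", "hr", "img"}


class FragmentToXml(HTMLParser):
    """Re-serialize an HTML fragment as well-formed XML text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.open_tags = []

    def handle_starttag(self, tag, attrs):
        attr_text = "".join(
            " %s=%s" % (name, quoteattr(value or "")) for name, value in attrs
        )
        if tag in VOID:
            # Emit void elements in self-closing form, e.g. <br/>.
            self.out.append("<%s%s/>" % (tag, attr_text))
        else:
            self.out.append("<%s%s>" % (tag, attr_text))
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        # Ignore end tags for void elements and for tags that were never opened.
        if tag in VOID or tag not in self.open_tags:
            return
        # Close everything down to and including the matching start tag.
        while self.open_tags:
            open_tag = self.open_tags.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Character references were already decoded; re-escape &, < and >.
        self.out.append(escape(data))

    def result(self):
        self.close()
        # Close any elements the fragment left open.
        while self.open_tags:
            self.out.append("</%s>" % self.open_tags.pop())
        return "".join(self.out)


def html_fragment_to_xml(fragment):
    parser = FragmentToXml()
    parser.feed(fragment)
    return parser.result()


content = "<h1>Welcome to our XML CMS</h1>\n<p>Easy to use XML based CMS</p>\n<br>"
xml_text = '<div id="content">' + html_fragment_to_xml(content) + "</div>"
ET.fromstring(xml_text)  # raises ParseError if the result is not well-formed
print(xml_text)
```

The markup is kept as real elements and the result still parses, so it is both well-formed and well-presented. Comments, processing instructions and invalid attribute names are not handled here.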


One reply to this post was:

This might not be an ideal solution, but have you considered defining the 
source document so the content element can take all the elements of, say, an XHTML 
div element? I have done this in the past with a similar idea for content 
management; at the time I did not understand CDATA, so this was my solution 
for putting HTML into an XML file.

So you have the schema:

      <?xml version="1.0" encoding="UTF-8"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
elementFormDefault="qualified" attributeFormDefault="unqualified">
      <xs:element name="webcopy">
       <xs:complexType>
        <xs:sequence>
         <xs:element name="title" type="xs:string"/>
         <xs:element name="description" type="xs:string"/>
         <xs:element name="content" type="xs:string"/>
        </xs:sequence>
       </xs:complexType>
      </xs:element>
      </xs:schema>



So you end up with this:

      <?xml version="1.0" encoding="UTF-8"?>
      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" 
xmlns:xhtml="http://www.w3.org/1999/xhtml" elementFormDefault="qualified" 
attributeFormDefault="unqualified">
      <xs:import namespace="http://www.w3.org/1999/xhtml" 
schemaLocation="xhtml.xsd" />
      <xs:element name="webcopy">
       <xs:complexType>
        <xs:sequence>
         <xs:element name="title" type="xs:string"/>
         <xs:element name="description" type="xs:string"/>
         <xs:element name="content" type="xhtml:div"/>
        </xs:sequence>
       </xs:complexType>
      </xs:element>
      </xs:schema>



Sorry I can't be of much more help, but I can't find a schema for XHTML; the 
official one looks like this:

      <?xml version="1.0"?>
      <!DOCTYPE schema SYSTEM "http://www.w3.org/2001/XMLSchema.dtd">
      <schema xmlns="http://www.w3.org/2001/XMLSchema" 
targetNamespace="http://www.w3.org/1999/xhtml">
      <annotation>
       <documentation>
        Someday a schema for XHTML will live here
       </documentation>
      </annotation>
      </schema>


----------------------------------------------------------------------------------

Gav...




