Posted to dev@forrest.apache.org by Thorsten Scherler <th...@apache.org> on 2006/02/08 02:55:12 UTC
Why does forrest-issues.xml produce not wellformed xml?
Hi all,
is there a reason why
http://localhost:8888/forrest-issues.xml on site-author
produces not well-formed markup?
...
<br>
<br>
Does this indicate a memory leak?
<br>
...
*************
This malformed markup produces the following error in the dispatcher for http://localhost:8888/forrest-issues.html:
dispatcherError: 500 - Internal server error
The contract "content-main" has thrown thrown an exception by resolving raw data from "cocoon://forrest-issues.body.xml".
dispatcherErrorStack:
org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
Thanks to the error handling in the dispatcher it did not take me long to find forrest-issues.xml
in site-author//sitemap.xmap rather than on the file system.
<map:match pattern="forrest-issues.xml">
<map:generate type="file" src="{lm:forrest.issues-rss-url}" />
<map:transform src="{lm:transform.rssissues.document}" />
<map:serialize type="xml-document"/>
</map:match>
The dispatcher needs well-formed input as raw data.
Does anybody have an idea?
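To illustrate why the dispatcher rejects this output, here is a minimal Python sketch (not Forrest code; the sample text and the helper name is_well_formed are made up for illustration). It wraps the same text in an element once unescaped and once escaped, and checks each with the standard library XML parser.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical issue text, as it would arrive inside a CDATA section.
description = "First line.<br>\nDoes this indicate a memory leak?<br>"


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Written out unescaped (the effect of disable-output-escaping="yes").
raw = "<p>" + description + "</p>"
# Written out escaped (what xsl:value-of does by default).
escaped = "<p>" + escape(description) + "</p>"

print(is_well_formed(raw))      # False
print(is_well_formed(escaped))  # True
```

The unescaped variant fails for the same reason as the SAXParseException above: the parser sees a br start tag that is never closed.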
salu2
--
thorsten
"Together we stand, divided we fall!"
Hey you (Pink Floyd)
[OT] Re: [Proposal] remove @disable-output-escaping from rssissues-to-document.xsl
Posted by "Gav...." <br...@brightontown.com.au>.
Apologies for the long paste here; it is from a thread in a members-only forum.
First, I thought its contents might help to provide a clue. Second, if anyone has
time for any hints for the poster, I will pass them on with credits. Thanks.
----------------------------------------------------------------------------------
I'm playing around with the XSLTProcessor class to transform an XML document
into HTML. The source XML document employs a CDATA section to contain some
HTML content that I literally want to dump as-is into the resulting HTML
document. In case you're wondering, it's for a very basic XML CMS I'm
tinkering with - just for practice really.
Here's the input XML:
<?xml version="1.0"?>
<webcopy>
<title>Home</title>
<description>Our homepage</description>
<content><![CDATA[
<h1>Welcome to our XML CMS</h1>
<p>Easy to use XML based CMS</p>
]]></content>
</webcopy>
The relevant portion of the XSL style-sheet is:
<xsl:value-of select="content" disable-output-escaping="yes"/>
The problem is with the output escaping. From a command line, msxsl and
XMLStarlet (which I believe is based on libxslt) work as intended, and
honour the disable-output-escaping attribute. When applying the same
style-sheet in PHP using the transformToXML() method of the XSLTProcessor
class, the contents of the CDATA section are escaped no matter what. What I
want is this:
<div id="content">
<h1>Welcome to our XML CMS</h1>
<p>Easy to use XML based CMS</p>
</div>
What I get is this:
<div id="content">
&lt;h1&gt;Welcome to our XML CMS&lt;/h1&gt;
&lt;p&gt;Easy to use XML based CMS&lt;/p&gt;
</div>
If I'm reading the XSL specs correctly, and there's every chance that I'm
not, an XSLT Processor is not required to honour the disable-output-escaping
attribute. If that's true, then I need to find an alternative solution
anyway, regardless of the apparent shortcomings of PHP/libxslt. So the
question is, has anybody got any suggestions for another way of achieving
the desired result?
In case it helps, I'm using xampp with:
PHP 5.1.1
libxslt 1.1.15
libxml 2.6.22
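One possible alternative, sketched here in Python rather than PHP and only under the assumption that the HTML stored in the CDATA section is itself well-formed XML: parse the CDATA text as a fragment and graft the resulting nodes into the output tree, so that nothing depends on disable-output-escaping. The function name content_as_div is hypothetical.

```python
import xml.etree.ElementTree as ET

source = """<webcopy>
<title>Home</title>
<description>Our homepage</description>
<content><![CDATA[
<h1>Welcome to our XML CMS</h1>
<p>Easy to use XML based CMS</p>
]]></content>
</webcopy>"""


def content_as_div(xml_text):
    """Return the CDATA payload of <content> as a parsed <div> element."""
    doc = ET.fromstring(xml_text)
    # To the parser the CDATA section is plain character data.
    fragment = doc.findtext("content")
    # Wrap the fragment in a single root element so it can be parsed on
    # its own. This raises ET.ParseError if the fragment is not well-formed.
    return ET.fromstring('<div id="content">' + fragment + "</div>")


div = content_as_div(source)
print(ET.tostring(div, encoding="unicode"))
```

Because the result is a node tree rather than a string, any serializer can write it out without having to honour disable-output-escaping.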
One reply to this post was:
This might not be an ideal solution, but have you considered defining the
source document so that the content element can take all the elements that,
say, an XHTML div element can? I have done this in the past with a similar
idea for content management; at the time I did not understand CDATA, so this
was my solution for putting HTML into an XML file.
So you have the schema
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
elementFormDefault="qualified" attributeFormDefault="unqualified">
<xs:element name="webcopy">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="description" type="xs:string"/>
<xs:element name="content" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
So you end up with this
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:xhtml="http://www.w3.org/1999/xhtml" elementFormDefault="qualified"
attributeFormDefault="unqualified">
<xs:import namespace="http://www.w3.org/1999/xhtml"
schemaLocation="xhtml.xsd" />
<xs:element name="webcopy">
<xs:complexType>
<xs:sequence>
<xs:element name="title" type="xs:string"/>
<xs:element name="description" type="xs:string"/>
<xs:element name="content" type="xhtml:div"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
Sorry I can't be of much more help, but I can't find a schema for XHTML. The
official one looks like this:
<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "http://www.w3.org/2001/XMLSchema.dtd">
<schema xmlns="http://www.w3.org/2001/XMLSchema"
targetNamespace="http://www.w3.org/1999/xhtml">
<annotation>
<documentation>
Someday a schema for XHTML will live here
</documentation>
</annotation>
</schema>
----------------------------------------------------------------------------------
Gav...
[Proposal] remove @disable-output-escaping from rssissues-to-document.xsl
Posted by Thorsten Scherler <th...@apache.org>.
Hi all,
I would like to remove @disable-output-escaping from
forrest-trunk/main/webapp/resources/stylesheets/rssissues-to-document.xsl
since it produces XML that is not well-formed, and leaving it out makes no difference to the output.
See http://forrest.apache.org/forrest-issues.html#%5BFOR-546%5D+Sitemap+reference+doc+should+be+updated+to+reflect+plugin+architecture
and find:
"...
completeness. <br> <br> <map:components> <br>
<map:serializers> <br>
<map:serializer
name="fo2pdf" <br>
..."
See below for more information.
Lazy consensus is active.
salu2
On Wed, 08-02-2006 at 08:21 +0100, Thorsten Scherler wrote:
> On Wed, 08-02-2006 at 15:08 +1100, David Crossley wrote:
> > Thorsten Scherler wrote:
> > >
> > > is there a reason why
> > > http://localhost:8888/forrest-issues.xml on site-author
> > > produces not well-formed markup?
> >
> > It is Jira RSS providing this. See the project.issues-rss-url
> > in site-author/forrest.properties file.
>
> See further down ;-)
>
> <map:match pattern="forrest-issues.xml">
> <!--getting jira rss (valid xml) -->
> <!-- e.g.
> http://issues.apache.org/jira/secure/IssueNavigator.jspa?view=rss&pid=12310000&fixfor=12310040&resolutionIds=-1&sorter/field=priority&sorter/order=DESC&tempMax=25&reset=true&decorator=none
> -->
> <map:generate type="file" src="{lm:forrest.issues-rss-url}" />
> <!--transform it-->
> <map:transform src="{lm:transform.rssissues.document}" />
> <map:serialize type="xml-document"/>
> </map:match>
>
>
> > Each Issue Description
> > is wrapped in a CDATA section. Perhaps our stylesheet does
> > not handle that properly.
>
> That depends on how you define "properly" (well-formed vs.
> well-presented).
>
> Forrest gets something like:
> <description><![CDATA[html-to-document.xsl no longer converts content to
> an XDoc. Instead it renders converts documents to XDoc, instead it
> allows H1, H2 etc. elements to pass through.
> <br>
>
> <br>
> The result is a page that seems to render correctly and in the single test case I have used it still renders correctly in PDF and Text format. However, this is a backward incompatible change that will break sites that use includes with XPath statements such as /section[@id="foo"] (sections are no longer created)
> <br>
>
> <br>
> ]]></description>
>
> Then
> forrest-trunk/main/webapp/resources/stylesheets/rssissues-to-document.xsl
> ...
> <xsl:value-of select="description" disable-output-escaping="yes" />
> ...
> will transform that as markup when @disable-output-escaping="yes". If
> you remove @disable-output-escaping then it will be transformed to
> &lt;br&gt;
>
> That loses the markup information but is well-formed markup. I prefer
> well-formed over well-presented, but best would be both. ;-)
>
> I am unsure how to fix that. Does anybody have an idea?
>
> salu2
>
--
thorsten
"Together we stand, divided we fall!"
Hey you (Pink Floyd)
Re: Why does forrest-issues.xml produce not wellformed xml?
Posted by Thorsten Scherler <th...@apache.org>.
On Wed, 08-02-2006 at 15:08 +1100, David Crossley wrote:
> Thorsten Scherler wrote:
> >
> > is there a reason why
> > http://localhost:8888/forrest-issues.xml on site-author
> > produces not well-formed markup?
>
> It is Jira RSS providing this. See the project.issues-rss-url
> in site-author/forrest.properties file.
See further down ;-)
<map:match pattern="forrest-issues.xml">
<!--getting jira rss (valid xml) -->
<!-- e.g.
http://issues.apache.org/jira/secure/IssueNavigator.jspa?view=rss&pid=12310000&fixfor=12310040&resolutionIds=-1&sorter/field=priority&sorter/order=DESC&tempMax=25&reset=true&decorator=none
-->
<map:generate type="file" src="{lm:forrest.issues-rss-url}" />
<!--transform it-->
<map:transform src="{lm:transform.rssissues.document}" />
<map:serialize type="xml-document"/>
</map:match>
> Each Issue Description
> is wrapped in a CDATA section. Perhaps our stylesheet does
> not handle that properly.
That depends on how you define "properly" (well-formed vs.
well-presented).
Forrest gets something like:
<description><![CDATA[html-to-document.xsl no longer converts content to
an XDoc. Instead it renders converts documents to XDoc, instead it
allows H1, H2 etc. elements to pass through.
<br>
<br>
The result is a page that seems to render correctly and in the single test case I have used it still renders correctly in PDF and Text format. However, this is a backward incompatible change that will break sites that use includes with XPath statements such as /section[@id="foo"] (sections are no longer created)
<br>
<br>
]]></description>
Then
forrest-trunk/main/webapp/resources/stylesheets/rssissues-to-document.xsl
...
<xsl:value-of select="description" disable-output-escaping="yes" />
...
will transform that as markup when @disable-output-escaping="yes". If
you remove @disable-output-escaping then it will be transformed to
&lt;br&gt;
That loses the markup information but is well-formed markup. I prefer
well-formed over well-presented, but best would be both. ;-)
I am unsure how to fix that. Does anybody have an idea?
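One way to get both might be to repair the HTML fragment before it is written out as markup. The following is only a sketch in Python, not a patch for rssissues-to-document.xsl: it uses the standard html.parser module to re-serialise a loose fragment, writing void elements such as <br> in the self-closing form. The class name XmlTidy, the VOID set and the sample text are assumptions for illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that HTML allows without an end tag (partial list, an assumption).
VOID = {"br", "hr", "img"}


class XmlTidy(HTMLParser):
    """Re-serialise a loose HTML fragment with self-closed void elements."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        rendered = "".join(
            " %s=%s" % (name, quoteattr(value or "")) for name, value in attrs
        )
        closer = "/>" if tag in VOID else ">"
        self.out.append("<" + tag + rendered + closer)

    def handle_endtag(self, tag):
        # Void elements were already closed in handle_starttag.
        if tag not in VOID:
            self.out.append("</" + tag + ">")

    def handle_data(self, data):
        # Character data is escaped again so that & and < stay legal.
        self.out.append(escape(data))


def tidy(fragment):
    parser = XmlTidy()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


# Hypothetical excerpt of an issue description.
description = "elements to pass through.\n<br>\n<br>\nThe result is a page"
repaired = tidy(description)
ET.fromstring("<p>" + repaired + "</p>")  # parses without raising
```

The sketch only handles void elements. It does not close other unclosed elements or bind namespace prefixes, so pasted markup such as <map:components> would still break well-formedness and would have to stay escaped.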
salu2
--
thorsten
"Together we stand, divided we fall!"
Hey you (Pink Floyd)
Re: Why does forrest-issues.xml produce not wellformed xml?
Posted by David Crossley <cr...@apache.org>.
Thorsten Scherler wrote:
>
> is there a reason why
> http://localhost:8888/forrest-issues.xml on site-author
> produces not well-formed markup?
It is Jira RSS providing this. See the project.issues-rss-url
in site-author/forrest.properties file. Each Issue Description
is wrapped in a CDATA section. Perhaps our stylesheet does
not handle that properly.
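A small Python sketch of that point, using a made-up and heavily trimmed stand-in for one feed item (the issue key FOR-000 and the text are hypothetical): to an XML parser the CDATA section is plain character data, so the description element has no child elements and the <br> is just text.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for one item of the Jira RSS feed.
rss = """<rss><channel><item>
<title>[FOR-000] Example issue</title>
<description><![CDATA[First paragraph.<br>
<br>
Second paragraph.]]></description>
</item></channel></rss>"""

description = ET.fromstring(rss).find("channel/item/description")

print(len(description))            # 0
print("<br>" in description.text)  # True
```

So the feed itself is well-formed; the markup only becomes a problem once a stylesheet writes that text out unescaped.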
-David