Posted to users@cocoon.apache.org by Andreas Hartmann <an...@apache.org> on 2007/05/10 16:37:28 UTC
Un-Escaping XML in transformer
Hi Cocooners,
I have a SAX stream containing fragments of escaped XML, e.g.
&lt;p&gt; this is a &lt;a href="..."&gt;link&lt;/a&gt; &lt;/p&gt;
and want to convert the characters into SAX events:
<p> this is a <a href="...">link</a> </p>
I collect and assemble the character events, but I don't know how
to parse the resulting string and generate SAX events without
too much effort.
I tried StringXMLizable and XMLByteStreamInterpreter, but ran
into problems because contentHandler.startElement() is called
or the prolog is not correct.
What's the best way to do this?
TIA for any pointers!
-- Andreas
--
Andreas Hartmann, CTO
BeCompany GmbH
http://www.becompany.ch
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@cocoon.apache.org
For additional commands, e-mail: users-help@cocoon.apache.org
Re: Un-Escaping XML in transformer
Posted by Andreas Hartmann <an...@apache.org>.
Grzegorz Kossakowski wrote:
> Andreas Hartmann wrote:
>> Hi Cocooners,
>>
>> I have a SAX stream containing fragments of escaped XML, e.g.
>>
>> &lt;p&gt; this is a &lt;a href="..."&gt;link&lt;/a&gt; &lt;/p&gt;
>>
>> and want to convert the characters into SAX events:
>>
>> <p> this is a <a href="...">link</a> </p>
>>
>> I collect and assemble the character events, but I don't know how
>> to parse the resulting string and generate SAX events without
>> too much effort.
>>
>> I tried StringXMLizable and XMLByteStreamInterpreter, but ran
>> into problems because contentHandler.startElement() is called
>> or the prolog is not correct.
>>
>> What's the best way to do this?
>>
>> TIA for any pointers!
>
> I fear that your only option is to serialize the XML, replace all escaped
> characters and parse it again. Serializing and parsing are really easy,
> even inside a transformer.
Thanks!
Here's something that doesn't look nice, but basically seems to work:
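// Wrap the buffered text in a namespaced root element so it parses as a document.
String string = "<unescape:wrap xmlns:unescape=\"http://apache.org/lenya/unescape/1.0\">"
        + this.buffer.toString() + "</unescape:wrap>";
StringXMLizable xml = new StringXMLizable(string);
// The FragmentHandler forwards the parsed events to the transformer's content handler.
FragmentHandler fragmentHandler = new FragmentHandler(this.contentHandler);
xml.toSAX(fragmentHandler);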
The FragmentHandler filters the startDocument() and endDocument() events
and the start/end events for the <wrap> element.
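The filtering described above might look roughly like the sketch below. It is not the actual FragmentHandler from the message: the class name FragmentHandlerSketch, the helper eventsFor() and the recording handler are hypothetical, and the JDK SAX parser stands in for StringXMLizable so that the example runs without Cocoon on the classpath.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical stand-in for the FragmentHandler described in the message. */
class FragmentHandlerSketch extends XMLFilterImpl {
    static final String NS = "http://apache.org/lenya/unescape/1.0";
    private String wrapperPrefix; // prefix bound to NS, remembered so its end event can be dropped

    FragmentHandlerSketch(ContentHandler target) {
        setContentHandler(target);
    }

    // Document events are swallowed: the target is already inside a document.
    @Override public void startDocument() { }
    @Override public void endDocument() { }

    // The wrapper's namespace declaration is dropped as well.
    @Override
    public void startPrefixMapping(String prefix, String uri) throws SAXException {
        if (NS.equals(uri)) { wrapperPrefix = prefix; return; }
        super.startPrefixMapping(prefix, uri);
    }

    @Override
    public void endPrefixMapping(String prefix) throws SAXException {
        if (prefix.equals(wrapperPrefix)) return;
        super.endPrefixMapping(prefix);
    }

    // Start and end events of the wrapper element are dropped; everything else is forwarded.
    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (!isWrapper(uri, local)) super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (!isWrapper(uri, local)) super.endElement(uri, local, qName);
    }

    private static boolean isWrapper(String uri, String local) {
        return NS.equals(uri) && "wrap".equals(local);
    }

    /** Demo helper (hypothetical): wraps and parses a fragment, returns the events the target sees. */
    static List<String> eventsFor(String fragment) {
        List<String> events = new ArrayList<>();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startDocument() { events.add("startDocument"); }
            @Override public void endDocument() { events.add("endDocument"); }
            @Override public void startElement(String u, String l, String q, Attributes a) {
                events.add("start:" + l);
            }
            @Override public void endElement(String u, String l, String q) {
                events.add("end:" + l);
            }
        };
        String wrapped = "<unescape:wrap xmlns:unescape=\"" + NS + "\">"
                + fragment + "</unescape:wrap>";
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // needed so uri and local name are populated
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setContentHandler(new FragmentHandlerSketch(recorder));
            reader.parse(new InputSource(new StringReader(wrapped)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return events;
    }
}
```

The wrapper element is what allows a fragment with several top-level nodes, or with leading text, to be parsed as a well-formed document in the first place.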
I'll do some more testing.
-- Andreas
Re: Un-Escaping XML in transformer
Posted by Grzegorz Kossakowski <gk...@apache.org>.
Andreas Hartmann wrote:
> Hi Cocooners,
>
> I have a SAX stream containing fragments of escaped XML, e.g.
>
> &lt;p&gt; this is a &lt;a href="..."&gt;link&lt;/a&gt; &lt;/p&gt;
>
> and want to convert the characters into SAX events:
>
> <p> this is a <a href="...">link</a> </p>
>
> I collect and assemble the character events, but I don't know how
> to parse the resulting string and generate SAX events without
> too much effort.
>
> I tried StringXMLizable and XMLByteStreamInterpreter, but ran
> into problems because contentHandler.startElement() is called
> or the prolog is not correct.
>
> What's the best way to do this?
>
> TIA for any pointers!
I fear that your only option is to serialize the XML, replace all escaped
characters and parse it again. Serializing and parsing are really easy,
even inside a transformer.
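As a rough illustration of the replace-and-reparse idea, the sketch below starts from an already serialized string, undoes one level of escaping and parses the result with the JDK SAX parser. The class ReparseSketch and its methods are hypothetical names, and the serialization step itself is left out. The replacement is deliberately blunt: it touches every escaped character in the string, so it is only safe when all escaped content is meant to become markup.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical helper illustrating "replace escaped characters, then parse again". */
class ReparseSketch {

    /** Undoes one level of escaping; "&amp;" goes last so "&amp;lt;" becomes "&lt;", not "<". */
    static String unescapeOnce(String serialized) {
        return serialized
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&amp;", "&");
    }

    /** Re-parses the unescaped text and returns the element names in document order. */
    static List<String> elementNames(String serialized) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                    new InputSource(new StringReader(unescapeOnce(serialized))),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String local, String qName,
                                                 Attributes atts) {
                            names.add(qName);
                        }
                    });
        } catch (Exception e) {
            // The unescaped text is not guaranteed to be well-formed XML.
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

In a real transformer it would probably be safer to apply the replacement only to the buffered text of the element that is known to carry escaped markup, rather than to the whole serialized document.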
--
Grzegorz Kossakowski
http://reflectingonthevicissitudes.wordpress.com/