Posted to j-users@xerces.apache.org by Chris Bowditch <bo...@hotmail.com> on 2013/05/28 18:41:33 UTC
Problem with serializing Text Data
Hi All,
I've been searching JIRA for any issue with serializing text data that
contains the CDATA keyword (but is not a fully formed CDATA section). I
couldn't find one, so I'm posting here, before I start debugging the
Serializer code, to see if anyone has seen this issue.
In the input XML we have the following text node:
<value><![CDATA[-1]]></value>
Our application is using Xerces to parse this XML, and it's correctly
recognized as a character event. If I try to serialize this same
character event, the resulting XML ends up like:
<value><![CDATA[-1]]]]><![CDATA[></value>
This looks wrong to me and results in a malformed XML file.
Any input would be welcomed.
Thanks,
Chris
---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org
Re: Problem with serializing Text Data
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Chris,
I would generally expect any XML serializer to escape the '<' and '>'
characters that appear in textual content when writing to an OutputStream.
If they're not being escaped that is odd.
Thanks.
Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Chris Bowditch <bo...@hotmail.com> wrote on 05/29/2013 05:25:42
AM:
> Hi Michael,
>
> Thanks for your reply. The code to serialize is fairly straightforward:
>
> Properties serializerProperties = properties;
> serializerProperties =
>     OutputPropertiesFactory.getDefaultMethodProperties(Method.XML);
> Serializer serializer =
>     SerializerFactory.getSerializer(serializerProperties);
> serializer.setOutputStream(m_out);
>
> and then serializer.asContentHandler() is called to get a content
> handler, and the SAX events from the parsing chain are tied to that. I've
> used a debugger to examine the SAX events, and the text in question is
> treated as a characters event all the way through. There are no calls to
> startCDATA()/endCDATA() around this text.
>
> The XML itself is provided by a customer of mine. The mixture of escaped
> and unescaped characters in the CDATA definition is very unusual,
> although still well formed. I don't know why my customer has
> chosen to use such a strange sequence.
>
> Thanks,
>
> Chris
Re: Problem with serializing Text Data
Posted by Chris Bowditch <bo...@hotmail.com>.
Hi Michael,
Thanks for your reply. The code to serialize is fairly straightforward:

Properties serializerProperties = properties;
serializerProperties =
    OutputPropertiesFactory.getDefaultMethodProperties(Method.XML);
Serializer serializer =
    SerializerFactory.getSerializer(serializerProperties);
serializer.setOutputStream(m_out);

and then serializer.asContentHandler() is called to get a content
handler, and the SAX events from the parsing chain are tied to that. I've
used a debugger to examine the SAX events, and the text in question is
treated as a characters event all the way through. There are no calls to
startCDATA()/endCDATA() around this text.
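
For anyone wanting to reproduce the scenario without the full parsing
chain, here is a minimal sketch of the same kind of setup. It is not the
code from this application: it swaps in the JDK's built-in
TransformerHandler (an identity transform) in place of the
org.apache.xml.serializer.Serializer used above, and the class and method
names (CharactersEscapeSketch, serialize) are made up for illustration.
It fires a single characters event containing the CDATA-like text, with
no startCDATA()/endCDATA() calls, so one would expect the '<' to come out
escaped rather than as a real CDATA section.

```java
import java.io.StringWriter;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import org.xml.sax.helpers.AttributesImpl;

// Hypothetical helper class, not part of Xerces or Xalan.
class CharactersEscapeSketch {

    // Serializes <value>text</value> by firing SAX events by hand.
    // The text is delivered as a plain characters event; no
    // startCDATA()/endCDATA() calls are made.
    static String serialize(String text) throws Exception {
        SAXTransformerFactory factory =
            (SAXTransformerFactory) TransformerFactory.newInstance();
        // With no stylesheet, the handler performs an identity transform,
        // i.e. it just serializes the SAX events it receives.
        TransformerHandler handler = factory.newTransformerHandler();
        handler.getTransformer()
               .setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        StringWriter out = new StringWriter();
        handler.setResult(new StreamResult(out));

        handler.startDocument();
        handler.startElement("", "value", "value", new AttributesImpl());
        handler.characters(text.toCharArray(), 0, text.length());
        handler.endElement("", "value", "value");
        handler.endDocument();
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // The text node from this thread, sent as character data.
        System.out.println(serialize("<![CDATA[-1]]>"));
    }
}
```

If the output of a sketch like this is well formed but the Serializer
output is not, that would point at the serializer configuration or code
rather than at the SAX events themselves.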
The XML itself is provided by a customer of mine. The mixture of escaped
and unescaped characters in the CDATA definition is very unusual,
although still well formed. I don't know why my customer has
chosen to use such a strange sequence.
Thanks,
Chris
On 28/05/2013 18:06, Michael Glavassevich wrote:
> Hi Chris,
>
> What XML API are you using for serializing your document?
>
> A code snippet showing what you did might help.
>
> Thanks.
>
> Michael Glavassevich
> XML Technologies and WAS Development
> IBM Toronto Lab
> E-mail: mrglavas@ca.ibm.com
> E-mail: mrglavas@apache.org
Re: Problem with serializing Text Data
Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Chris,
What XML API are you using for serializing your document?
A code snippet showing what you did might help.
Thanks.
Michael Glavassevich
XML Technologies and WAS Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org
Chris Bowditch <bo...@hotmail.com> wrote on 05/28/2013 12:41:33
PM:
> Hi All,
>
> I've been searching JIRA for any issue with serializing text data that
> contains the CDATA keyword (but is not a fully formed CDATA section). I
> couldn't find one, so I'm posting here, before I start debugging the
> Serializer code, to see if anyone has seen this issue.
>
> In the input XML we have the following text node:
>
> <value><![CDATA[-1]]></value>
>
> Our application is using Xerces to parse this XML, and it's correctly
> recognized as a character event. If I try to serialize this same
> character event, the resulting XML ends up like:
>
> <value><![CDATA[-1]]]]><![CDATA[></value>
>
> This looks wrong to me and results in a malformed XML file.
>
> Any input would be welcomed.
>
> Thanks,
>
> Chris