Posted to j-users@xerces.apache.org by Mark Kurley <MK...@p21.com> on 2002/02/11 17:11:49 UTC

CDATA - whitespace

I am having a problem with the Xerces parser adding newline characters to
the beginning and end of the data in a CDATA section.  I have a system that
creates an XML document with a CDATA section and sends it to another system
for processing over JMS.  The other system parses the document with the DOM
and pulls the data out of the CDATA section.  The problem is that the data
comes back with two newline characters at both the beginning and the end.

This is part of the XML document that I created on the originating system.

	<UPDATEITEM>
         		<![CDATA[item1	something]]>
	</UPDATEITEM>

When the receiving system pulls the data out, it is prefixed and suffixed
with two newline characters.  It would be easy to trim() the string, but I
don't want to assume that I should always trim the CDATA section, because
sometimes that will remove one newline character too many.

Any advice would be appreciated.

Thanks
-mark


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: CDATA - whitespace

Posted by Andy Clark <an...@apache.org>.
Mark Kurley wrote:
> This is part of the xml document that I created on the originating system.
> 
>         <UPDATEITEM>
>                         <![CDATA[item1  something]]>
>         </UPDATEITEM>
> 
> When the receiving system pulls the data out it is prefixed and suffixed
> with two newline characters.  It would be easy to trim() the string but I

I'm using your data and I do not get the "inserted" newline
characters that you are talking about. Here is the output from
the sax.Writer sample program:

setDocumentLocator(locator=org.apache.xerces.parsers.AbstractSAXParser$LocatorProxy@93dcd)
startDocument()

startElement(uri="",localName="UPDATEITEM",qname="UPDATEITEM",attributes={})
  characters(text="\n   ")
  startCDATA()
   characters(text="item1  something")
  endCDATA()
  characters(text="\n")
 endElement(uri="",localName="UPDATEITEM",qname="UPDATEITEM")
endDocument()

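If you are referring to the characters call before the
startCDATA method and the characters call after the endCDATA
method, then this is not incorrect behavior; it is to be
expected. The newlines and indentation between the UPDATEITEM
tags and the CDATA section are part of the element's content,
so the parser reports them as separate character data.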

If you do not want the newlines in your data, then generate 
your document without the leading and trailing newlines. For
example:

  <UPDATEITEM><![CDATA[item1  something]]></UPDATEITEM>
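
If the sending side cannot be changed, one possible alternative to
trim() on the receiving side is to read only the CDATA section nodes
and skip the whitespace text nodes around them. The following is a
minimal sketch, not a definitive implementation. It assumes a JAXP
DocumentBuilder (Xerces or any other implementation) with coalescing
left off, so that CDATA sections stay separate nodes. The names
CdataOnly and extractCdata are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class CdataOnly {

    /**
     * Parses the XML string and returns the concatenated contents of the
     * CDATA sections that are direct children of the root element.
     * Plain text nodes (such as the surrounding newlines) are skipped.
     */
    public static String extractCdata(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Assumption: coalescing stays off (the JAXP default), so the
            // parser does not merge CDATA sections into adjacent text nodes.
            factory.setCoalescing(false);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                                  .getDocumentElement();

            StringBuilder data = new StringBuilder();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.CDATA_SECTION_NODE) {
                    data.append(n.getNodeValue());
                }
            }
            return data.toString();
        } catch (Exception e) {
            // Wrapped so callers in this sketch need no checked-exception handling.
            throw new IllegalStateException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<UPDATEITEM>\n    <![CDATA[item1\tsomething]]>\n</UPDATEITEM>";
        // Brackets make any leading or trailing whitespace visible.
        System.out.println("[" + extractCdata(xml) + "]");
    }
}
```

Because only CDATA_SECTION_NODE children are appended, whitespace inside
the CDATA section itself is kept untouched, while the indentation
between the tags is ignored.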

-- 
Andy Clark * andyc@apache.org
