Posted to j-users@xerces.apache.org by Cirip Tomas <to...@theimo.com> on 2003/06/30 22:13:26 UTC

SAX Parser and CDATA

Hi all,

I have an XML file that I parse with a SAX parser. Why does the SAX parser
parse the content of a CDATA section? I am using the parser's default
configuration. All I need to do is get the content of the CDATA section and
dump it to a file, all using the SAX parser. Any suggestions? Thanks

<?xml version = '1.0' encoding = 'UTF-8'?>
<test:root xmlns:test="urn:mytest">
   <test:action><![CDATA[<?xml version="1.0"?>
      <Employee>
         <NAME>Steve Smith</NAME>
         <FLIGHT>
            <FLIGHT_ROW num="1">
               <FLIGHTID>190021</FLIGHTID>
               <FLIGHTNUMBER>UA85</FLIGHTNUMBER>
               <DEPART_FROM>SFO</DEPART_FROM>
               <ARRIVE_TO>JFK</ARRIVE_TO>
               <SCHEDULED_DATE>11/18/2098 0:0:0</SCHEDULED_DATE>
               <SCHEDULED_TIME>17:05</SCHEDULED_TIME>
               <EQUIPMENT>12312231</EQUIPMENT>
               <CREW>8001</CREW>
            </FLIGHT_ROW>
            <FLIGHT_ROW num="2">
               <FLIGHTID>19828</FLIGHTID>
               <FLIGHTNUMBER>UA86</FLIGHTNUMBER>
               <DEPART_FROM>JFK</DEPART_FROM>
               <ARRIVE_TO>SFO</ARRIVE_TO>
               <SCHEDULED_DATE>11/18/2098 0:0:0</SCHEDULED_DATE>
               <SCHEDULED_TIME>21:00</SCHEDULED_TIME>
               <EQUIPMENT>5534222</EQUIPMENT>
               <CREW>8002</CREW>
            </FLIGHT_ROW>
         </FLIGHT>
      </Employee>]]>
   </test:action>
</test:root>

---
Tomas Cirip



*----------------------------------------
This message is intended only for the use of the intended recipients, and it may be privileged and confidential. If you are not the intended recipient, you are hereby notified that any review, retransmission, conversion to hard copy, copying, circulation or other use of this message is strictly prohibited. If you are not the intended recipient, please notify me immediately by return e-mail, and delete this message from your system.
*----------------------------------------


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: SAX Parser and CDATA

Posted by Michael Glavassevich <mr...@apache.org>.
Hi Tomas,

There seem to be articles floating around the net which claim that CDATA
sections are 'ignored' by XML parsers. One such article is at:
http://www.w3schools.com/xml/xml_cdata.asp. It contains quite a bit of
erroneous information, including the claim that CDATA sections are
ignored. CDATA sections are parsed. They are not ignored (or skipped).

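CDATA sections are identical to regular character data, except that you're
allowed to include '<' and '&' in these sections without having to escape
them. In fact you cannot escape any characters in CDATA: entity and character
references are not recognized there, so '&lt;' is reported as those four
literal characters. The only thing a CDATA section cannot contain is the
']]>' sequence that ends it.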
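
A small sketch of that equivalence, using the JAXP SAX parser that ships with
the JDK. The class name CdataEquivalence and the helper textOf are made up for
this illustration; they are not part of Xerces or SAX.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of any library.
class CdataEquivalence {

    /** Returns all character data the parser reports for the given document. */
    static String textOf(String xml) throws Exception {
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times; append every chunk.
                text.append(ch, start, length);
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        String viaCdata = textOf("<a><![CDATA[<b> & c]]></a>");
        String viaEscapes = textOf("<a>&lt;b&gt; &amp; c</a>");
        // Both forms reach the application as the same characters.
        System.out.println(viaCdata.equals(viaEscapes)); // prints true
        // Inside CDATA a reference is not expanded; it stays literal text.
        System.out.println(textOf("<a><![CDATA[&lt;]]></a>")); // prints &lt;
    }
}
```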

If you've registered a ContentHandler with the parser, the content of the
CDATA section will be reported in the characters callback. You'll have to
buffer the text yourself, as the parser is free to split the section into
multiple callbacks for performance reasons.
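
Something along these lines should work as a starting point. It is only a
sketch: the names CdataDump, TextCollector and extract, and the output file
name action-payload.xml, are invented for the example, and it assumes the
target element (test:action in your document) is not nested inside itself.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, not part of Xerces or SAX.
class CdataDump {

    /** Buffers every characters() chunk reported inside one target element. */
    static final class TextCollector extends DefaultHandler {
        private final String targetUri;
        private final String targetLocalName;
        private final StringBuilder buffer = new StringBuilder();
        private boolean inTarget; // assumes the target element does not nest

        TextCollector(String targetUri, String targetLocalName) {
            this.targetUri = targetUri;
            this.targetLocalName = targetLocalName;
        }

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if (targetUri.equals(uri) && targetLocalName.equals(localName)) {
                inTarget = true;
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if (targetUri.equals(uri) && targetLocalName.equals(localName)) {
                inTarget = false;
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may split one CDATA section across several calls.
            if (inTarget) {
                buffer.append(ch, start, length);
            }
        }

        String text() {
            return buffer.toString();
        }
    }

    /** Parses the document and returns the text content of the target element. */
    static String extract(String xml, String uri, String localName)
            throws ParserConfigurationException, SAXException, IOException {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // needed to match on URI + local name
        XMLReader reader = factory.newSAXParser().getXMLReader();
        TextCollector collector = new TextCollector(uri, localName);
        reader.setContentHandler(collector);
        reader.parse(new InputSource(new StringReader(xml)));
        return collector.text();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<test:root xmlns:test=\"urn:mytest\">"
                + "<test:action><![CDATA[<Employee><NAME>Steve Smith</NAME></Employee>]]></test:action>"
                + "</test:root>";
        String payload = extract(xml, "urn:mytest", "action");
        // Output file name is an assumption made for the example.
        Files.write(Paths.get("action-payload.xml"), payload.getBytes(StandardCharsets.UTF_8));
    }
}
```

Note that the buffer holds all character data of the element, so for your
document it also picks up the newline and spaces between ']]>' and
'</test:action>'. If you need the exact boundaries of the CDATA section, the
optional org.xml.sax.ext.LexicalHandler interface (registered through the
http://xml.org/sax/properties/lexical-handler property) reports startCDATA and
endCDATA events.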

Hope that helps.


--------------------
Michael Glavassevich
mrglavas@apache.org
