
Posted to c-users@xerces.apache.org by minggi <ar...@townux.ch> on 2008/11/11 09:32:46 UTC

Ignore Whitespace

I tried to parse the following XML:
<person>
	<name>test</name>
	<tel>test</tel>
	<city>test</city>
</person>

Calling getFirstChild on the 'person' node returns a #text node: the parser
adds a text node that contains only whitespace.
I tried both setIgnoringElementContentWhitespace(true) and
setIgnoringElementContentWhitespace(false), but neither makes a difference.

Do you have an example for me?

thanks
-- 
View this message in context: http://www.nabble.com/Ignore-Whitespace-tp20435351p20435351.html
Sent from the Xerces - C - Users mailing list archive at Nabble.com.

Re: Ignore Whitespace

Posted by Lucian Cosoi <lu...@gmail.com>.

2008/11/11 minggi

> Thanks for the answer.
>
> Is there no simple possibility to solve that problem with the Parser?
>
> Minggi
>
>
Whether this is a problem with the parser is debatable: on its own it can't
know _which_ whitespace can be ignored, and some of it may matter to the caller.
I believe that if you specify a schema to validate the XML against, the parser
will properly ignore the extra whitespace, provided you set the "ignore"
property.

I haven't tried it myself, but that is how I expect the library to behave.

Re: Ignore Whitespace

Posted by minggi <ar...@townux.ch>.



Lucian Cosoi wrote:
> 
> I found checking the node type when unserializing satisfactory:
> 
> if (currentNode->getNodeType() == DOMNode::TEXT_NODE        /* empty text node / newline */
>     || currentNode->getNodeType() == DOMNode::COMMENT_NODE) /* comment */
> {
>     // skip uninteresting nodes
>     continue;
> }
> 
> Best regards,
> Lucian
> 
> 

Thanks for the answer.

Is there no simple way to solve that problem in the parser itself?

Minggi


Re: Ignore Whitespace

Posted by Lucian Cosoi <lu...@gmail.com>.

2008/11/11 minggi

>
> I tried to parse the following xml:
> <person>
>        <name>test</name>
>        <tel>test</tel>
>        <city>test</city>
> </person>
>
> The getFirstChild on the 'person' node returns a #text node. The parser
> adds
> a empty text node.
> I tried with setIgnoringElementContentWhitespace(true) and
> setIgnoringElementContentWhitespace(false)... nothing works.
>
> Have you an example for me?
>
> thanks
> --
> View this message in context:
> http://www.nabble.com/Ignore-Whitespace-tp20435351p20435351.html
> Sent from the Xerces - C - Users mailing list archive at Nabble.com.
>

I found that checking the node type while deserializing works well enough:

if (currentNode->getNodeType() == DOMNode::TEXT_NODE        /* empty text node / newline */
    || currentNode->getNodeType() == DOMNode::COMMENT_NODE) /* comment */
{
    // skip uninteresting nodes
    continue;
}
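
One caveat with the check above: it skips every text node, including ones that hold real character data. A narrower variant skips a text node only when its value is all whitespace. The following is a minimal, library-independent sketch of that idea; Node, NodeKind, isXmlWhitespaceOnly, isSkippable and significantChildren are hypothetical stand-ins invented for illustration and are not Xerces-C types or functions. With Xerces-C the same test would presumably be applied to the value returned by getNodeValue().

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node; these are NOT Xerces-C types.
enum class NodeKind { Element, Text, Comment };

struct Node {
    NodeKind kind;
    std::string value;  // tag name for elements, character data otherwise
};

// True if the string contains nothing but XML whitespace
// (space, tab, carriage return, line feed). An empty string also counts.
bool isXmlWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// True for nodes a deserializer can usually skip: comments and
// text nodes that consist only of whitespace.
bool isSkippable(const Node& n) {
    if (n.kind == NodeKind::Comment) {
        return true;
    }
    return n.kind == NodeKind::Text && isXmlWhitespaceOnly(n.value);
}

// Returns the children that carry content, in document order.
std::vector<Node> significantChildren(const std::vector<Node>& children) {
    std::vector<Node> kept;
    for (const Node& child : children) {
        if (isSkippable(child)) {
            continue;  // skip uninteresting nodes
        }
        kept.push_back(child);
    }
    return kept;
}
```

Text nodes with actual content survive the filter, so mixed content such as <p>some <b>bold</b> text</p> is not silently lost.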

Best regards,
Lucian

Re: Ignore Whitespace

Posted by David Bertoni <db...@apache.org>.

minggi wrote:
> I tried to parse the following xml:
> <person>
> 	<name>test</name>
> 	<tel>test</tel>
> 	<city>test</city>
> </person>
> 
> The getFirstChild on the 'person' node returns a #text node. The parser adds
> a empty text node.
> I tried with setIgnoringElementContentWhitespace(true) and
> setIgnoringElementContentWhitespace(false)... nothing works.
> 
> Have you an example for me?
The XML recommendation requires the parser to report all whitespace if 
there's no DTD or the DTD doesn't define an element's content as 
element-only.  If you have a DTD and it defines the "person" element 
as containing only element content, then the parser will report the 
whitespace between its children as "ignorable whitespace" (also called 
"element content whitespace").

What version of Xerces-C are you using with this member function?  I 
can't find it anywhere.

Dave