Posted to c-users@xerces.apache.org by minggi <ar...@townux.ch> on 2008/11/11 09:32:46 UTC
Ignore Whitespace
I tried to parse the following xml:
<person>
<name>test</name>
<tel>test</tel>
<city>test</city>
</person>
Calling getFirstChild on the 'person' node returns a #text node: the parser adds
an empty (whitespace-only) text node.
I tried both setIgnoringElementContentWhitespace(true) and
setIgnoringElementContentWhitespace(false), but neither makes a difference.
Do you have an example for me?
thanks
--
View this message in context: http://www.nabble.com/Ignore-Whitespace-tp20435351p20435351.html
Sent from the Xerces - C - Users mailing list archive at Nabble.com.
Re: Ignore Whitespace
Posted by Lucian Cosoi <lu...@gmail.com>.
2008/11/11 minggi
> Thanks for the answer.
>
> Is there no simple possibility to solve that problem with the Parser?
>
> Minggi
>
>
Whether this is a problem with the parser is debatable: it cannot know
_which_ whitespace is safe to ignore, and some of it may matter to the caller.
I believe that if you specify a schema to validate the XML against, the parser
will ignore the extra whitespace, provided you also set the "ignore"
property.
I haven't tried it myself, but that is how I expect the library to behave.
Re: Ignore Whitespace
Posted by minggi <ar...@townux.ch>.
Lucian Cosoi wrote:
> I found checking the node type when unserializing satisfactory:
>
> if ((currentNode->getNodeType() == DOMNode::TEXT_NODE)        /* empty text node / newline */
>     || (currentNode->getNodeType() == DOMNode::COMMENT_NODE)  /* comment */)
> {
>     // skip uninteresting nodes
>     continue;
> }
>
> Best regards,
> Lucian
>
>
Thanks for the answer.
Is there no simple way to solve this with the parser itself?
Minggi
Re: Ignore Whitespace
Posted by Lucian Cosoi <lu...@gmail.com>.
2008/11/11 minggi
>
> I tried to parse the following xml:
> <person>
> <name>test</name>
> <tel>test</tel>
> <city>test</city>
> </person>
>
> The getFirstChild on the 'person' node returns a #text node. The parser
> adds
> a empty text node.
> I tried with setIgnoringElementContentWhitespace(true) and
> setIgnoringElementContentWhitespace(false)... nothing works.
>
> Have you an example for me?
>
> thanks
I found it sufficient to check the node type while deserializing:
if ((currentNode->getNodeType() == DOMNode::TEXT_NODE)        /* empty text node / newline */
    || (currentNode->getNodeType() == DOMNode::COMMENT_NODE)  /* comment */)
{
    // skip uninteresting nodes
    continue;
}
Best regards,
Lucian
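Editor's sketch: the check above skips every text node, including ones that carry real character data. A narrower variant skips comments and only those text nodes that are empty or whitespace-only. The code below uses only the standard library; Node, NodeType, isXmlWhitespaceOnly, isSkippable and firstSignificantChild are hypothetical stand-ins invented for illustration, not Xerces-C types. With Xerces-C the same test would presumably be built on DOMNode::getNodeType() and DOMNode::getNodeValue().

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a DOM node (NOT a Xerces-C type).
enum class NodeType { Element, Text, Comment };

struct Node {
    NodeType type;
    std::string value;  // tag name for elements, character data otherwise
};

// True if s is empty or contains only XML whitespace (space, tab, CR, LF).
bool isXmlWhitespaceOnly(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// True for comments and for whitespace-only text nodes;
// text nodes with real content are kept.
bool isSkippable(const Node& n) {
    if (n.type == NodeType::Comment) {
        return true;
    }
    return n.type == NodeType::Text && isXmlWhitespaceOnly(n.value);
}

// Index of the first child worth processing, or children.size() if none.
std::size_t firstSignificantChild(const std::vector<Node>& children) {
    for (std::size_t i = 0; i < children.size(); ++i) {
        if (!isSkippable(children[i])) {
            return i;
        }
    }
    return children.size();
}
```

For the 'person' example, the children would be a whitespace text node followed by the 'name' element, so the helper would step over the former and stop at the latter, while a text node such as "test" would not be skipped.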
Re: Ignore Whitespace
Posted by David Bertoni <db...@apache.org>.
minggi wrote:
> I tried to parse the following xml:
> <person>
> <name>test</name>
> <tel>test</tel>
> <city>test</city>
> </person>
>
> The getFirstChild on the 'person' node returns a #text node. The parser adds
> a empty text node.
> I tried with setIgnoringElementContentWhitespace(true) and
> setIgnoringElementContentWhitespace(false)... nothing works.
>
> Have you an example for me?
The XML recommendation requires the parser to report all whitespace if
there's no DTD, or if the DTD doesn't declare the element as having
element-only content. If you have a DTD and it defines the "person" element
as containing only element content, then the parser will report that
whitespace as "ignorable whitespace" or "element content whitespace."
Which version of Xerces-C are you using that has this member function? I
can't find it anywhere.
Dave