Posted to j-users@xerces.apache.org by Jacob Kjome <ho...@visi.com> on 2006/08/07 14:12:36 UTC

Xerces2 DOMParser and getting rid of whitespace

I'm wondering how I get rid of extra whitespace.  I seem to be 
getting an extra text node in the DOM tree for every return 
character/linefeed at the end of a line.  I've tried the following 
with no apparent change in behavior....

fConfiguration.setFeature(
    "http://apache.org/xml/features/dom/include-ignorable-whitespace", false);


Are return characters/line feeds not "ignorable-whitespace"?  Seems 
to me they are unless they are inside <pre> or the like.  Then again, 
maybe I'm missing something?


Jake 


---------------------------------------------------------------------
To unsubscribe, e-mail: j-users-unsubscribe@xerces.apache.org
For additional commands, e-mail: j-users-help@xerces.apache.org


Re: Xerces2 DOMParser and getting rid of whitespace

Posted by Stanimir Stamenkov <st...@myrealbox.com>.
/Jacob Kjome/:

> I'm wondering how I get rid of extra whitespace.  I seem to be getting 
> an extra text node in the DOM tree for every return character/linefeed 
> at the end of a line.  I've tried the following with no apparent change 
> in behavior....
> 
> fConfiguration.setFeature("http://apache.org/xml/features/dom/include-ignorable-whitespace", 
> false);

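You need to supply a DTD, as noted in the documentation 
<http://xerces.apache.org/xerces2-j/features.html#dom.include-ignorable-whitespace>. 
The parser can only tell that whitespace is ignorable by reading the 
grammar and having a content model for the document.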

Alternatively, as Michael Glavassevich noted in a previous message 
<http://mail-archives.apache.org/mod_mbox/xerces-j-users/200606.mbox/%3cOFD0484F74.C9FEA5E2-ON85257198.0068407E-85257198.00698F6A@ca.ibm.com%3e>: 
"Or you could use the DOM Level 3 LSParser and filter out content 
(including whitespace) you're not interested in with an 
LSParserFilter."
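
For illustration, here is a minimal sketch of the LSParserFilter 
approach using only the standard org.w3c.dom.ls interfaces. The class 
names (WhitespaceFilterDemo, WhitespaceOnlyTextFilter) and the sample 
document are made up for this example, and the filter as written drops 
every whitespace-only text node, including ones you might want to keep 
(for instance inside <pre>), so treat it as a starting point rather 
than a finished solution.

```java
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.bootstrap.DOMImplementationRegistry;
import org.w3c.dom.ls.DOMImplementationLS;
import org.w3c.dom.ls.LSInput;
import org.w3c.dom.ls.LSParser;
import org.w3c.dom.ls.LSParserFilter;
import org.w3c.dom.traversal.NodeFilter;

public class WhitespaceFilterDemo {

    // Hypothetical sample input: no DTD, one line break after each element.
    static final String SAMPLE =
        "<root>\n  <item>a</item>\n  <item>b</item>\n</root>\n";

    // Hypothetical filter: rejects text nodes that contain only whitespace.
    static class WhitespaceOnlyTextFilter implements LSParserFilter {
        public short startElement(Element element) {
            return FILTER_ACCEPT; // keep every element
        }

        public short acceptNode(Node node) {
            if (node.getNodeType() == Node.TEXT_NODE
                    && node.getNodeValue().trim().isEmpty()) {
                return FILTER_REJECT; // drop whitespace-only text
            }
            return FILTER_ACCEPT;
        }

        public int getWhatToShow() {
            return NodeFilter.SHOW_TEXT; // only text nodes go to acceptNode
        }
    }

    static Document parse(String xml) throws Exception {
        DOMImplementationRegistry registry = DOMImplementationRegistry.newInstance();
        DOMImplementationLS ls =
            (DOMImplementationLS) registry.getDOMImplementation("LS");
        LSParser parser =
            ls.createLSParser(DOMImplementationLS.MODE_SYNCHRONOUS, null);
        parser.setFilter(new WhitespaceOnlyTextFilter());
        LSInput input = ls.createLSInput();
        input.setStringData(xml);
        return parser.parse(input);
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(SAMPLE);
        // Number of child nodes left under <root> after filtering.
        System.out.println(doc.getDocumentElement().getChildNodes().getLength());
    }
}
```

Since this does not depend on a grammar, it also works for documents 
that have no DTD at all.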

> Are return characters/line feeds not "ignorable-whitespace"?  Seems to 
> me they are unless they are inside <pre> or the like.  Then again, maybe 
> I'm missing something?

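See the 
DocumentBuilderFactory.setIgnoringElementContentWhitespace(boolean) 
documentation 
<http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/DocumentBuilderFactory.html#setIgnoringElementContentWhitespace(boolean)> 
which further refers to the XML spec. Only whitespace directly 
contained in an element with an element-only content model counts 
as ignorable, so line breaks are not ignorable by default.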
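
To make the dependence on the DTD concrete, here is a rough sketch 
using plain JAXP. The class name, the helper method and both sample 
documents are invented for this example. The first document carries an 
internal DTD that gives <root> an element-only content model; the 
second has no DTD at all. Validation is switched on for the DTD case 
because the JAXP documentation says the setting relies on the parser 
being in validating mode.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class IgnorableWhitespaceDemo {

    // Hypothetical sample: the internal DTD declares <root> as element-only.
    static final String WITH_DTD =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE root [\n"
      + "  <!ELEMENT root (item*)>\n"
      + "  <!ELEMENT item (#PCDATA)>\n"
      + "]>\n"
      + "<root>\n"
      + "  <item>a</item>\n"
      + "  <item>b</item>\n"
      + "</root>\n";

    // Same content without any DTD, so no content model is available.
    static final String WITHOUT_DTD =
        "<root>\n  <item>a</item>\n  <item>b</item>\n</root>\n";

    // Parses the XML and returns how many child nodes <root> ends up with.
    static int countRootChildren(String xml, boolean validating) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(validating);
        factory.setIgnoringElementContentWhitespace(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        // With the DTD, only the two <item> elements should remain.
        System.out.println(countRootChildren(WITH_DTD, true));
        // Without a DTD, the whitespace text nodes stay in the tree.
        System.out.println(countRootChildren(WITHOUT_DTD, false));
    }
}
```

The factory setting is identical in both calls; the only difference is 
whether the parser has a content model to consult.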

-- 
Stanimir
