Posted to j-users@xerces.apache.org by Christian Nelson <cn...@slac.com> on 2003/05/08 22:38:41 UTC

DOM Parsing and Whitespace

Greetings...

I'm parsing XML using a DOM parser and everything works as expected until
whitespace is added to the XML between elements (e.g. load the file in XML
Spy and format the XML).  When I parse XML which contains such whitespace,
I'm hitting a bunch of unexpected text nodes.

Ideally, I'd like to have the parser ignore the whitespace.  It sounds like
setting the following features should do that (but it's not):

"http://xml.org/sax/features/validation" true
"http://apache.org/xml/features/validation/schema" true
"http://apache.org/xml/features/validation/schema-full-checking" true
"http://apache.org/xml/features/dom/include-ignorable-whitespace" false

Note: I do have a schema and it's referenced by the file.

What might I be doing wrong?  I've attached a short test program...

Thanks in advance...
Christian

---------------------------------------------------------------------------
 Christian 'xian' Nelson                                  cnelson@slac.com
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
    "Don't ask yourself what the world needs.  Ask yourself what makes
  you come alive, and go do that, because what the world needs is people
                  who have come alive." -- Howard Thurman
---------------------------------------------------------------------------


Re: DOM Parsing and Whitespace

Posted by Michael Rafael Glavassevich <mr...@engmail.uwaterloo.ca>.
Hi Christian,

The 'include-ignorable-whitespace' DOM feature is only relevant when your
document has a DTD grammar. Have a look at the docs (specifically the
note) for the feature at:
http://xml.apache.org/xerces2-j/features.html#dom.include-ignorable-whitespace.
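
To illustrate the DTD case, here is a minimal sketch using only the standard
JAXP API rather than the Xerces feature strings. It assumes that
setIgnoringElementContentWhitespace(true) is the JAXP counterpart of setting
'include-ignorable-whitespace' to false, which is how Xerces-based JAXP
implementations are expected to behave. The class name, method name and sample
document are made up for the example. Because the internal DTD declares that
<root> has element-only content, the parser can tell that the whitespace
between the <item> elements is ignorable.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical demo class; not part of Xerces or of the original test program.
class DtdWhitespaceDemo {
    // The internal DTD subset declares <root> as element-only content.
    static final String XML =
        "<!DOCTYPE root [\n"
      + "<!ELEMENT root (item*)>\n"
      + "<!ELEMENT item (#PCDATA)>\n"
      + "]>\n"
      + "<root>\n"
      + "  <item>a</item>\n"
      + "  <item>b</item>\n"
      + "</root>";

    /** Parses XML with DTD validation and returns the child count of <root>. */
    static int countRootChildren(boolean ignoreWhitespace) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // DTD validation
        factory.setIgnoringElementContentWhitespace(ignoreWhitespace);
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Fail loudly instead of letting the parser print to stderr.
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) throws SAXParseException { throw e; }
            public void error(SAXParseException e) throws SAXParseException { throw e; }
            public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });
        Document doc = builder.parse(new InputSource(new StringReader(XML)));
        return doc.getDocumentElement().getChildNodes().getLength();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("whitespace kept:    " + countRootChildren(false));
        System.out.println("whitespace ignored: " + countRootChildren(true));
    }
}
```

With whitespace ignored, only the two <item> elements should remain as
children of <root>. Without a DTD the parser has no content model to consult,
so the same setting would leave the whitespace text nodes in place.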

-----------------------------
Michael Glavassevich
mrglavas@engmail.uwaterloo.ca
4B Computer Engineering
University of Waterloo

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: DOM Parsing and Whitespace

Posted by Christian Nelson <cn...@slac.com>.
The document does point to a schema, and the parser finds that fine and
validates the document correctly.  It just doesn't ax the whitespace.
Someone else mentioned that the include-ignorable-whitespace feature only
works for DTDs, not schemas.

Now the question is, what's the equivalent feature or option for when one
is using a schema in place of a DTD?

Thanks,
Christian

On Thu, 8 May 2003, Joseph Kesselman wrote:

> Does your document point to a DTD, so the parser can tell what is
> whitespace-in-element-content and what isn't?
>
> Note that the schema working group decided that schemas do _not_ set this
> flag in the Infoset.
>
> ______________________________________
> Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
> "The world changed profoundly and unpredictably the day Tim Berners Lee
> got bitten by a radioactive spider." -- Sandy Tyra, in r.m.filk



Re: DOM Parsing and Whitespace

Posted by Joseph Kesselman <ke...@us.ibm.com>.
Does your document point to a DTD, so the parser can tell what is 
whitespace-in-element-content and what isn't?

Note that the schema working group decided that schemas do _not_ set this 
flag in the Infoset.

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more. 
"The world changed profoundly and unpredictably the day Tim Berners Lee 
got bitten by a radioactive spider." -- Sandy Tyra, in r.m.filk
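
The thread does not name a schema-based equivalent of the feature. One
possible workaround, sketched here as an assumption and not as a Xerces
facility, is to remove whitespace-only text nodes from the DOM after parsing.
The class and method names are hypothetical. Note that this is cruder than
true ignorable-whitespace handling: it also removes whitespace-only text
inside elements where the whitespace might be significant (mixed or simple
content), so it only suits documents where that cannot matter.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of Xerces or of the original test program.
class WhitespaceStripper {
    /** Recursively removes text nodes that contain only whitespace. */
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // capture before any removal
            if (child.getNodeType() == Node.TEXT_NODE
                    && child.getNodeValue().trim().isEmpty()) {
                node.removeChild(child);
            } else if (child.getNodeType() == Node.ELEMENT_NODE) {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    /** Parses an XML string with the default (non-validating) JAXP parser. */
    static Document parse(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse("<root>\n  <item>a</item>\n  <item>b</item>\n</root>");
        Node root = doc.getDocumentElement();
        System.out.println("before: " + root.getChildNodes().getLength()); // prints "before: 5"
        stripWhitespace(root);
        System.out.println("after:  " + root.getChildNodes().getLength()); // prints "after:  2"
    }
}
```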