Posted to j-users@xerces.apache.org by "Nguyen, Linh" <Li...@ccra-adrc.gc.ca> on 2003/02/21 18:19:45 UTC

vs

Hi everyone,

I need some help on a Xerces SAX issue.

If a mandatory field (mandatory as defined by the schema) is empty and the
full empty tag syntax is used, Xerces calls startElement before throwing the
exception.  But if the short form is used, Xerces never calls
startElement/endElement, so I don't have a chance to find out the name of
the tag in error.

Has anyone else observed the same behaviour?

Is there any way to obtain the name of the tag in error if the short form
of the empty tag is used?  Xerces does include the tag name in its
textual exception message, but who wants to parse that for information?

TIA
-- Linh

---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-j-user-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-j-user-help@xml.apache.org


Re: vs

Posted by Elliotte Rusty Harold <el...@metalab.unc.edu>.
>Hi everyone,
>
>I need some help on a Xerces SAX issue.
>
>If a mandatory field (mandatory as defined by the schema) is empty and the
>full empty tag syntax is used, Xerces calls startElement before throwing the
>exception.  But if the short form is used, Xerces never calls
>startElement/endElement, so I don't have a chance to find out the name of
>the tag in error.

Curious. I'm not sure whether it's really legal for Xerces to treat
an empty-element tag differently from a start-tag/end-tag pair in
this case. They mean the same thing, but is it legal to detect the
validity error earlier or later depending on the form? I suspect what's
going on is that when Xerces sees the end of the start-tag, it
doesn't yet know the element is empty, so it calls startElement() and
then parses on and discovers the problem. With an empty-element tag, it
notices the emptiness immediately. I don't really know the code base,
but I suggest Xerces should be fixed so these two cases are treated
the same; that is, call startElement() before reporting the exception
in both cases.

>Is there any way to obtain the name of the tag in error if the short form
>of the empty tag is used?  Xerces does include the tag name in its
>textual exception message, but who wants to parse that for information?

Yes. The exception is not actually thrown by the parser. (At least it 
shouldn't be. If it is, that's a bug.) It is reported to the 
registered ErrorHandler through the error() method. You can trap this 
and do what you want with it, including looking at some field 
somewhere that contains the name of the last or next element seen. 
You do not need to stop parsing due to the validity error. In fact, 
in most circumstances, I recommend that you do not stop parsing. 
However, it's still tricky to figure out whether you need the next or 
the last element, but I suspect that it's possible if a little 
hackish.
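
A minimal sketch of that idea, using only the standard SAX classes.
The class name ElementTrackingHandler and its fields are made up for
illustration; they are not part of Xerces. It remembers the last
element passed to startElement() and attaches that name to each
validity error reported through error(), then returns normally so
parsing continues.

```java
import java.util.ArrayList;
import java.util.List;

import org.xml.sax.Attributes;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Xerces: remembers the most recent
// start-tag so that error() can report it alongside the parser's message.
class ElementTrackingHandler extends DefaultHandler {

    // Qualified name of the last element passed to startElement(), or null.
    private String lastStarted;

    // One entry per validity error, in the order the parser reported them.
    final List<String> errors = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        lastStarted = qName;
    }

    @Override
    public void error(SAXParseException e) {
        // Record the error and return normally; not throwing here lets
        // the parser carry on after a validity error.
        errors.add("line " + e.getLineNumber()
                + ", last element started: " + lastStarted
                + ", message: " + e.getMessage());
    }
}
```

DefaultHandler implements both ContentHandler and ErrorHandler, so one
instance can be passed to SAXParser.parse(InputSource, DefaultHandler),
or registered with XMLReader.setContentHandler() and
XMLReader.setErrorHandler(). Given the behaviour described above, the
recorded name is probably the offending element for the start-tag/end-tag
form, but for the empty-element form the error may arrive before
startElement(), in which case lastStarted names an earlier element.
That is the "last or next" ambiguity; the line number from the
SAXParseException may help resolve it.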
-- 

+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|           Processing XML with Java (Addison-Wesley, 2002)          |
|              http://www.cafeconleche.org/books/xmljava             |
| http://www.amazon.com/exec/obidos/ISBN%3D0201771861/cafeaulaitA  |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://www.cafeaulait.org/      |
|  Read Cafe con Leche for XML News: http://www.cafeconleche.org/    |
+----------------------------------+---------------------------------+
