Posted to j-users@xerces.apache.org by Guillaume Deshors <gu...@deshors.net> on 2010/05/21 11:13:25 UTC

not finding all bad data elements at once

Hi,

I am facing exactly the issue reported by this Oxygen user:
http://www.oxygenxml.com/forum/topic2869.html?sid=a4b044cdcc3c06ae07059f924156f6d3
When several keyref values cause errors, Xerces reports only the first one
and then gives up.

I would really like Xerces to report all foreign key errors, as I've seen
the Saxon and libxml parsers do. Is there a way to do that?

I've spent a few hours searching and have read that this is a fatal error
for Xerces, which is why it stops. I tried setting
continue-after-fatal-error to true, but it didn't change anything; in any
case, I've read that setting this option is not recommended in a production
environment.

I'm attaching sample XML and XSD files that show my problem. In my example,
each ASSREGARD references an ASSFILDO, and I've created two bad ASSREGARD
elements that reference non-existent ASSFILDO elements. Xerces then reports
only:
     Description: [Xerces] cvc-identity-constraint.4.3: Key
'FK_ASSFILDOREGARD' with value 'non-existent1' not found for identity
constraint of element 'FeatureCollection'.
     URL: http://www.w3.org/TR/xmlschema-1/#cvc-identity-constraint

Instead, I would like it to report both errors, for non-existent1 and
non-existent2. Thank you very much for any support!

Regards, Guillaume

Here are the two files:

----> keyrefDemo.xml

<?xml version="1.0" encoding="UTF-8"?>

<glml:FeatureCollection
    xmlns:gml="http://www.opengis.net/gml"
    xmlns:glml="http://www.grandlyon.com/glml"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://www.grandlyon.com/glml ./keyrefDemo.xsd"  >

    <glml:ASSFILDO>
        <glml:IID_IDENTFILDO>28532</glml:IID_IDENTFILDO>
    </glml:ASSFILDO>

    <glml:ASSREGARD>
        <glml:IID_IDENTREGARD>23797</glml:IID_IDENTREGARD>
        <glml:IID_ASSFILDO>28532</glml:IID_ASSFILDO>
    </glml:ASSREGARD>

    <glml:ASSREGARD>
        <glml:IID_IDENTREGARD>bad1</glml:IID_IDENTREGARD>
        <glml:IID_ASSFILDO>non-existent1</glml:IID_ASSFILDO>
    </glml:ASSREGARD>

    <glml:ASSREGARD>
        <glml:IID_IDENTREGARD>bad2</glml:IID_IDENTREGARD>
        <glml:IID_ASSFILDO>non-existent2</glml:IID_ASSFILDO>
    </glml:ASSREGARD>

</glml:FeatureCollection>


-----> keyrefDemo.xsd

<?xml version="1.0" encoding="UTF-8" ?>
<schema targetNamespace="http://www.grandlyon.com/glml"
    xmlns:glml="http://www.grandlyon.com/glml"
    xmlns:xlink="http://www.w3.org/1999/xlink"
    xmlns="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"
    version="0.9">

    <!-- element -->
    <element name="FeatureCollection">
        <complexType>
            <sequence>

                <element name="ASSFILDO" maxOccurs="unbounded" >
                    <complexType>
                        <sequence>
                            <element name="IID_IDENTFILDO" type="string" />
                        </sequence>
                    </complexType>
                </element>

                <element name="ASSREGARD" maxOccurs="unbounded" >
                    <complexType>
                        <sequence>
                            <element name="IID_IDENTREGARD" type="string" />
                            <element name="IID_ASSFILDO" type="string" />
                        </sequence>
                    </complexType>
                </element>

            </sequence>
        </complexType>

        <key name="PK_ASSFILDO">
            <selector xpath="glml:ASSFILDO"/>
            <field xpath="glml:IID_IDENTFILDO"/>
        </key>

        <key name="PK_ASSREGARD">
            <selector xpath="glml:ASSREGARD"/>
            <field xpath="glml:IID_IDENTREGARD"/>
        </key>

        <keyref name="FK_ASSFILDOREGARD" refer="glml:PK_ASSFILDO">
            <selector xpath="glml:ASSREGARD"/>
            <field xpath="glml:IID_ASSFILDO"/>
        </keyref>

    </element>

</schema>

Re: not finding all bad data elements at once

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
You're right. Only one error message is reported. It's been a very long
time (years) since I've looked at identity constraints in Xerces. Couldn't
tell you why it was implemented that way but that seems to be the
behaviour. It isn't wrong. Xerces would be producing the correct PSVI (i.e.
same error code applies for all the keyref errors). It's just less helpful
for troubleshooting.
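
Since only the first keyref violation is reported here, one way to list every dangling reference is an application-level check outside the validator. The following is a minimal sketch of that swapped-in technique, not a Xerces feature: it reads the document into a DOM and compares the IID_ASSFILDO values with the IID_IDENTFILDO keys from the sample files in the first message. The class and method names (DanglingKeyrefFinder, findDangling) are hypothetical, and the sketch assumes both element names occur only where the sample schema puts them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Xerces: a manual foreign-key check.
class DanglingKeyrefFinder {
    static final String GLML = "http://www.grandlyon.com/glml";

    // Returns, in document order, every IID_ASSFILDO value that matches
    // no IID_IDENTFILDO value anywhere in the document.
    static List<String> findDangling(InputSource xml) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(xml);

        // Collect the key values (the PK_ASSFILDO side).
        Set<String> keys = new HashSet<>();
        NodeList keyNodes = doc.getElementsByTagNameNS(GLML, "IID_IDENTFILDO");
        for (int i = 0; i < keyNodes.getLength(); i++) {
            keys.add(keyNodes.item(i).getTextContent());
        }

        // Check each reference (the FK_ASSFILDOREGARD side) against the keys.
        List<String> dangling = new ArrayList<>();
        NodeList refNodes = doc.getElementsByTagNameNS(GLML, "IID_ASSFILDO");
        for (int i = 0; i < refNodes.getLength(); i++) {
            String value = refNodes.item(i).getTextContent();
            if (!keys.contains(value)) {
                dangling.add(value);
            }
        }
        return dangling;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path or URI of the XML document to check.
        for (String value : findDangling(new InputSource(args[0]))) {
            System.out.println("No ASSFILDO with IID_IDENTFILDO = " + value);
        }
    }
}
```

Run against keyrefDemo.xml, this should print one line for non-existent1 and one for non-existent2. The check compares raw text, which matches the schema only because both fields are declared as type string.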

Thanks.

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

guillaume.deshors@gmail.com wrote on 05/21/2010 12:48:15 PM:

> (snip)

Re: not finding all bad data elements at once

Posted by Guillaume Deshors <gu...@deshors.net>.
Hi,

Yes, that's what I did, though the Validator class isn't directly
involved. I used a SAXParser on which I set a custom
DefaultHandler, like this:

            SAXParserFactory spf = SAXParserFactory.newInstance();
            spf.setNamespaceAware(true);
            spf.setValidating(true);
            SAXParser sp = spf.newSAXParser();
            sp.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
            sp.setProperty(JAXP_SCHEMA_SOURCE, "file:" + xsdUri);
            sp.parse(xml, new MessageContentHandler(type));

Note: I've checked the code of SAXParser (
http://kickjava.com/src/javax/xml/parsers/SAXParser.java.htm ) and it
assigns the HandlerBase as the ErrorHandler.
Here is (part of) my custom class MessageContentHandler:

    private class MessageContentHandler extends DefaultHandler {
        private String type;

        private int niveau = 0;     // "niveau" = level
        private int numBalise = 0;  // "numBalise" = tag number

        public MessageContentHandler(String type) {
            super();
            this.type = type;
        }

        public void warning(SAXParseException exception) throws SAXException {
            creerAlerte(exception.getLocalizedMessage(), type, getNumBalise());
        }

        public void fatalError(SAXParseException exception) throws SAXException {
            creerAlerte(exception.getLocalizedMessage(), type, getNumBalise());
        }

        public void error(SAXParseException exception) throws SAXException {
            creerAlerte(exception.getLocalizedMessage(), type, getNumBalise());
        }

(snip)

Please note that the method "creerAlerte" ("create alert") doesn't throw
any exception; it just registers the messages.
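
For readers who want to reproduce this setup, here is a self-contained sketch of the same SAXParser configuration. It is an assumption-laden stand-in, not the original application code: the class names SaxSchemaValidation and CollectingHandler are hypothetical, a plain list replaces creerAlerte, and the three constants are spelled out with the standard JAXP 1.2 property values that the snippet above refers to by name.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class SaxSchemaValidation {
    // Standard JAXP 1.2 property names and the XML Schema language URI.
    static final String JAXP_SCHEMA_LANGUAGE =
            "http://java.sun.com/xml/jaxp/properties/schemaLanguage";
    static final String JAXP_SCHEMA_SOURCE =
            "http://java.sun.com/xml/jaxp/properties/schemaSource";
    static final String W3C_XML_SCHEMA = "http://www.w3.org/2001/XMLSchema";

    // Hypothetical stand-in for MessageContentHandler: records instead of throwing.
    static class CollectingHandler extends DefaultHandler {
        final List<String> messages = new ArrayList<>();

        @Override public void warning(SAXParseException e) { add("warning", e); }
        // Schema validity problems are reported here as recoverable errors.
        @Override public void error(SAXParseException e) { add("error", e); }
        @Override public void fatalError(SAXParseException e) { add("fatal", e); }

        private void add(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber() + ": " + e.getMessage());
        }
    }

    // Parses xml with schema validation on and returns every recorded message.
    static List<String> validate(InputSource xml, String schemaUri) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        spf.setValidating(true);
        SAXParser sp = spf.newSAXParser();
        sp.setProperty(JAXP_SCHEMA_LANGUAGE, W3C_XML_SCHEMA);
        sp.setProperty(JAXP_SCHEMA_SOURCE, schemaUri);
        CollectingHandler handler = new CollectingHandler();
        sp.parse(xml, handler);
        return handler.messages;
    }

    public static void main(String[] args) throws Exception {
        // args[0]: XML document path or URI, args[1]: schema URI.
        for (String message : validate(new InputSource(args[0]), args[1])) {
            System.out.println(message);
        }
    }
}
```

With a handler like this, parsing continues after each recoverable error, so the number of messages depends only on how many errors the parser chooses to report.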

Hoping you have a hint...

Regards,
Guillaume

2010/5/21 Michael Glavassevich <mr...@ca.ibm.com>

> (snip)

Re: not finding all bad data elements at once

Posted by Michael Glavassevich <mr...@ca.ibm.com>.
Hi Guillaume,

Which API are you using and did you register an error handler?

The default error handler for a JAXP Validator is defined as:

 class DraconianErrorHandler implements ErrorHandler {
     public void fatalError( SAXParseException e ) throws SAXException {
         throw e;
     }
     public void error( SAXParseException e ) throws SAXException {
         throw e;
     }
     public void warning( SAXParseException e ) throws SAXException {
         // noop
     }
 }

If you're using this API and didn't supply your own error handler, the
validator will always stop on the first error.
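
As an illustration of supplying your own handler, here is a minimal sketch using the javax.xml.validation API. The class name ValidatorWithCollectingHandler and its validate method are hypothetical; the JAXP calls themselves (SchemaFactory, Schema.newValidator, Validator.setErrorHandler) are the standard ones. Warnings and errors are recorded rather than thrown, so validation carries on past the first error.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class ValidatorWithCollectingHandler {
    // Validates xml (document text) against xsd (schema text) and returns
    // every warning and error the validator reported.
    static List<SAXParseException> validate(String xsd, String xml) throws Exception {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();

        final List<SAXParseException> problems = new ArrayList<>();
        validator.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { problems.add(e); }
            public void error(SAXParseException e) { problems.add(e); }
            // Fatal errors (e.g. malformed XML) still abort validation.
            public void fatalError(SAXParseException e) throws SAXParseException {
                throw e;
            }
        });

        validator.validate(new StreamSource(new StringReader(xml)));
        return problems;
    }
}
```

Without the setErrorHandler call, the same validate call throws on the first error, as described above.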

Thanks.

[1]
http://xerces.apache.org/xerces2-j/javadocs/api/javax/xml/validation/Validator.html#setErrorHandler(org.xml.sax.ErrorHandler)

Michael Glavassevich
XML Parser Development
IBM Toronto Lab
E-mail: mrglavas@ca.ibm.com
E-mail: mrglavas@apache.org

guillaume.deshors@gmail.com wrote on 05/21/2010 05:13:25 AM:

> (snip)

Re: not finding all bad data elements at once

Posted by Guillaume Deshors <gu...@deshors.net>.
Hi Mukul,

Thanks for your answer. To be clear, I'm not reporting this as a possible
bug; the behavior is understandable, but I hoped there would nevertheless
be a way to get all occurrences of this error.

In my case it is necessary; let me explain why. The system we're building
validates the same XML three times against three different XSDs, which are
basically the same but increasingly strict: each one contains more checks
than the previous one. The last XSD contains validations that our
application treats functionally as warnings only, and that's where the
foreign keys are. The warnings can be ignored so that the step is processed
anyway, but it definitely matters to the user whether there is one warning
or one hundred. And in that case I would be glad that it's unpleasant to see!
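
The multi-pass setup described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name TieredValidation, the validateAll method, and the idea of passing the schemas as a map from a level name to schema text are all hypothetical and say nothing about how the actual system is built.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

class TieredValidation {
    // Validates the same document once per schema, in the map's iteration
    // order, and returns the messages collected at each level.
    static Map<String, List<String>> validateAll(String xml, Map<String, String> xsdByLevel)
            throws Exception {
        SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Map<String, List<String>> report = new LinkedHashMap<>();

        for (Map.Entry<String, String> level : xsdByLevel.entrySet()) {
            final List<String> messages = new ArrayList<>();
            Validator validator = factory
                    .newSchema(new StreamSource(new StringReader(level.getValue())))
                    .newValidator();
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { messages.add(e.getMessage()); }
                public void error(SAXParseException e) { messages.add(e.getMessage()); }
                public void fatalError(SAXParseException e) throws SAXParseException {
                    throw e;
                }
            });
            // A fresh reader per pass: a StringReader can only be consumed once.
            validator.validate(new StreamSource(new StringReader(xml)));
            report.put(level.getKey(), messages);
        }
        return report;
    }
}
```

The caller can then decide per level whether a non-empty list blocks processing or is only shown to the user as warnings.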

I'll check the other answer. Thank you again.

2010/5/21 Mukul Gandhi <mu...@apache.org>

> (snip)

Re: not finding all bad data elements at once

Posted by Mukul Gandhi <mu...@apache.org>.
Hi Guillaume,
   First of all, this implementation behavior cannot be considered a bug
in an XML Schema engine, as the XSD specification doesn't recommend
anything about the details of error reporting upon XML Schema validation
failure.

In my opinion, XSD implementations are expected at the least to report
validation success or failure (i.e., a kind of 'true' or 'false'
validation outcome) after an XML Schema validation episode. How
validation error messages are constructed, and how much error
reporting schema engines perform, is implementation dependent.

In this particular case, reporting all instances of
identity-constraint violations may be a nice-to-have feature (though
IMHO it's still debatable whether this is the best design in this
case), but not doing so doesn't make Xerces non-compliant with the XSD
spec :)

It seems to me that seeing all of the identity-constraint violations may
be good when there are as few as 2-3 error instances. But if, let's
say, we make the error list unbounded, producing all the error
instances (imagine the list size to be, say, 500!) could be unpleasant
to see.
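
One way an application could bound the list on its own side, whatever the parser reports, is a handler that keeps the first few messages and only counts the rest. This is a hedged sketch: CappedErrorHandler and its methods are hypothetical names, and nothing here is an existing Xerces option.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

// Hypothetical handler: keeps at most `limit` messages, counts the others.
class CappedErrorHandler implements ErrorHandler {
    private final int limit;
    private final List<String> shown = new ArrayList<>();
    private int suppressed = 0;

    CappedErrorHandler(int limit) {
        this.limit = limit;
    }

    public void warning(SAXParseException e) { record(e); }
    public void error(SAXParseException e) { record(e); }
    // Fatal errors are not capped; they still stop processing.
    public void fatalError(SAXParseException e) throws SAXParseException {
        throw e;
    }

    private void record(SAXParseException e) {
        if (shown.size() < limit) {
            shown.add(e.getMessage());
        } else {
            suppressed++;
        }
    }

    List<String> shown() { return shown; }
    int suppressed() { return suppressed; }

    String summary() {
        return shown.size() + " shown, " + suppressed + " more suppressed";
    }
}
```

The user then sees a short list plus a count, so one warning and five hundred warnings still look different without flooding the output.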

On Fri, May 21, 2010 at 2:43 PM, Guillaume Deshors
<gu...@deshors.net> wrote:
> (snip)



-- 
Regards,
Mukul Gandhi
