Posted to j-users@xerces.apache.org by Mukul Gandhi <mu...@apache.org> on 2010/03/18 10:37:16 UTC

Re: Parsing and validating an XML Document that obeys a schema but doesn't declare it.

The following works for me:

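import java.io.File;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

public class SAXParserTest extends DefaultHandler {
    // xsdFile could be e.g. "test.xsd"; xmlFile could be e.g. "test.xml"
    public void validate(String xsdFile, String xmlFile) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        spf.setNamespaceAware(true);
        SAXParser parser = spf.newSAXParser();
        // Select W3C XML Schema as the validation language
        parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                           "http://www.w3.org/2001/XMLSchema");
        // Schema to apply to elements that are in no namespace
        parser.setProperty(
            "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
            xsdFile);
        parser.parse(new File(xmlFile), this);
    }
}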
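
One caveat with a DefaultHandler-based setup like the one above: DefaultHandler.error() does nothing by default, so schema violations pass silently unless error() is overridden. A minimal sketch of that, using the same two properties; the class name CollectingValidator and the validate() helper are made up for illustration, and the schema is assumed to have no target namespace:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects validation errors instead of dropping them.
public class CollectingValidator extends DefaultHandler {
    // Validation errors reported by the parser, in document order.
    public final List<String> errors = new ArrayList<>();

    @Override
    public void error(SAXParseException e) {
        // DefaultHandler.error() is a no-op, so record the problem explicitly.
        errors.add("line " + e.getLineNumber() + ": " + e.getMessage());
    }

    // Parses xml against the no-namespace schema at xsdPath and returns the errors found.
    public static List<String> validate(String xml, Path xsdPath) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setValidating(true);
        spf.setNamespaceAware(true);
        SAXParser parser = spf.newSAXParser();
        parser.setProperty("http://java.sun.com/xml/jaxp/properties/schemaLanguage",
                           "http://www.w3.org/2001/XMLSchema");
        // An absolute file: URI avoids depending on the document's base URI.
        parser.setProperty(
            "http://apache.org/xml/properties/schema/external-noNamespaceSchemaLocation",
            xsdPath.toUri().toString());
        CollectingValidator handler = new CollectingValidator();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.errors;
    }
}
```

An empty list from validate() means the parser reported no validation errors. Throwing the SAXParseException from error() instead would stop at the first violation, which may be preferable for very large files.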

On Thu, Mar 18, 2010 at 1:38 PM, stefcl <st...@gmail.com> wrote:
>
> Hello,
>
> I'd like to use a validating SAX parser to read huge XML documents (>200 MB).
> Due to some legacy reasons, some of these files might not declare anything
> related to schema or namespaces.
>
> They look like this :
>
> <items>
>  <product id="1">...
>  <product id="2">..
> </items>
>
> While the correct form would be :
>
> <pre:items xmlns:pre="http://xml.prediggo.com/schema/ItemsSchema">
>  <product id="1">...
>  <product id="2">..
> </pre:items>
>
>
> Is there anything I can do to "cheat" the schema declaration on the document
> root element so that I can benefit from schema validation anyway? Writing
> all the validation stuff myself would be many days of work.
>
> I tried the following :
>
> QName qName = new QName("http://xml.prediggo.com/schema/ItemsSchema", "root");
>
> xmlReader.setFeature("http://apache.org/xml/features/validation/schema", true);
>
> xmlReader.setProperty("http://apache.org/xml/properties/validation/schema/root-type-definition",
>     qName);
>
>
> But it didn't help. I don't know whether the issue can be solved through
> configuration, or how I should do that. I considered adding the ns declaration
> to the XML file using Java code, but that would require a complete rewrite of
> the file, which is rather big.
>
> When we were still using JAXB I could *cheat* the validator by using a SAX
> filter like this one.
>
> @Override
> public void startElement(String uri, String localName, String qName,
>         Attributes attributes) throws SAXException
> {
>     if( localName.equals( "items" ))
>     {
>         //inject namespace uri...
>         super.startElement( "http://xml.prediggo.com/schema/ItemsSchema",
>                 localName, qName, attributes);
>     }
>     else
>     {
>         //do things normally...
>         super.startElement( uri , localName, qName, attributes);
>     }
> }
>
> But now, validation exceptions are thrown before my filter is even called.
>
> Any advice would be greatly appreciated....
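
On the filter approach quoted above: a validating parser checks the document before any XMLFilter sees the events, which would explain errors being raised before the filter runs. One possible arrangement is to keep the parser non-validating and validate after the filter with a javax.xml.validation ValidatorHandler. A minimal sketch follows, untested against the real ItemsSchema. The names NamespaceInjectingValidation, RootNamespaceFilter and isValid are hypothetical, and the schema is assumed to declare "items" in the target namespace with unqualified local children:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.ValidatorHandler;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

public class NamespaceInjectingValidation {
    static final String NS = "http://xml.prediggo.com/schema/ItemsSchema";

    // Moves no-namespace "items" elements into NS before events reach the validator.
    static class RootNamespaceFilter extends XMLFilterImpl {
        RootNamespaceFilter(XMLReader parent) { super(parent); }

        @Override
        public void startElement(String uri, String localName, String qName,
                                 Attributes atts) throws SAXException {
            super.startElement(rewrite(uri, localName), localName, qName, atts);
        }

        @Override
        public void endElement(String uri, String localName, String qName)
                throws SAXException {
            // Start and end events must agree on the namespace.
            super.endElement(rewrite(uri, localName), localName, qName);
        }

        private static String rewrite(String uri, String localName) {
            return uri.isEmpty() && "items".equals(localName) ? NS : uri;
        }
    }

    // Returns true if xml validates against schemaText once the namespace is injected.
    public static boolean isValid(String xml, String schemaText) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        ValidatorHandler validator =
            sf.newSchema(new StreamSource(new StringReader(schemaText))).newValidatorHandler();
        // Fail on the first validation error rather than continuing.
        validator.setErrorHandler(new DefaultHandler() {
            @Override
            public void error(SAXParseException e) throws SAXException { throw e; }
        });

        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);   // namespace-aware, but NOT validating
        XMLReader reader = spf.newSAXParser().getXMLReader();

        RootNamespaceFilter filter = new RootNamespaceFilter(reader);
        filter.setContentHandler(validator);   // parser -> filter -> validator
        try {
            filter.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        }
    }
}
```

The pipeline stays event-based, so the document is never held in memory as a whole. The application's own handler could be attached downstream with validator.setContentHandler(...).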



-- 
Regards,
Mukul Gandhi
