Posted to c-users@xerces.apache.org by Omar Solaiman <om...@nteligen.com> on 2017/04/24 21:07:46 UTC

Schema Validation has issues with data restrictions/occurrences

Hello,

 

I'm working on an XML validation program using Xerces-C++. It takes a list
of XML namespace URIs with their schema locations, plus a no-namespace
schema location, and validates the XML file at a given path.

I'm utilizing the XercesDOMParser to attempt validation.

If the XML has elements that are not in the schema, validation fails. If the
elements and the values they hold are valid against the schema, it passes.

 

However, the program SEGFAULTs in two cases. If an element has a value
that is outside the restriction given in the schema, the program SEGFAULTs.
If an element has minOccurs of 1 but is empty, it also SEGFAULTs.

 

This is confusing, as the schema validator is definitely working, but it
somehow crashes when it encounters these ordinary validation failures.

 

I set up the XercesDOMParser, attach an error handler, and configure the
parser as follows:

*parser = new XercesDOMParser ( );
*errorHandler = new DOMErrorHandler ( );
( *parser )->setErrorHandler ( *errorHandler );

// DOM construction options
( *parser )->setCreateEntityReferenceNodes ( false );
( *parser )->setIncludeIgnorableWhitespace ( false );
( *parser )->setDoNamespaces ( true );

// Schema validation options
( *parser )->setValidationScheme ( XercesDOMParser::Val_Always );
( *parser )->setDoSchema ( true );
( *parser )->setValidationSchemaFullChecking ( true );
( *parser )->setValidationConstraintFatal ( true );
( *parser )->setExitOnFirstFatalError ( true );

( *parser )->cacheGrammarFromParse ( true );
( *parser )->setStandardUriConformant ( false );

 

 

In the no-namespace schema location, spaces are encoded as %20. The
namespace URI/XSD schema location list is in the appropriate string format.
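
To make the string format concrete, here is a minimal sketch of how such
strings might be assembled. It is not the code from the actual program:
encodeSpaces and buildSchemaLocation are hypothetical helper names, and the
sketch assumes the list uses the usual xsi:schemaLocation layout, i.e.
whitespace-separated pairs of namespace URI and schema location, which is why
spaces inside a path have to be written as %20.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: replace every space in a path with "%20" so the
// path can appear in a whitespace-separated schema location string.
std::string encodeSpaces(const std::string& path) {
    std::string out;
    out.reserve(path.size());
    for (char c : path) {
        if (c == ' ') {
            out += "%20";
        } else {
            out += c;
        }
    }
    return out;
}

// Hypothetical helper: join (namespace URI, schema location) pairs into
// the form "uri1 location1 uri2 location2 ...". Only the locations are
// space-encoded here; the URIs are assumed to contain no spaces.
std::string buildSchemaLocation(
    const std::vector<std::pair<std::string, std::string>>& pairs) {
    std::string out;
    for (const auto& p : pairs) {
        if (!out.empty()) {
            out += ' ';
        }
        out += p.first;
        out += ' ';
        out += encodeSpaces(p.second);
    }
    return out;
}
```

The result of buildSchemaLocation would be what gets handed to the parser as
the external schema location, and encodeSpaces applied to the single
no-namespace path would give the no-namespace schema location.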


Then, when parsing with validation enabled, the program SEGFAULTs whenever
the XML data does not match the value constraints in the schema. The
try/catch does not catch anything, and the error handler is never reached:
a gdb breakpoint set in it is not triggered. The program just crashes.
ex 1: restriction with minInclusive value 1 and maxInclusive value 16384;
      XML value 123456
ex 2: element name="SingleChar" minOccurs="1" maxOccurs="1";
      XML <SingleChar></SingleChar>

However, if the schema declares an element named "Alpha" that is not
present in the XML, the error is caught. If the XML contains an element
called "Beta" that the schema does not declare, that is caught as well.
There is no SEGFAULT in either of these cases.

 

Any ideas on what could be causing this?

 

bool validdoc = false;

try
{
  // Release documents from any previous parse, then parse and validate
  parser->resetDocumentPool ( );
  parser->parse ( xmlPath.c_str ( ) );
}
catch ( const XMLException& toCatch )
{
  errMsg = utilXMLChToString ( toCatch.getMessage ( ) );
  return validdoc;
}
catch ( const DOMException& toCatch )
{
  errMsg = utilXMLChToString ( toCatch.getMessage ( ) );
  return validdoc;
}
catch ( ... )
{
  errMsg = "Unknown Exception caught while attempting parsing";
  return validdoc;
}

// No exception was thrown: check what the error handler collected
if ( errorHandler->getFoundErrors ( ) )
{
  errorHandler->getLastError ( errMsg );
}
else
{
  validdoc = true;
}
return validdoc;

 

 

 

Any help with figuring out what is going on with Xerces is greatly
appreciated. 

Thanks,
Omar