Posted to c-dev@xerces.apache.org by Marc Sturm <st...@gmail.com> on 2009/07/15 10:11:09 UTC

Slow identity constraint validation

Hi all,

First, I'd like to thank the developers for their effort. We use Xerces-C 
a lot and are quite happy with it!

Recently we encountered a problem with the evaluation speed of identity 
constraints (xs:key, xs:keyref, xpath) in Xerces-C 3.0.0.
At http://www-bs2.informatik.uni-tuebingen.de/services/sturm/public/ you 
can find an XML example file and two schema versions.
Validating the example without identity constraints takes about 30 
seconds. With identity constraints it takes about an hour.

Is this a known problem, or are we doing something wrong? I have attached 
the code used to perform the validation.

Thanks in advance,
  Marc


---------------------------------------------------------------------------------
// initialize the Xerces-C runtime
try
{
    XMLPlatformUtils::Initialize();
}
catch (const XMLException& toCatch)
{
    //...
}

// create a validating, namespace-aware SAX2 parser with schema support
SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);
parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
parser->setFeature(XMLUni::fgXercesDynamic, false);
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);

// set this class as error handler
parser->setErrorHandler(this);
parser->setContentHandler(NULL);
parser->setEntityResolver(NULL);

// load the schema, cache it, and reuse the cached grammar when parsing
LocalFileInputSource schema_file(schema);
parser->loadGrammar(schema_file, Grammar::SchemaGrammarType, true);
parser->setFeature(XMLUni::fgXercesUseCachedGrammarInParse, true);

// try to validate the file
LocalFileInputSource source(filename);
try
{
    parser->parse(source);
}
catch (...)
{
    //...
}

// release the parser whether or not parse() threw
delete parser;
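
To make timings like the ones above reproducible, the parse call can be wrapped in a small stopwatch. The following is only a sketch and not part of the original code: the helper name time_seconds is hypothetical, and the callable passed to it stands in for a call such as parser->parse(source).

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper (not part of Xerces-C): runs `work` exactly once and
// returns the elapsed wall-clock time in seconds.
double time_seconds(const std::function<void()>& work)
{
    // steady_clock is monotonic, so system clock adjustments do not skew the result.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Running it once per schema version, for example time_seconds([&] { parser->parse(source); }), would give directly comparable numbers for the two cases.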

---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org


Re: Slow identity constraint validation

Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi,

Marc Sturm <st...@gmail.com> writes:

> Thanks, now I managed to build the Trunk version.
>
> The validation with the Trunk version still takes 47 minutes.
> An improvement, but still too long.

I agree, this is not normal and I think we need a bug report
for this. 

Marc, would you be able to create an issue in Jira[1]? If you
can provide a test that reproduces the problem, that would be
very helpful as well.

[1] http://xerces.apache.org/xerces-c/bug-report.html

Thanks,
	Boris

-- 
Boris Kolpackov, Code Synthesis Tools   http://codesynthesis.com/~boris/blog
Open source XML data binding for C++:   http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde



Re: Slow identity constraint validation

Posted by Marc Sturm <st...@gmail.com>.
Hi Alberto,

Thanks, now I managed to build the Trunk version.

The validation with the Trunk version still takes 47 minutes.
An improvement, but still too long.

Best,
  Marc



Re: Slow identity constraint validation

Posted by Alberto Massari <am...@datadirect.com>.
When building from SVN you need to run an extra step to set up the 
autoconf framework; run ./reconf before ./configure

Alberto



Re: Slow identity constraint validation

Posted by Marc Sturm <st...@gmail.com>.
Hi Alberto,

I first tried with the release version 3.0.1, but that did not help. The 
execution times remained the same.

When I try to configure the Trunk version I get this error message:
 > ./configure --prefix /share_pride/usr/sturm/contrib/build 
--disable-network --disable-transcoder-iconv --disable-transcoder-icu 
--disable-shared --with-pic  
CXX=/share_pride/usr/sturm/config/bash/UNI/scripts/colorgcc_x86_64_sl4/c++ 
CC=/share_pride/usr/sturm/config/bash/UNI/scripts/colorgcc_x86_64_sl4/gcc
configure: error: cannot find install-sh or install.sh in config "."/config

Any ideas?

-Marc





Re: Slow identity constraint validation

Posted by Alberto Massari <am...@datadirect.com>.
Hi Marc,
Could you try with the current trunk of 3.x (the SVN repository is at 
https://svn.apache.org/repos/asf/xerces/c/trunk)? There was excessive 
memory usage in identity constraint validation that made performance 
degrade exponentially.

Alberto
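
As a general illustration of how an identity-constraint check such as xs:key can come to dominate validation time, the sketch below compares two ways of detecting duplicate key values. It is not the Xerces-C implementation and does not reproduce the memory issue mentioned above; the function names and the representation of keys as plain strings are assumptions made for the example.

```cpp
#include <cstddef>
#include <string>
#include <unordered_set>
#include <vector>

// Quadratic variant: every key is compared against all keys seen before it,
// so the number of comparisons grows with the square of the key count.
bool has_duplicate_linear(const std::vector<std::string>& keys)
{
    for (std::size_t i = 0; i < keys.size(); ++i)
        for (std::size_t j = 0; j < i; ++j)
            if (keys[i] == keys[j])
                return true;
    return false;
}

// Hashed variant: each key is looked up in a hash set, which takes
// constant time on average, so total work grows roughly linearly.
bool has_duplicate_hashed(const std::vector<std::string>& keys)
{
    std::unordered_set<std::string> seen;
    seen.reserve(keys.size());
    for (const std::string& key : keys)
        if (!seen.insert(key).second)  // insert reports false if the key was already present
            return true;
    return false;
}
```

Both functions return the same answer; only the cost differs. For a document with many key values, the first variant would be enough on its own to turn seconds into many minutes.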
