Posted to c-dev@xerces.apache.org by Marc Sturm <st...@gmail.com> on 2009/07/15 10:11:09 UTC
Slow identity constraint validation
Hi all,
First, I'd like to thank the developers for their effort. We use Xerces-C
a lot and are quite happy with it!
Recently we encountered a problem with the evaluation speed of identity
constraints (xs:key, xs:keyref, XPath) in Xerces-C 3.0.0.
At http://www-bs2.informatik.uni-tuebingen.de/services/sturm/public/ you
can find an XML example file and two schema versions.
Validating the example without identity constraints takes about 30
seconds. With identity constraints it takes about an hour.
Is this a known problem, or are we doing something wrong? I have attached
the code used to perform the validation.
Thanks in advance,
Marc
---------------------------------------------------------------------------------
// initialize the Xerces-C runtime
try
{
    XMLPlatformUtils::Initialize();
}
catch (const XMLException& toCatch)
{
    //...
}

SAX2XMLReader* parser = XMLReaderFactory::createXMLReader();
parser->setFeature(XMLUni::fgSAX2CoreNameSpaces, true);
parser->setFeature(XMLUni::fgSAX2CoreValidation, true);
parser->setFeature(XMLUni::fgXercesDynamic, false);
parser->setFeature(XMLUni::fgXercesSchema, true);
parser->setFeature(XMLUni::fgXercesSchemaFullChecking, true);

// set this class as error handler
parser->setErrorHandler(this);
parser->setContentHandler(NULL);
parser->setEntityResolver(NULL);

// load the schema and reuse the cached grammar during parsing
LocalFileInputSource schema_file(schema);
parser->loadGrammar(schema_file, Grammar::SchemaGrammarType, true);
parser->setFeature(XMLUni::fgXercesUseCachedGrammarInParse, true);

// try to validate the file
LocalFileInputSource source(filename);
try
{
    parser->parse(source);
}
catch (...)
{
    //...
}
// release the parser whether or not parse() threw
delete parser;
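To compare the two runs reproducibly, the parse call can be wrapped in a small timing helper. The following is only a minimal sketch using the C++ standard library; time_call is a hypothetical name, not part of Xerces-C, and the callable passed to it is assumed to wrap the parser->parse(source) call shown above.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Hypothetical helper (not a Xerces-C API): runs `work` once and returns
// the elapsed wall-clock time in seconds.
double time_call(const std::function<void()>& work)
{
    assert(work && "work must be callable");
    // steady_clock is monotonic, so the result is unaffected by clock changes.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling it once per schema version, for example time_call([&] { parser->parse(source); }), would give two directly comparable numbers for the bug report.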
---------------------------------------------------------------------
To unsubscribe, e-mail: c-dev-unsubscribe@xerces.apache.org
For additional commands, e-mail: c-dev-help@xerces.apache.org
Re: Slow identity constraint validation
Posted by Boris Kolpackov <bo...@codesynthesis.com>.
Hi,
Marc Sturm <st...@gmail.com> writes:
> Thanks, now I managed to build the trunk version.
>
> The validation with the Trunk version still takes 47 minutes.
> An improvement, but still too long.
I agree, this is not normal and I think we need a bug report
for this.
Marc, would you be able to create an issue in Jira[1]? If you
can provide a test that reproduces the problem, that would be
very helpful as well.
[1] http://xerces.apache.org/xerces-c/bug-report.html
Thanks,
Boris
--
Boris Kolpackov, Code Synthesis Tools http://codesynthesis.com/~boris/blog
Open source XML data binding for C++: http://codesynthesis.com/products/xsd
Mobile/embedded validating XML parsing: http://codesynthesis.com/products/xsde
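One way to build such a test is to generate documents of increasing size and time the validation of each. The following is a minimal sketch under stated assumptions: make_test_document is a hypothetical helper, and the element and attribute names (root, item, id, ref, to) are made up for illustration. A matching schema with an xs:key on item/@id and an xs:keyref on ref/@to would have to be written separately.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical generator: builds an XML document with n <item id="kN"/>
// elements followed by n <ref to="kN"/> elements that point back at them.
std::string make_test_document(std::size_t n)
{
    assert(n > 0 && "an empty document would not exercise the constraints");
    std::ostringstream xml;
    xml << "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<root>\n";
    for (std::size_t i = 0; i < n; ++i)
        xml << "  <item id=\"k" << i << "\"/>\n";   // candidate xs:key values
    for (std::size_t i = 0; i < n; ++i)
        xml << "  <ref to=\"k" << i << "\"/>\n";    // candidate xs:keyref values
    xml << "</root>\n";
    return xml.str();
}
```

Writing the result to a file for a few values of n (say 1000, 10000, 100000) and validating each would show how the run time grows with the number of keyed elements.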
Re: Slow identity constraint validation
Posted by Marc Sturm <st...@gmail.com>.
Hi Alberto,
Thanks, now I managed to build the trunk version.
The validation with the Trunk version still takes 47 minutes.
An improvement, but still too long.
Best,
Marc
Re: Slow identity constraint validation
Posted by Alberto Massari <am...@datadirect.com>.
When building from SVN you need to run an extra step to set up the
autoconf framework; run ./reconf before ./configure
Alberto
Re: Slow identity constraint validation
Posted by Marc Sturm <st...@gmail.com>.
Hi Alberto,
I first tried release version 3.0.1, but that did not help. The
execution times remained the same.
When I try to configure the Trunk version I get this error message:
> ./configure --prefix /share_pride/usr/sturm/contrib/build \
    --disable-network --disable-transcoder-iconv --disable-transcoder-icu \
    --disable-shared --with-pic \
    CXX=/share_pride/usr/sturm/config/bash/UNI/scripts/colorgcc_x86_64_sl4/c++ \
    CC=/share_pride/usr/sturm/config/bash/UNI/scripts/colorgcc_x86_64_sl4/gcc
configure: error: cannot find install-sh or install.sh in config "."/config
Any ideas?
-Marc
Re: Slow identity constraint validation
Posted by Alberto Massari <am...@datadirect.com>.
Hi Marc,
Could you try the current trunk of 3.x (the SVN repository is at
https://svn.apache.org/repos/asf/xerces/c/trunk)? Identity constraint
validation used an excessive amount of memory, which made performance
degrade exponentially.
Alberto
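For context, identity-constraint validation conceptually has to remember every xs:key value it sees and look up each xs:keyref value against that store. The following is a minimal sketch of that bookkeeping, assuming the field values have already been extracted as strings. check_identity and IdentityCheckResult are hypothetical names, and this is not how Xerces-C implements it.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical result type for the sketch below.
struct IdentityCheckResult {
    std::vector<std::string> duplicate_keys;  // key values seen more than once
    std::vector<std::string> dangling_refs;   // keyref values with no matching key
};

// Checks key uniqueness and keyref resolution over already-extracted values.
IdentityCheckResult check_identity(const std::vector<std::string>& keys,
                                   const std::vector<std::string>& refs)
{
    IdentityCheckResult result;
    std::unordered_set<std::string> seen;
    seen.reserve(keys.size());
    for (const std::string& k : keys)
        if (!seen.insert(k).second)           // insert fails if k is already stored
            result.duplicate_keys.push_back(k);
    // Every key is either newly stored or reported as a duplicate.
    assert(seen.size() + result.duplicate_keys.size() == keys.size());
    for (const std::string& r : refs)
        if (seen.find(r) == seen.end())
            result.dangling_refs.push_back(r);
    return result;
}
```

With a hashed store each lookup takes expected constant time, so the whole check is roughly linear in the number of values. A store that is scanned or copied for every value would grow much faster, which is the kind of behaviour worth looking for when validation time explodes.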