Posted to c-dev@xerces.apache.org by f....@chello.at on 2004/12/13 10:40:46 UTC

Problem with SAX2XMLReader Memory Usage

I have noticed a problem using SAX2XMLReader in an application that deals with large XML files: memory consumption does not remain at the same level but rises at a constant rate. This is easy to reproduce by running a sample application such as SAX2Count on a large XML file. SAX2Count's memory consumption rose from ~3 MB to ~30 MB while parsing a ~120 MB XML document (on Win32). The document I used for testing did not contain deeply nested elements.

I have validation turned off in my application and still encounter this problem, so validation does not appear to be the cause of the increasing memory consumption.

I would appreciate any advice on how to keep memory usage at a constant level using a Xerces SAX parser as my application is expected to handle arbitrarily large XML files.

Thank you,

Florian Brugger


---------------------------------------------------------------------
To unsubscribe, e-mail: xerces-c-dev-unsubscribe@xml.apache.org
For additional commands, e-mail: xerces-c-dev-help@xml.apache.org


Re: Problem with SAX2XMLReader Memory Usage

Posted by Florian Brugger <f....@chello.at>.
I gave it a try and it works great now, thank you for the fix!

Best regards,

Florian
----- Original Message ----- 
From: "Alberto Massari" <am...@datadirect.com>
To: <xe...@xml.apache.org>
Sent: Thursday, January 06, 2005 2:06 PM
Subject: Re: Problem with SAX2XMLReader Memory Usage


> Hi Florian,
> a few days ago I committed a change that should fix the problem: can you 
> get the latest sources and try your testcase?
>
> Thanks,
> Alberto


Re: Problem with SAX2XMLReader Memory Usage

Posted by Alberto Massari <am...@datadirect.com>.
Hi Florian,
a few days ago I committed a change that should fix the problem: can you 
get the latest sources and try your testcase?

Thanks,
Alberto

At 13.18 06/01/2005 +0100, Florian Brugger wrote:
>Regarding the SAX2Count memory usage problem:
>
>If attributes are used, SAX2Count's memory usage reaches ~38 MB (input file 
>size: 210 MB). If the test file is generated without attributes (size: 
>35 MB), the memory usage remains constant at 1780 KB.


Re: Problem with SAX2XMLReader Memory Usage

Posted by Florian Brugger <f....@chello.at>.
Regarding the SAX2Count memory usage problem:

Try using the following program to generate test files:

#include <stdio.h>

/* Comment out WITH_ATTRIBUTES to generate elements without attributes. */
#define WITH_ATTRIBUTES
#define ELEM_COUNT 3000000

int main() {
 FILE *fp;

 /* "wt" requests text mode on Win32; plain "w" works elsewhere. */
 if(!(fp=fopen("test.xml","wt")))
  return -1;

 fprintf(fp,"<?xml version=\"1.0\"?>\n<root>\n");
 /* Each iteration writes exactly one empty element on its own line. */
 for(int i=0;i<ELEM_COUNT;++i)
#ifdef WITH_ATTRIBUTES
  fprintf(fp,"<element attribute1=\"value1\" attribute2=\"value2\" "
             "attribute3=\"value3\"/>\n");
#else
  fprintf(fp,"<element/>\n");
#endif
 fprintf(fp,"</root>\n");
 fclose(fp);

 return 0;
}

If attributes are used, SAX2Count's memory usage reaches ~38 MB (input file 
size: 210 MB). If the test file is generated without attributes (size: 35 MB), 
the memory usage remains constant at 1780 KB.

Best regards,

Florian Brugger

----- Original Message ----- 
From: "Alberto Massari" <am...@progress.com>
To: <xe...@xml.apache.org>
Sent: Thursday, December 23, 2004 2:31 PM
Subject: Re: Problem with SAX2XMLReader Memory Usage


> Hi Florian,
> I tried the SAX2Count sample on the XML files generated by XMark (both the 
> 115 MB and the 232 MB versions); in both cases the memory consumption was 
> under 5 MB. Can you post the XML you are using?
>
> Alberto


Re: Problem with SAX2XMLReader Memory Usage

Posted by Alberto Massari <am...@progress.com>.
Hi Florian,
I tried the SAX2Count sample on the XML files generated by XMark (both the 
115 MB and the 232 MB versions); in both cases the memory consumption was 
under 5 MB. Can you post the XML you are using?

Alberto
