Posted to derby-dev@db.apache.org by "Kristian Waagan (JIRA)" <ji...@apache.org> on 2011/08/18 13:46:27 UTC

[jira] [Commented] (DERBY-5387) Memory leak or unbounded consumption problem when running a utility to copy one database to another using SYSCS_EXPORT_TABLE and SYSCS_IMPORT_TABLE

    [ https://issues.apache.org/jira/browse/DERBY-5387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13086970#comment-13086970 ] 

Kristian Waagan commented on DERBY-5387:
----------------------------------------

This may be somewhat tricky to investigate. To isolate the problem, are you able to first export the tables and then import only the largest file? If that works, there must be a problem with cleaning up / releasing resources.
If it doesn't work, there's a problem in the insert/sort code - for instance, the memory figures could be wrong, causing Derby to grow the sort buffer too much (i.e., the sort fails to spill to disk).

You could try a debug build and enable output for the sort buffer. The output is far from perfect, but it will give you some indication of what's going on. This will make the run take even longer, so save the output to a file and go do something else... (I believe this output goes to stderr by default.) Enable it by running the debug build with "-Dderby.debug.true=SortTuning".
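A possible invocation of the above (the jar paths and main class name here are assumptions, not taken from the attached utility):

```shell
# Run the copy utility against a sane (debug) Derby build with sort-buffer
# diagnostics enabled; capture stderr, where the SortTuning output is written.
java -Dderby.debug.true=SortTuning \
     -cp sane/derby.jar:dbcopy.jar DbCopy 2> sorttuning.log
```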

If you can somehow provide a link to the heap dump, that *could* also help (it is too large to be attached here). **NOTE**: If the data is confidential, this is not an option!
Finally, a junk data generator and a small program that actually runs the import would be another way to allow people to debug this.
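A minimal sketch of such a generator (class, table, and file names below are hypothetical). The resulting file could then be loaded over JDBC with CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE:

```java
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical junk-data generator: writes N delimited rows in a shape that
// SYSCS_IMPORT_TABLE can load. The import itself would then be run with:
//   CALL SYSCS_UTIL.SYSCS_IMPORT_TABLE('APP', 'JUNK', 'junk.csv', ',', '"', null, 1)
// (schema, table, input file, column delimiter, character delimiter,
//  codeset, replace-existing-rows flag)
public class JunkGen {
    public static void generate(String path, long rows) throws IOException {
        try (FileWriter w = new FileWriter(path)) {
            for (long i = 0; i < rows; i++) {
                // One BIGINT column and one quoted VARCHAR column per row.
                w.write(i + ",\"row-" + i + "\"\n");
            }
        }
    }
}
```

Scaling `rows` up to tens of millions would approximate the reported table size without touching any confidential data.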


On another matter, if I understand the code correctly, the dbcopy code is "leaking" connections to the source database (line 537).
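A minimal illustration of the resource pattern that avoids such a leak (all names are hypothetical, with a stand-in for java.sql.Connection): try-with-resources closes the source connection even when a per-table copy fails partway.

```java
// Sketch only: FakeConnection stands in for a JDBC Connection so the pattern
// can be shown without a running database.
public class LeakFix {
    static int openCount = 0;

    static class FakeConnection implements AutoCloseable {
        FakeConnection() { openCount++; }
        @Override public void close() { openCount--; }
    }

    static void copyTable(boolean fail) {
        // try-with-resources guarantees close() runs on every exit path,
        // including exceptions thrown mid-copy.
        try (FakeConnection src = new FakeConnection()) {
            if (fail) throw new RuntimeException("copy failed");
            // ... per-table export/import work would go here ...
        } catch (RuntimeException e) {
            // the connection is already closed by the time we get here
        }
    }
}
```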

> Memory leak or unbounded consumption problem when running a utility to copy one database to another using SYSCS_EXPORT_TABLE and SYSCS_IMPORT_TABLE
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-5387
>                 URL: https://issues.apache.org/jira/browse/DERBY-5387
>             Project: Derby
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 10.8.1.2
>         Environment: Solaris 10/9, Oracle Java 1.6.0_22, 1Gb heap space (also ran with 8Gb heap space with no difference other than how long it takes to run out of memory).
>            Reporter: Brett Bergquist
>            Priority: Critical
>         Attachments: dbcopy.zip, java_pid2364_Leak_Hunter.zip, java_pid2364_Leak_Suspects.zip
>
>
> I have a utility that copies one database to another by using 'dblook' to export the schema from the first, which is then used to create the copy's schema.  The tables are exported from the first database using SYSCS_EXPORT_TABLE and imported into the second database using SYSCS_IMPORT_TABLE, processing each table before moving on to the next.  Then the constraints and indexes present in the schema generated by 'dblook' are applied to the second database.  The utility runs out of memory, regardless of the amount of memory given, when run on a very large database (one table has 75 million rows and the total database size is 110 GB of disk storage).  The utility does complete on a smaller database.
> I will attach the source code for the utility.  I also added the -XX:+HeapDumpOnOutOfMemoryError flag and ran with a -Xmx1024m heap.  I will attach the suspected-leaks report generated by the Eclipse Memory Analyzer tool.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Re: [jira] [Commented] (DERBY-5387) Memory leak or unbounded consumption problem when running a utility to copy one database to another using SYSCS_EXPORT_TABLE and SYSCS_IMPORT_TABLE

Posted by Kristian Waagan <kr...@oracle.com>.
On 19.08.11 20:40, Bergquist, Brett wrote:
> Can you point me to the docs on how to do a debug build?  I have built Derby recently, both sane and insane builds, and know how to run the tests now, so point me in the correct direction and I will investigate more.

The sane build is the debug build.
In a sane build, extra asserts and other code inside "if (SanityManager.DEBUG)" blocks are included. In insane builds these blocks are removed by the compiler because SanityManager.DEBUG is set to false (and is a constant).
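A stripped-down sketch of that pattern (the class below is a made-up stand-in for Derby's SanityManager): because the flag is a compile-time constant, javac drops the guarded block entirely when it is false, so insane builds carry no debug code at all.

```java
public class SanityExample {
    // In Derby, this constant is flipped per build: true for sane (debug)
    // builds, false for insane (release) builds.
    public static final boolean DEBUG = false;

    static int counter = 0;

    static void doWork() {
        if (DEBUG) {
            // With DEBUG == false and final, javac omits this whole block
            // from the class file; there is no runtime check at all.
            System.err.println("debug: counter=" + counter);
        }
        counter++;
    }
}
```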


Hope this helps,
-- 
Kristian



RE: [jira] [Commented] (DERBY-5387) Memory leak or unbounded consumption problem when running a utility to copy one database to another using SYSCS_EXPORT_TABLE and SYSCS_IMPORT_TABLE

Posted by "Bergquist, Brett" <BB...@canoga.com>.
Can you point me to the docs on how to do a debug build?  I have built Derby recently, both sane and insane builds, and know how to run the tests now, so point me in the correct direction and I will investigate more.

Also, I believe you are correct about the leaking.  Good catch!  I will fix that, retry, and update the JIRA issue if this changes anything.
