Posted to derby-user@db.apache.org by Briggs <ac...@gmail.com> on 2008/02/08 14:14:45 UTC

Iterating through large result set in network mode causes OutOfMemoryException

So, I have a derby database that could potentially contain millions of
records, but at the moment it has 13K.

I am trying to iterate through that table after calling a basic
"Select * from myTable".  After about 5000 records, I get an out of
memory
exception/error on the server side.  This table has very a couple
small clobs in it (each field contains about only 100 chars at the
moment) and one large clob.  I have no transactions running, the
result set is read only.  I would think that I should be able to loop
through an infinite set of results.  If I do this in embedded mode
it's fine. I have read over 10 million records with no problem. But,
again, in network mode it fails early.

Anyone have any advice?  I would think that a non-scrollable result
set should not keep any of the results in memory (on the server) once
the cursor has passed it.





-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
We're currently using 10.3.2.1.

On Feb 8, 2008 10:08 AM, Bryan Pendleton <bp...@amberpoint.com> wrote:
> Briggs wrote:
> > After about 5000 records, I get an out of memory
> > exception/error on the server side.  This table has a couple of
> > very small clobs in it (each field contains only about 100 chars at the
> > moment) and one large clob.
>
> What release of Derby are you using? There were substantial
> improvements made in this area in both 10.2 and 10.3.
>
> thanks,
>
> bryan
>
>
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Bryan Pendleton <bp...@amberpoint.com>.
Briggs wrote:
> After about 5000 records, I get an out of memory
> exception/error on the server side.  This table has a couple of
> very small clobs in it (each field contains only about 100 chars at the
> moment) and one large clob.

What release of Derby are you using? There were substantial
improvements made in this area in both 10.2 and 10.3.

thanks,

bryan



Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Doh!  So it seems that 10.2 is broken too.  Ouch, this is going to get
some people a bit annoyed with me since I recommended it.  Hmmm.
Looks like I might have to go all the way back to 10.1 and see if it
works.  Anyone know if it is possible to export databases to older
formats?





On Feb 8, 2008 11:33 AM, Øystein Grøvlen <Oy...@sun.com> wrote:
>
> Briggs wrote:
> > So, I have a derby database that could potentially contain millions of
> > records, but at the moment it has 13K.
> >
> > I am trying to iterate through that table after calling a basic
> > "Select * from myTable".  After about 5000 records, I get an out of
> > memory
> > exception/error on the server side.  This table has a couple of
> > very small clobs in it (each field contains only about 100 chars at the
> > moment) and one large clob.  I have no transactions running, the
> > result set is read only.  I would think that I should be able to loop
> > through an infinite set of results.  If I do this in embedded mode
> > it's fine. I have read over 10 million records with no problem. But,
> > again, in network mode it fails early.
> >
> > Anyone have any advice?  I would think that a non-scrollable result
> > set should not keep any of the results in memory (on the server) once
> > the cursor has passed it.
> >
>
> This looks like DERBY-3354.  It is a bug, and as far as I can tell the
> cause has been identified, so it should probably be possible to fix
> this.  For client/server you will probably also need the fix for DERBY-2892.
>
> --
> Øystein
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Kathey Marsden <km...@sbcglobal.net>.
Briggs wrote:
> Well, here is a very simple test case. It's not junit or anything, but
> if you look at the code, by calling the main method it will:
>
> 1) Start up a network server
> 2) Create a test table
> 3) Insert 20K duplicate records
> 4) Attempt to iterate through them, and fail.
>
> If you wish to test the reading portion (after you've created a
> database), you'll need to comment out the code that does so.  Sorry,
> haven't the time to get too fancy with the test case.  But, you can
> see that it will fail.  Plus, when you run out of memory, don't forget
> to kill the process or you'll have a db running on 2222.
>
> If you are running unix, you'll need to change the connection string
> to point to a valid dir (other than c:/tmp).
>
> Is this good enough?
>
> Oh, if you want, you can also just create the database and insert the
> data, start a network server, then connect via ij.  Then just do a
> "select * from test_case.test_lob"; you will eventually get an index
> out of bounds exception because the server caused an error.
>
> Yes, the code is ugly, but I didn't think connection handling and
> transaction management was important.
>
>
> On Feb 8, 2008 1:51 PM, Briggs <ac...@gmail.com> wrote:
>   
>> Working on a test case now...
>>
>>
>> On Feb 8, 2008 1:29 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
>>     
>>> Øystein Grøvlen wrote:
>>>       
>>>> This looks like DERBY-3354.  It is a bug, and as far as I can tell the
>>>> cause has been identified, so it should probably be possible to fix
>>>> this.  For client/server you will probably also need the fix for
>>>> DERBY-2892.
>>>>
>>>>         
>>> Since he said his case works fine in embedded mode, I don't think it is
>>> DERBY-3354.  Briggs, could you post a reproducible test case so we can
>>> see if this is a new issue.
>>>
>>> Kathey
>>>
>>>
>>>       
>>
>>
>> --
>> "Conscious decisions by conscious minds are what make reality real"
>>
>>     
>
>
>
>   
Running against 10.3.2.1 with IBM JDK 1.5, I am not seeing the OOM
exception. Should I be running with a smaller heap?  I also had to add
this line to the program to avoid a "no suitable driver" error:
    Class.forName("org.apache.derby.jdbc.ClientDriver").newInstance();

The output ends with:
....
Read: 19997 records.
Read: 19998 records.
Read: 19999 records.



Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
Strange.  Seems like a bug.  It would be great if you could file a bug
report with code to reproduce it.

Regarding your example code: I do not see anything that actually fetches the
content of the Blob, but I guess that is just an omission in the email.

I think in order for memory to be released when calling free(),
DERBY-3354 needs to be fixed.  I have just uploaded a patch for
DERBY-3354 for people to try out.

With respect to your original repro, my patch for DERBY-2892 should
fix that problem.  (The DERBY-2892 patch has not been applied to the
development branch yet since I think more test cases need to be
written.)  However, I needed to slightly modify your earlier repro to
get it to work.  It seems my DERBY-2892 patch only works if you
actually read the Blobs.  Thank you for pinpointing that.  When I added
a call to ResultSet#getBytes in the loop, the repro did not fail when
I ran with the two patches I mentioned above.

In other words, I think we should be able to solve your problems in
the next Derby release.  In the meantime, you may consider building
your own version which includes the above-mentioned patches.

HTH,

--
Øystein


Briggs wrote:
 > Yes, I meant the server.  Sorry.  But, it's my application that causes
 > it (the server) to fail.
 >
 >> The app?  Your first posting indicated that the server ran out of memory.
 >
 >>> Also, the occasional commit is also causing a problem.  I end up
 >>> eventually getting this:
 >>>
 >>> java.sql.SQLException: The data in this BLOB or CLOB is no longer
 >>> available.  The BLOB/CLOB's transaction may be committed, or its
 >>> connection is closed.
 >
 > Well, I am only calling commit after I have read the blob and pulled
 > all data from it. So, the strange issue is that I can get back a bunch
 > of records (and I am calling commit() after I retrieve all the data
 > from the current cursor) then call ResultSet#next(). So, I am getting
 > through a bunch, but eventually the SQLException pops up.  I know it's
 > not good for performance to call commit() after every result, but I am
 > just testing.
 >
 > So, I basically test this like:
 >
 > while(rs.next()) {
 >      final String content;
 >      content = blobToString(rs.getBlob("content"));
 >      conn.commit();
 > }
 >
 > //very rudimentary method to convert to a string
 > private String blobToString(final Blob blob) throws SQLException{
 >    final byte[] bytes;
 >    final String result;
 >
 >     //not worried about the length at the moment, we don't store data that
 >     //large in our application, max is 64K
 >    bytes = new byte[(int)blob.length()];
 >    result = new String(bytes);
 >
 >    blob.free();
 >
 >    return result;
 > }
 >
 > That's the gist.  Still trying to figure out why that didn't work.
 >
 > Again, thanks for all your time.  Great list of people here!
 >
 >> A Blob/Clob object is not valid after the transaction it was created in
 >> has committed.
 >>
 >> --
 >> Øystein
 >>
 >
 >
 >


-- 
Øystein Grøvlen, Senior Staff Engineer
Architectural Lead, Java DB
Sun Microsystems, Database Technology Group
Trondheim, Norway

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Yes, I meant the server.  Sorry.  But, it's my application that causes
it (the server) to fail.

> The app?  Your first posting indicated that the server ran out of memory.

> > Also, the occasional commit is also causing a problem.  I end up
> > eventually getting this:
> >
> > java.sql.SQLException: The data in this BLOB or CLOB is no longer
> > available.  The BLOB/CLOB's transaction may be committed, or its
> > connection is closed.
>

Well, I am only calling commit after I have read the blob and pulled
all data from it. So, the strange issue is that I can get back a bunch
of records (and I am calling commit() after I retrieve all the data
from the current cursor) then call ResultSet#next(). So, I am getting
through a bunch, but eventually the SQLException pops up.  I know it's
not good for performance to call commit() after every result, but I am
just testing.

So, I basically test this like:

while(rs.next()) {
     final String content;
     content = blobToString(rs.getBlob("content"));
     conn.commit();
}

//very rudimentary method to convert to a string
private String blobToString(final Blob blob) throws SQLException{
   final byte[] bytes;
   final String result;

    //not worried about the length at the moment, we don't store data that
    //large in our application, max is 64K
   bytes = new byte[(int)blob.length()];
   result = new String(bytes);

   blob.free();

   return result;
}
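A note on the method above: new byte[(int)blob.length()] only allocates a zero-filled buffer and never copies the Blob's contents into it, which matches the observation elsewhere in the thread that nothing actually fetches the Blob data. The following is a corrected sketch, not code from the thread; it uses SerialBlob (the JDK's standard in-memory java.sql.Blob implementation) in place of a Blob fetched from Derby so it runs without a database:

```java
import javax.sql.rowset.serial.SerialBlob;
import java.sql.Blob;
import java.sql.SQLException;

public class BlobToStringFixed {

    // Corrected blobToString: actually copies the Blob's bytes out
    // before freeing it.  Blob offsets are 1-based per the JDBC spec;
    // this still assumes the LOB fits in an int-sized array.
    static String blobToString(final Blob blob) throws SQLException {
        final byte[] bytes = blob.getBytes(1, (int) blob.length());
        blob.free();  // safe here: the data has already been copied out
        return new String(bytes);
    }

    public static void main(String[] args) throws Exception {
        // SerialBlob stands in for rs.getBlob("content"), so no server is needed.
        Blob blob = new SerialBlob("some lob content".getBytes("UTF-8"));
        System.out.println(blobToString(blob));
    }
}
```

With a change like this, the returned String contains the actual LOB content rather than a run of NUL characters of the same length.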

That's the gist.  Still trying to figure out why that didn't work.

Again, thanks for all your time.  Great list of people here!

> A Blob/Clob object is not valid after the transaction it was created in
> has committed.
>
> --
> Øystein
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
Briggs wrote:
> Well, unfortunately, freeing the Blob didn't work, the app still runs
> out of memory.

The app?  Your first posting indicated that the server ran out of memory.

> 
> Also, the occasional commit is also causing a problem.  I end up
> eventually getting this:
> 
> java.sql.SQLException: The data in this BLOB or CLOB is no longer
> available.  The BLOB/CLOB's transaction may be committed, or its
> connection is closed.

A Blob/Clob object is not valid after the transaction it was created in 
has committed.

-- 
Øystein

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Well, unfortunately, freeing the Blob didn't work, the app still runs
out of memory.

The occasional commit is also causing a problem.  I end up
eventually getting this:

java.sql.SQLException: The data in this BLOB or CLOB is no longer
available.  The BLOB/CLOB's transaction may be committed, or its
connection is closed.

I am trying to figure out what could be causing this; it works in the
test case but fails pretty darn quickly in my application.  Very, very
strange.





On Feb 8, 2008 11:26 PM, Briggs <ac...@gmail.com> wrote:
> Hey, again, thanks!   Will try it out.
>
>
> On Feb 8, 2008 7:19 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
> > Briggs wrote:
> > > Thank you so much.  I guess I should have mentioned I was using jdk 6.
> > >
> > With Sun jdk 1.6 I was able to reproduce the OutOfMemory error and
> > Øystein was right;  it does look like DERBY-3354.  You were not seeing
> > it with embedded because you never made  a getBlob() or getObject() call
> > which is what triggers the bug. For Network Server that call happens
> > under the covers.
> >
> > There are a couple of workarounds that I can think of.
> > 1) commit every 1000 rows or so.
> >      for(int i = 0; rs.next(); i++) {
> >             System.out.println("Read: " + i + " records.");
> >         if (i % 1000 == 0)
> >         conn.commit();
> >         }
> >
> > 2) do a getBlob and a free. This will work for network server but not
> > embedded where it would still leak if you do a getBlob().
> >        for(int i = 0; rs.next(); i++) {
> >             System.out.println("Read: " + i + " records.");
> >         Blob b = rs.getBlob(1);
> >         b.free();
> >         }
> >
> > HTH
> >
> > Kathey
> >
> >
> >
> >
> >
>
>
>
>
> --
> "Conscious decisions by conscious minds are what make reality real"
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Hey, again, thanks!   Will try it out.

On Feb 8, 2008 7:19 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
> Briggs wrote:
> > Thank you so much.  I guess I should have mentioned I was using jdk 6.
> >
> With Sun jdk 1.6 I was able to reproduce the OutOfMemory error and
> Øystein was right;  it does look like DERBY-3354.  You were not seeing
> it with embedded because you never made  a getBlob() or getObject() call
> which is what triggers the bug. For Network Server that call happens
> under the covers.
>
> There are a couple of workarounds that I can think of.
> 1) commit every 1000 rows or so.
>      for(int i = 0; rs.next(); i++) {
>             System.out.println("Read: " + i + " records.");
>         if (i % 1000 == 0)
>         conn.commit();
>         }
>
> 2) do a getBlob and a free. This will work for network server but not
> embedded where it would still leak if you do a getBlob().
>        for(int i = 0; rs.next(); i++) {
>             System.out.println("Read: " + i + " records.");
>         Blob b = rs.getBlob(1);
>         b.free();
>         }
>
> HTH
>
> Kathey
>
>
>
>
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Kathey Marsden <km...@sbcglobal.net>.
Briggs wrote:
> Thank you so much.  I guess I should have mentioned I was using jdk 6.
>   
With Sun jdk 1.6 I was able to reproduce the OutOfMemory error, and 
Øystein was right; it does look like DERBY-3354.  You were not seeing 
it with embedded because you never made a getBlob() or getObject() call, 
which is what triggers the bug.  For Network Server that call happens 
under the covers.

There are a couple of workarounds that I can think of.
1) Commit every 1000 rows or so:
     for (int i = 0; rs.next(); i++) {
         System.out.println("Read: " + i + " records.");
         if (i % 1000 == 0)
             conn.commit();
     }

2) Do a getBlob() and a free(). This will work for network server but not 
embedded, where it would still leak if you do a getBlob():
     for (int i = 0; rs.next(); i++) {
         System.out.println("Read: " + i + " records.");
         Blob b = rs.getBlob(1);
         b.free();
     }
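The two workarounds can also be combined into a single loop. The sketch below is illustrative only: SerialBlob (the JDK's in-memory java.sql.Blob) stands in for rs.getBlob(1), and a counter stands in for conn.commit(), so it runs without a Derby server:

```java
import javax.sql.rowset.serial.SerialBlob;
import java.sql.Blob;
import java.util.ArrayList;
import java.util.List;

public class CombinedWorkaroundDemo {
    public static void main(String[] args) throws Exception {
        // Stand-in for a ResultSet of 5000 rows, each holding one small BLOB.
        List<Blob> rows = new ArrayList<>();
        for (int i = 0; i < 5000; i++) {
            rows.add(new SerialBlob("row data".getBytes("UTF-8")));
        }

        int commits = 0;
        int read = 0;
        for (Blob b : rows) {                              // while (rs.next()) { ...
            byte[] data = b.getBytes(1, (int) b.length()); // read the LOB fully (1-based offset)
            b.free();                                      // workaround 2: release the LOB eagerly
            if (++read % 1000 == 0) {
                commits++;                                 // workaround 1: conn.commit() would go here
            }
        }
        System.out.println("read=" + read + " commits=" + commits);
    }
}
```

In a real JDBC loop, replace the list iteration with rs.next(), the SerialBlob with rs.getBlob(1), and the counter increment with conn.commit().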

HTH

Kathey





Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Thank you so much.  I guess I should have mentioned I was using jdk 6.
 Not sure which release version at the moment, since that is on my
work machine and it's Friday night now for me!  Perhaps it started
with a larger heap for you?  But you did get it to fail, so that's
cool, I guess.  I'll go check out the branch.

Thanks again for your help!


On Feb 8, 2008 5:44 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
> Briggs wrote:
> > Oh, if you want, you can also just create the database and insert the
> > data, start a network server, then connect via ij.  Then just do a
> > "select * from test_case.test_lob"; you will eventually get an index
> > out of bounds exception because the server caused an error.
> >
> >
> This one I could reproduce and it looks like DERBY-3243.  I was able to
> confirm that it is fixed with the latest on the 10.3 branch with the
> DERBY-3243 fix.
> ....
> 20000 rows selected
>
>
>
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Kathey Marsden <km...@sbcglobal.net>.
Briggs wrote:
> Oh, if you want, you can also just create the database and insert the
> data, start a network server, then connect via ij.  Then just do a
> "select * from test_case.test_lob"; you will eventually get an index
> out of bounds exception because the server caused an error.
>
>   
This one I could reproduce and it looks like DERBY-3243.  I was able to 
confirm that it is fixed with the latest on the 10.3 branch with the 
DERBY-3243 fix.
....
20000 rows selected




Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Well, here is a very simple test case. It's not junit or anything, but
if you look at the code, by calling the main method it will:

1) Start up a network server
2) Create a test table
3) Insert 20K duplicate records
4) Attempt to iterate through them, and fail.

If you wish to test the reading portion (after you've created a
database), you'll need to comment out the code that does so.  Sorry,
haven't the time to get too fancy with the test case.  But, you can
see that it will fail.  Plus, when you run out of memory, don't forget
to kill the process or you'll have a db running on 2222.

If you are running unix, you'll need to change the connection string
to point to a valid dir (other than c:/tmp).

Is this good enough?

Oh, if you want, you can also just create the database and insert the
data, start a network server, then connect via ij.  Then just do a
"select * from test_case.test_lob"; you will eventually get an index
out of bounds exception because the server caused an error.

Yes, the code is ugly, but I didn't think connection handling and
transaction management was important.


On Feb 8, 2008 1:51 PM, Briggs <ac...@gmail.com> wrote:
> Working on a test case now...
>
>
> On Feb 8, 2008 1:29 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
> > Øystein Grøvlen wrote:
> > >>
> > >
> > > This looks like DERBY-3354.  It is a bug, and as far as I can tell the
> > > cause has been identified, so it should probably be possible to fix
> > > this.  For client/server you will probably also need the fix for
> > > DERBY-2892.
> > >
> > Since he said his case works fine in embedded mode, I don't think it is
> > DERBY-3354.  Briggs, could you post a reproducible test case so we can
> > see if this is a new issue.
> >
> > Kathey
> >
> >
>
>
>
>
> --
> "Conscious decisions by conscious minds are what make reality real"
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Briggs <ac...@gmail.com>.
Working on a test case now...

On Feb 8, 2008 1:29 PM, Kathey Marsden <km...@sbcglobal.net> wrote:
> Øystein Grøvlen wrote:
> >>
> >
> > This looks like DERBY-3354.  It is a bug, and as far as I can tell the
> > cause has been identified, so it should probably be possible to fix
> > this.  For client/server you will probably also need the fix for
> > DERBY-2892.
> >
> Since he said his case works fine in embedded mode, I don't think it is
> DERBY-3354.  Briggs, could you post a reproducible test case so we can
> see if this is a new issue.
>
> Kathey
>
>



-- 
"Conscious decisions by conscious minds are what make reality real"

Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Kathey Marsden <km...@sbcglobal.net>.
Øystein Grøvlen wrote:
>>
>
> This looks like DERBY-3354.  It is a bug, and as far as I can tell the 
> cause has been identified, so it should probably be possible to fix 
> this.  For client/server you will probably also need the fix for 
> DERBY-2892.
>
Since he said his case works fine in embedded mode, I don't think it is 
DERBY-3354.  Briggs, could you post a reproducible test case so we can 
see if this is a new issue.

Kathey


Re: Iterating through large result set in network mode causes OutOfMemoryException

Posted by Øystein Grøvlen <Oy...@Sun.COM>.
Briggs wrote:
> So, I have a derby database that could potentially contain millions of
> records, but at the moment it has 13K.
> 
> I am trying to iterate through that table after calling a basic
> "Select * from myTable".  After about 5000 records, I get an out of
> memory
> exception/error on the server side.  This table has a couple of
> very small clobs in it (each field contains only about 100 chars at the
> moment) and one large clob.  I have no transactions running, the
> result set is read only.  I would think that I should be able to loop
> through an infinite set of results.  If I do this in embedded mode
> it's fine. I have read over 10 million records with no problem. But,
> again, in network mode it fails early.
> 
> Anyone have any advice?  I would think that a non-scrollable result
> set should not keep any of the results in memory (on the server) once
> the cursor has passed it.
> 

This looks like DERBY-3354.  It is a bug, and as far as I can tell the 
cause has been identified, so it should probably be possible to fix 
this.  For client/server you will probably also need the fix for DERBY-2892.

-- 
Øystein