Posted to modperl@perl.apache.org by Chris Ochs <sn...@gmail.com> on 2005/01/18 02:06:47 UTC

DBI memory usage

Is there a way to do large queries that return lots of data without
having my apache process grow by the equivalent size in RAM of the
data returned?  The only thing I can think of is to run a separate
script outside of mod_perl for queries like this.

Chris

Re: DBI memory usage

Posted by Fred Moyer <fr...@taperfriendlymusic.org>.
>> > It looks like $r->child_terminate does what I need.  In the case of
>> > Postgresql it eats the memory when you execute the query, regardless
>> > of whether you actually fetch any results.
>>
>> That can be worked around, however. By using a cursor, you can specify
>> exactly how much data you want at a time:
>
> DBD::Pg doesn't support cursors.  From the DBD::Pg manpage:

It's true that DBD::Pg doesn't support cursors itself, but you can write
a pl/pgsql function that returns a cursor reference and use it in this
manner:

  my $sth = $dbh->prepare("FETCH ALL FROM $your_ref_cursor");
  $sth->execute;               # execute is required before fetching
  while ($sth->fetch) { }      # process each row here
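
Here's a minimal sketch of the function side and the call that fills in
$your_ref_cursor (the function and table names are made up):

  # one-time setup: a pl/pgsql function that opens and returns a cursor
  $dbh->do(q{
      CREATE FUNCTION get_customers() RETURNS refcursor AS '
      DECLARE
          ref refcursor;
      BEGIN
          OPEN ref FOR SELECT * FROM customers;
          RETURN ref;
      END;
      ' LANGUAGE plpgsql;
  });

  $dbh->{AutoCommit} = 0;  # a refcursor only exists inside a transaction
  my ($your_ref_cursor) = $dbh->selectrow_array('SELECT get_customers()');

If memory is the concern, note that fetching in batches ("FETCH 1000
FROM ...") instead of FETCH ALL is what actually keeps the footprint
down.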

PL/Perl was also just introduced, but I haven't looked at it closely
enough to know whether you could use it to write the function.


Re: DBI memory usage

Posted by Peter Haworth <pm...@edison.ioppublishing.com>.
On Tue, 18 Jan 2005 10:12:06 -0800, Chris Ochs wrote:
> On Tue, 18 Jan 2005 10:43:02 +0000, Peter Haworth
> <pm...@edison.ioppublishing.com> wrote:
> > By using a cursor, you can specify exactly how much data you want
> > at a time:
>
> DBD::Pg doesn't support cursors. From the DBD::Pg manpage:
>
>   "Although PostgreSQL has a cursor concept, it has not been used in
>   the current implementation."

No, that says that DBD::Pg doesn't *use* cursors in its implementation.
It certainly doesn't stop you from using them yourself. From DBD::Pg's
point of view, using cursors is exactly like executing any other SQL
query.

The code I posted was based on that used in 2 or 3 working applications
of mine, all of which use DBD::Pg, so it's not like I've posted
completely untested code based on guesswork.

-- 
	Peter Haworth	pmh@edison.ioppublishing.com
"Maybe that's [Java's] niche, it's a language for people who like pain."
		-- Dean Wilson

Re: DBI memory usage

Posted by Chris Ochs <sn...@gmail.com>.
On Tue, 18 Jan 2005 10:43:02 +0000, Peter Haworth
<pm...@edison.ioppublishing.com> wrote:
> On Mon, 17 Jan 2005 17:46:17 -0800, Chris Ochs wrote:
> > It looks like $r->child_terminate does what I need.  In the case of
> > Postgresql it eats the memory when you execute the query, regardless
> > of whether you actually fetch any results.
> 
> That can be worked around, however. By using a cursor, you can specify
> exactly how much data you want at a time:

DBD::Pg doesn't support cursors.  From the DBD::Pg manpage:

  "Although PostgreSQL has a cursor concept, it has not been used in the
  current implementation. Cursors in PostgreSQL can only be used inside a
  transaction block. Because only one transaction block at a time is
  allowed, this would have implied the restriction, not to use any nested
  SELECT statements. Hence the execute method fetches all data at once
  into data structures located in the frontend application. This has to
  be considered when selecting large amounts of data!"

Re: DBI memory usage

Posted by Peter Haworth <pm...@edison.ioppublishing.com>.
On Mon, 17 Jan 2005 17:46:17 -0800, Chris Ochs wrote:
> It looks like $r->child_terminate does what I need.  In the case of
> Postgresql it eats the memory when you execute the query, regardless
> of whether you actually fetch any results.

That can be worked around, however. By using a cursor, you can specify
exactly how much data you want at a time:

  $dbh->{AutoCommit}=0; # Cursors only work inside transactions
  $dbh->do(q(
    declare c cursor for
    -- YOUR SELECT GOES HERE
    for read only
  ));
  my $sth=$dbh->prepare('fetch forward 100 from c'); # Adjust number to taste
  do {
    $sth->execute;
    while(my $row=$sth->fetchrow_arrayref){
      # YOUR NORMAL LOOP BODY GOES HERE
    }
  } while $sth->rows+0; # It will be '0E0' (0 but true) the last time
  $dbh->do('close c');  # release the cursor
  $dbh->commit;         # end the transaction

-- 
	Peter Haworth	pmh@edison.ioppublishing.com
"The trouble with emacs is that it is too big, and not everywhere.
 Also, it keeps adding :wq! to the end of all my documents."
		-- Redvers Davies

Re: DBI memory usage

Posted by Chris Ochs <sn...@gmail.com>.
> > Yes.  Many databases support sending the results a few rows at a time
> > instead of all at once.  For specific advice on this, you might check
> > your DBD documentation or ask on the dbi-users list.  You can also have
> > your program schedule the current apache process to exit after finishing
> > the current request if you load a large amount of data into it.
> 
> Also, see http://modperlbook.org/html/ch20_02.html.

Thanks Perrin,

It looks like $r->child_terminate does what I need.  In the case of
Postgresql it eats the memory when you execute the query, regardless
of whether you actually fetch any results.
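
For the archives, a minimal mod_perl 1 sketch of that approach (the
handler body is made up; note child_terminate is not available on
Win32):

  use Apache::Constants qw(OK);

  sub handler {
      my $r = shift;
      # ... run the big export query and send the results ...
      $r->child_terminate;  # this child exits after finishing the
                            # current request, returning its memory
      return OK;
  }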

Chris

Re: DBI memory usage

Posted by Perrin Harkins <pe...@elem.com>.
Perrin Harkins wrote:

> Chris Ochs wrote:
> 
>> Is there a way to do large queries that return lots of data without
>> having my apache process grow by the equivalent size in RAM of the
>> data returned?
> 
> 
> Yes.  Many databases support sending the results a few rows at a time 
> instead of all at once.  For specific advice on this, you might check 
> your DBD documentation or ask on the dbi-users list.  You can also have 
> your program schedule the current apache process to exit after finishing 
> the current request if you load a large amount of data into it.

Also, see http://modperlbook.org/html/ch20_02.html.

- Perrin

Re: DBI memory usage

Posted by Sean Davis <sd...@mail.nih.gov>.
On Jan 17, 2005, at 11:44 PM, Chris Ochs wrote:

>>
>> Also ask yourself if you really need all of the data at
>> once. You may be able to filter it down in the query or
>> build some incremental data structures using row-by-row
>> iteration instead of fetchall_arrayref.
>
> Yeah, I do.  It's basically a customer list export from the database
> that I write out to a temp file, archive using zip, and then print it
> to the browser.  However, I am just iterating with fetchrow, not using
> fetchall.  DBD::Pg sticks everything into memory when you run the
> query, before you fetch anything.

Chris,

In this particular case, could you grab just the unique IDs for the
rows of interest in a first query and then iterate over them (similar
to what Class::DBI does)?  Depending on the row size, this may save a
good deal of memory.  It will obviously be a performance hit....
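
Roughly like this (table and column names are made up):

  # fetch only the primary keys first -- a much smaller result set
  my $ids = $dbh->selectcol_arrayref(
      'SELECT id FROM customers'        # your real WHERE clause here
  );

  # then pull one full row at a time
  my $sth = $dbh->prepare('SELECT * FROM customers WHERE id = ?');
  for my $id (@$ids) {
      $sth->execute($id);
      my $row = $sth->fetchrow_arrayref;
      # process $row
  }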

Sean


Re: DBI memory usage

Posted by Chris Ochs <sn...@gmail.com>.
> 
> Also ask yourself if you really need all of the data at
> once. You may be able to filter it down in the query or
> build some incremental data structures using row-by-row
> iteration instead of fetchall_arrayref.

Yeah, I do.  It's basically a customer list export from the database
that I write out to a temp file, archive using zip, and then print it
to the browser.  However, I am just iterating with fetchrow, not using
fetchall.  DBD::Pg sticks everything into memory when you run the
query, before you fetch anything.
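
The export path is essentially this, condensed (file names are made up,
and the zip step here assumes Archive::Zip):

  use Archive::Zip qw(:ERROR_CODES);

  open my $fh, '>', '/tmp/export.csv' or die $!;
  my $sth = $dbh->prepare('SELECT * FROM customers');
  $sth->execute;  # with DBD::Pg the whole result lands in RAM right here
  while (my $row = $sth->fetchrow_arrayref) {
      print {$fh} join(',', @$row), "\n";  # naive CSV, fine for a sketch
  }
  close $fh or die $!;

  my $zip = Archive::Zip->new;
  $zip->addFile('/tmp/export.csv', 'export.csv');
  $zip->writeToFileNamed('/tmp/export.zip') == AZ_OK
      or die "zip write failed";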

Chris

Re: DBI memory usage

Posted by Steven Lembark <le...@wrkhors.com>.
>> Is there a way to do large queries that return lots of data without
>> having my apache process grow by the equivalent size in RAM of the
>> data returned?
>
> Yes.  Many databases support sending the results a few rows at a time
> instead of all at once.  For specific advice on this, you might check
> your DBD documentation or ask on the dbi-users list.  You can also have
> your program schedule the current apache process to exit after finishing
> the current request if you load a large amount of data into it.

Also ask yourself if you really need all of the data at
once. You may be able to filter it down in the query or
build some incremental data structures using row-by-row
iteration instead of fetchall_arrayref.
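
e.g. (names made up):

  # keep only the running aggregate in memory, not the rows themselves
  my %total_by_state;
  my $sth = $dbh->prepare('SELECT state, amount FROM orders');
  $sth->execute;
  while (my ($state, $amount) = $sth->fetchrow_array) {
      $total_by_state{$state} += $amount;
  }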


-- 
Steven Lembark                                       85-09 90th Street
Workhorse Computing                                Woodhaven, NY 11421
lembark@wrkhors.com                                     1 888 359 3508

Re: DBI memory usage

Posted by Perrin Harkins <pe...@elem.com>.
Chris Ochs wrote:
> Is there a way to do large queries that return lots of data without
> having my apache process grow by the equivalent size in RAM of the
> data returned?

Yes.  Many databases support sending the results a few rows at a time 
instead of all at once.  For specific advice on this, you might check 
your DBD documentation or ask on the dbi-users list.  You can also have 
your program schedule the current apache process to exit after finishing 
the current request if you load a large amount of data into it.
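
For example, DBD::mysql can stream rows from the server instead of
buffering the whole result (other drivers have their own knobs for
this; the table name is made up):

  my $sth = $dbh->prepare('SELECT * FROM big_table',
                          { mysql_use_result => 1 });
  $sth->execute;
  while (my $row = $sth->fetchrow_arrayref) {
      # each row is pulled from the server as you fetch it
  }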

- Perrin