You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cayenne.apache.org by Andrey Razumovsky <ra...@gmail.com> on 2009/05/15 11:03:13 UTC
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of
a selectquery
Hi,
Yes of course. You can use SelectQuery.setFetchLimit(int) and
setFetchStart(int) methods. They do just that.
Andrey
2009/5/15 stefcl <st...@gmail.com>
>
> Hello,
>
> Is there a way I could configure the fetch size of a SelectQuery?
>
> In pure JDBC, in order to be able to handle a large resultset datarow by
> datarow, I need to indicate to the JDBC driver that it should make use of
> cursors by calling Statement.setFetchSize( int ).
>
> If I don't, the driver waits until all rows are returned before giving
> access to the ResultSet, which causes OutOfMemory exceptions.
> Unfortunately, I haven't found how I could force a statement level fetch
> size when working with SelectQuery objects in the cayenne API.
> (Warning: I'm not talking about paging here)
>
> Can someone help?
>
> PS: I use postgresql jdbc4 driver.
>
> --
> View this message in context:
> http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23555842.html
> Sent from the Cayenne - User mailing list archive at Nabble.com.
>
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
Hello,
Setting fetchSize with postgres does make a difference, even in a simple
JDBC select.
If you consider the following code copy-pasted from my test case:
PreparedStatement stmt =
sqlMapper.getCurrentConnection().prepareStatement("SELECT
itemID,sellingprice,margin,popularityrating,name,att1,att2,att3,att4,att5,longtailpos,timest
FROM itemdetails WHERE profileID= 10 ORDER BY longtailpos");
stmt.setFetchSize(1000);
stmt.execute();
ResultSet set = stmt.executeQuery();
int i = 0;
while( set.next())
{
i++;
System.out.println(set.getString(1));
}
It can iterates through approx 1'500'000 rows without any memory issues
(java process stays at approx 20mo while the loop executes). I can also see
that it takes no more than a few seconds before the first row is printed in
the output console, meaning that results are still being fetched from the DB
at the moment the resultset is accessed.
Now if I comment the line "stmt.setFetchSize(1000);" in the above code,
execution stops at "ResultSet set = stmt.executeQuery();" for approx 15
seconds and then an OutOfMemoryException is thrown from JDBC code. Also in
cayenne code, that would be before the resultset is even accessed (making
ResultIterator useless).
A similar problem is exposed here, with a sample stacktrace :
http://www.nabble.com/java.sql.Statement-generates-java.lang.OutOfMemoryError-in-big-tabe-td21568763.html
And the reason is detailed here :
http://jdbc.postgresql.org/documentation/83/query.html#query-with-cursor
Perhaps other drivers that automatically use server side cursors aren't
affected or just ignore this setting but I can say for sure that this
setting is necessary with postGres.
Andrey, you say you keep getting OutOfMemory exceptions no matter which
value you set. Do you get these exceptions while iterating through the
results or during the call to executeQuery?
If you get it while looping through the resultset, it could mean that
somehow the rows objects aren't garbage collected, it can happen with
cayenne version 2.0 due to the dataContext holding strong references to the
dataObjects. I didn't want to write a fix for v3 at first, but I realized
that in v2, I had to manually unregister the dataObjects from the context in
order to keep the memory usage in safe areas.
If I can be of any help, please ask.
Best regards...
Stéphane
--
View this message in context: http://www.nabble.com/Re%3A-Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of--a-selectquery-tp23652781p23703577.html
Sent from the Cayenne - Dev mailing list archive at Nabble.com.
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of a selectquery
Posted by Andrus Adamchik <an...@objectstyle.org>.
Somehow I am not surprised. I may also try that on Oracle when the
code becomes available on trunk.
(BTW not sure that Stephane is subscribed to the dev list, so cc'ying
this message.)
Andrus
On May 24, 2009, at 12:52 PM, Andrey Razumovsky wrote:
> I should say my tests on Postgres and mysql didn't show any results.
> If I
> use small heap size, i get OutOfMemory no matter which fetch size
> was set.
> Fetch speed and memory usage are the same. Seems drivers just ignore
> this
> parameter. Stephane, did your workaround help you?
>
> Drivers of hsql, postres, mysql didn't throw any exceptions.. Maybe
> other
> DBMS drivers work well with this param. I'm going to commit today or
> tomorrow. Nothing will break anything if we add this ability.
>
> Andrey
>
> 2009/5/21 Andrus Adamchik <an...@objectstyle.org>
>
>> Cool :-)
>>
>> On May 21, 2009, at 3:59 PM, Andrey Razumovsky wrote:
>>
>> Now when I know of this JDBC feature, I'd prefer to have it in
>> Cayenne
>>> sooner that later. We would also want to have it for other queries
>>> than
>>> just
>>> only SelectQuery.
>>>
>>
>> Absolutely. This has to be a part of the QueryMetadata on the
>> backend. On
>> the frontend any query that can potentially select data should have a
>> corresponding setter.
>>
>> Could someone help me testing it against different types
>>> of database if I commit?
>>>
>>
>> I can test it on almost all DB's that we support. Of course we
>> should have
>> Cayenne unit tests that will provide regression (i.e. driver XYZ
>> doesn't
>> throw UnsupportedOperationException when we call a corresponding JDBC
>> method).
>>
>> In addition to that I'd like to see if there's really memory/speed
>> savings
>> when using that (i.e. is it really worth it). For that I suggest
>> writing a
>> JDBC test outside of Cayenne, that can be run in profiler against
>> different
>> DB's.
>>
>> Andrus
>>
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of
a selectquery
Posted by Andrey Razumovsky <ra...@gmail.com>.
I should say my tests on Postgres and mysql didn't show any results. If I
use small heap size, i get OutOfMemory no matter which fetch size was set.
Fetch speed and memory usage are the same. Seems drivers just ignore this
parameter. Stephane, did your workaround help you?
Drivers of hsql, postres, mysql didn't throw any exceptions.. Maybe other
DBMS drivers work well with this param. I'm going to commit today or
tomorrow. Nothing will break anything if we add this ability.
Andrey
2009/5/21 Andrus Adamchik <an...@objectstyle.org>
> Cool :-)
>
> On May 21, 2009, at 3:59 PM, Andrey Razumovsky wrote:
>
> Now when I know of this JDBC feature, I'd prefer to have it in Cayenne
>> sooner that later. We would also want to have it for other queries than
>> just
>> only SelectQuery.
>>
>
> Absolutely. This has to be a part of the QueryMetadata on the backend. On
> the frontend any query that can potentially select data should have a
> corresponding setter.
>
> Could someone help me testing it against different types
>> of database if I commit?
>>
>
> I can test it on almost all DB's that we support. Of course we should have
> Cayenne unit tests that will provide regression (i.e. driver XYZ doesn't
> throw UnsupportedOperationException when we call a corresponding JDBC
> method).
>
> In addition to that I'd like to see if there's really memory/speed savings
> when using that (i.e. is it really worth it). For that I suggest writing a
> JDBC test outside of Cayenne, that can be run in profiler against different
> DB's.
>
> Andrus
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of a selectquery
Posted by Andrus Adamchik <an...@objectstyle.org>.
Cool :-)
On May 21, 2009, at 3:59 PM, Andrey Razumovsky wrote:
> Now when I know of this JDBC feature, I'd prefer to have it in Cayenne
> sooner that later. We would also want to have it for other queries
> than just
> only SelectQuery.
Absolutely. This has to be a part of the QueryMetadata on the backend.
On the frontend any query that can potentially select data should have
a corresponding setter.
> Could someone help me testing it against different types
> of database if I commit?
I can test it on almost all DB's that we support. Of course we should
have Cayenne unit tests that will provide regression (i.e. driver XYZ
doesn't throw UnsupportedOperationException when we call a
corresponding JDBC method).
In addition to that I'd like to see if there's really memory/speed
savings when using that (i.e. is it really worth it). For that I
suggest writing a JDBC test outside of Cayenne, that can be run in
profiler against different DB's.
Andrus
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of
a selectquery
Posted by Andrey Razumovsky <ra...@gmail.com>.
Now when I know of this JDBC feature, I'd prefer to have it in Cayenne
sooner that later. We would also want to have it for other queries than just
only SelectQuery. Could someone help me testing it against different types
of database if I commit?
2009/5/20 stefcl <st...@gmail.com>
>
> I have posted a small fix in the jira.
>
>
> I added the following in SelectQuery.java :
> ***************************************
> //quick fix : property for setting the jdbc fetch size
>
> protected Integer cursorFetchSize;
>
> /**
> * Gets the JDBC fetch size for this query.
> * @return the fetch size defined for this query, null if not set.
> */
> public Integer getCursorFetchSize()
> {
> return cursorFetchSize;
> }
>
>
> /**
> * Sets the JDBC fetch size for this query.
> * @param cursorFetchSize the desired fetch size, or null to reset
> it.
> */
> public void setCursorFetchSize(Integer cursorFetchSize)
> {
> this.cursorFetchSize = cursorFetchSize;
> }
>
> //end quick fix
> ************************************
>
> And modified SelectAction.java to add the check in performAction :
> public void performAction(Connection connection, OperationObserver
> observer)
> throws SQLException, Exception {
>
> long t1 = System.currentTimeMillis();
>
> SelectTranslator translator = createTranslator(connection);
> PreparedStatement prepStmt = translator.createStatement();
>
> //quick fix : sets the fetch size
> if( query.getCursorFetchSize() != null )
> prepStmt.setFetchSize( query.getCursorFetchSize() );
> //quick fix end
>
> ResultSet rs = prepStmt.executeQuery();
> QueryMetadata md = query.getMetaData(getEntityResolver());
>
> (...)
>
> ********************************************************
>
>
>
>
> --
> View this message in context:
> http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23635409.html
> Sent from the Cayenne - User mailing list archive at Nabble.com.
>
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
I have posted a small fix in the jira.
I added the following in SelectQuery.java :
***************************************
//quick fix : property for setting the jdbc fetch size
protected Integer cursorFetchSize;
/**
* Gets the JDBC fetch size for this query.
* @return the fetch size defined for this query, null if not set.
*/
public Integer getCursorFetchSize()
{
return cursorFetchSize;
}
/**
* Sets the JDBC fetch size for this query.
* @param cursorFetchSize the desired fetch size, or null to reset it.
*/
public void setCursorFetchSize(Integer cursorFetchSize)
{
this.cursorFetchSize = cursorFetchSize;
}
//end quick fix
************************************
And modified SelectAction.java to add the check in performAction :
public void performAction(Connection connection, OperationObserver observer)
throws SQLException, Exception {
long t1 = System.currentTimeMillis();
SelectTranslator translator = createTranslator(connection);
PreparedStatement prepStmt = translator.createStatement();
//quick fix : sets the fetch size
if( query.getCursorFetchSize() != null )
prepStmt.setFetchSize( query.getCursorFetchSize() );
//quick fix end
ResultSet rs = prepStmt.executeQuery();
QueryMetadata md = query.getMetaData(getEntityResolver());
(...)
********************************************************
--
View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23635409.html
Sent from the Cayenne - User mailing list archive at Nabble.com.
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
Just to say, because it's a standard JDBC feature I don't think that letting
the user set this param can do much harm. Nevertheless, I perfectly
understand that you can't take any change lightly.
In fact most other OR tools let you set this param. In Ibatis you can do it
directly in your query mapping file. Hibernate criteria queries have a
setFetchSize() method and, looking for similar issues on the Jira I found
something like this concerning OJB. Don't get me wrong, I'm not trying to
say that it's a shame or anything negative, just sharing what I could learn
elsewhere on this topic.
I try to avoid this as much as possible but I think I'm going to make this
change myself so I can begin working on my project while you do all the
required testing. I really like cayenne and I have seen that there's a nice
community that kindly answers your questions without trying to make you feel
like an idiot.
Thanks everyone...
Andrus Adamchik wrote:
>
> Thanks for the Jira. Can't give you a time estimate on that, we are an
> open source project after all. All I can say now is that adding this
> API to Cayenne is fairly trivial (you can even take a shot at this
> yourself, if you feel like hacking the Cayenne internals). But before
> we commit anything like that, I'd like us to research the effects of
> this on at least a few of the databases that Cayenne supports, and
> various driver-specific caveats that I am sure will arise. So this
> will take some time to make it a production quality feature.
>
> Andrus
>
>
--
View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23586272.html
Sent from the Cayenne - User mailing list archive at Nabble.com.
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of a selectquery
Posted by Andrus Adamchik <an...@objectstyle.org>.
Thanks for the Jira. Can't give you a time estimate on that, we are an
open source project after all. All I can say now is that adding this
API to Cayenne is fairly trivial (you can even take a shot at this
yourself, if you feel like hacking the Cayenne internals). But before
we commit anything like that, I'd like us to research the effects of
this on at least a few of the databases that Cayenne supports, and
various driver-specific caveats that I am sure will arise. So this
will take some time to make it a production quality feature.
Andrus
On May 17, 2009, at 12:52 PM, stefcl wrote:
>
> According to the JDBC javadoc, setFetchSize(int) is an *hint* you
> can give,
> the driver may decide to ignore it. Nevetheless I'm sure it's useful
> because
> other OR/mapper like IBatis provide a way to set this parameter.
>
> I have filled a Jira issue about this :
> https://issues.apache.org/jira/browse/CAY-1225
> https://issues.apache.org/jira/browse/CAY-1225
>
> I really hope there's a chance to get a fix shortly, because Cayenne
> is the
> only OR/mapper that gave me satisfaction.
>
>
>
> Andrus Adamchik wrote:
>>
>> I haven't experimented with that much (so no idea which drivers take
>> advantage of that and which would simply ignore it), but that would
>> indeed be a useful parameter for ResultIterator scenario. So maybe
>> open a feature request in Jira?
>>
>> Andrus
>>
> --
> View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23581761.html
> Sent from the Cayenne - User mailing list archive at Nabble.com.
>
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
According to the JDBC javadoc, setFetchSize(int) is an *hint* you can give,
the driver may decide to ignore it. Nevetheless I'm sure it's useful because
other OR/mapper like IBatis provide a way to set this parameter.
I have filled a Jira issue about this :
https://issues.apache.org/jira/browse/CAY-1225
https://issues.apache.org/jira/browse/CAY-1225
I really hope there's a chance to get a fix shortly, because Cayenne is the
only OR/mapper that gave me satisfaction.
Andrus Adamchik wrote:
>
> I haven't experimented with that much (so no idea which drivers take
> advantage of that and which would simply ignore it), but that would
> indeed be a useful parameter for ResultIterator scenario. So maybe
> open a feature request in Jira?
>
> Andrus
>
--
View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23581761.html
Sent from the Cayenne - User mailing list archive at Nabble.com.
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of a selectquery
Posted by Andrus Adamchik <an...@objectstyle.org>.
I haven't experimented with that much (so no idea which drivers take
advantage of that and which would simply ignore it), but that would
indeed be a useful parameter for ResultIterator scenario. So maybe
open a feature request in Jira?
Andrus
On May 15, 2009, at 5:45 PM, stefcl wrote:
>
> Thanks,
>
> Since the beginning I am convinced that resultiterators are the
> right way,
> unfortunately they are useless if the jdbc driver loads everything
> in memory
> before giving access to the resultset.
>
> I hope there's something to do about it because it's a showstopper
> for me.
>
>
> Robert Zeigler-6 wrote:
>>
>> Take a look at paginated queries (which isn't /quite/ what you want,
>> but may help you):
>>
>> http://cayenne.apache.org/doc/paginated-queries.html
>>
>> As well as ResultIterator:
>> http://cayenne.apache.org/doc/iterating-through-data-rows.html
>>
>> Robert
>>
>> On May 15, 2009, at 5/159:21 AM , stefcl wrote:
>>
>>>
>>>
>>> Thanks but my problem is not related to paging.
>>>
>>> I would like to be able to execute a select query which returns
>>> approx
>>> 100000 rows with a single selectQuery and process them one by one,
>>> while
>>> keeping only a few of them in memory (using the resultiterator).
>>>
>>> In jdbc, Statement.setFetchSize(1000) tells the jdbc driver to
>>> retrieve
>>> results from the database 1000 at a time and ask for the next 1000
>>> as you're
>>> iterating the resultset. Otherwise its default behavior is to
>>> retrieve the
>>> 100'000 rows in memory before you can start iterating the resultset.
>>>
>>> It's not the same thing as inserting a TOP or a LIMIT clause in the
>>> query.
>>>
>>> Any help appreciated
>>>
>>>
>>> Andrey Razumovsky wrote:
>>>>
>>>> Hi,
>>>>
>>>> Yes of course. You can use SelectQuery.setFetchLimit(int) and
>>>> setFetchStart(int) methods. They do just that.
>>>>
>>>> Andrey
>>>>
>>>
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23560831.html
>>> Sent from the Cayenne - User mailing list archive at Nabble.com.
>>>
>>
>>
>>
>
> --
> View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23561312.html
> Sent from the Cayenne - User mailing list archive at Nabble.com.
>
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
Thanks,
Since the beginning I am convinced that resultiterators are the right way,
unfortunately they are useless if the jdbc driver loads everything in memory
before giving access to the resultset.
I hope there's something to do about it because it's a showstopper for me.
Robert Zeigler-6 wrote:
>
> Take a look at paginated queries (which isn't /quite/ what you want,
> but may help you):
>
> http://cayenne.apache.org/doc/paginated-queries.html
>
> As well as ResultIterator:
> http://cayenne.apache.org/doc/iterating-through-data-rows.html
>
> Robert
>
> On May 15, 2009, at 5/159:21 AM , stefcl wrote:
>
>>
>>
>> Thanks but my problem is not related to paging.
>>
>> I would like to be able to execute a select query which returns approx
>> 100000 rows with a single selectQuery and process them one by one,
>> while
>> keeping only a few of them in memory (using the resultiterator).
>>
>> In jdbc, Statement.setFetchSize(1000) tells the jdbc driver to
>> retrieve
>> results from the database 1000 at a time and ask for the next 1000
>> as you're
>> iterating the resultset. Otherwise its default behavior is to
>> retrieve the
>> 100'000 rows in memory before you can start iterating the resultset.
>>
>> It's not the same thing as inserting a TOP or a LIMIT clause in the
>> query.
>>
>> Any help appreciated
>>
>>
>> Andrey Razumovsky wrote:
>>>
>>> Hi,
>>>
>>> Yes of course. You can use SelectQuery.setFetchLimit(int) and
>>> setFetchStart(int) methods. They do just that.
>>>
>>> Andrey
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23560831.html
>> Sent from the Cayenne - User mailing list archive at Nabble.com.
>>
>
>
>
--
View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23561312.html
Sent from the Cayenne - User mailing list archive at Nabble.com.
Re: Big selects on PostGres : Configuring Statement.setFetchSize() of a selectquery
Posted by Robert Zeigler <ro...@roxanemy.com>.
Take a look at paginated queries (which isn't /quite/ what you want,
but may help you):
http://cayenne.apache.org/doc/paginated-queries.html
As well as ResultIterator:
http://cayenne.apache.org/doc/iterating-through-data-rows.html
Robert
On May 15, 2009, at 5/159:21 AM , stefcl wrote:
>
>
> Thanks but my problem is not related to paging.
>
> I would like to be able to execute a select query which returns approx
> 100000 rows with a single selectQuery and process them one by one,
> while
> keeping only a few of them in memory (using the resultiterator).
>
> In jdbc, Statement.setFetchSize(1000) tells the jdbc driver to
> retrieve
> results from the database 1000 at a time and ask for the next 1000
> as you're
> iterating the resultset. Otherwise its default behavior is to
> retrieve the
> 100'000 rows in memory before you can start iterating the resultset.
>
> It's not the same thing as inserting a TOP or a LIMIT clause in the
> query.
>
> Any help appreciated
>
>
> Andrey Razumovsky wrote:
>>
>> Hi,
>>
>> Yes of course. You can use SelectQuery.setFetchLimit(int) and
>> setFetchStart(int) methods. They do just that.
>>
>> Andrey
>>
>
> --
> View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23560831.html
> Sent from the Cayenne - User mailing list archive at Nabble.com.
>
Re: Big selects on PostGres : Configuring Statement.setFetchSize()
of a selectquery
Posted by stefcl <st...@gmail.com>.
Thanks but my problem is not related to paging.
I would like to be able to execute a select query which returns approx
100000 rows with a single selectQuery and process them one by one, while
keeping only a few of them in memory (using the resultiterator).
In jdbc, Statement.setFetchSize(1000) tells the jdbc driver to retrieve
results from the database 1000 at a time and ask for the next 1000 as you're
iterating the resultset. Otherwise its default behavior is to retrieve the
100'000 rows in memory before you can start iterating the resultset.
It's not the same thing as inserting a TOP or a LIMIT clause in the query.
Any help appreciated
Andrey Razumovsky wrote:
>
> Hi,
>
> Yes of course. You can use SelectQuery.setFetchLimit(int) and
> setFetchStart(int) methods. They do just that.
>
> Andrey
>
--
View this message in context: http://www.nabble.com/Big-selects-on-PostGres-%3A-Configuring-Statement.setFetchSize%28%29-of-a-selectquery-tp23555842p23560831.html
Sent from the Cayenne - User mailing list archive at Nabble.com.