Posted to torque-user@db.apache.org by Yannick Richard <Ya...@matricis.com> on 2007/10/10 15:02:17 UTC

LargeSelect example ?

Hi,

I am currently working on a Torque project that will handle database
synchronization.

We are getting an out-of-memory error when selecting a large amount of
data from the database.

Here is the call we are using:

List ObjectsFromDB = ObjectPeer.doSelect(criteria, connection);

I saw the LargeSelect class you worked on, but I cannot find any Java
example that would help me move forward.

Could you point me to an example or help me understand how to
integrate LargeSelect?

Thanks a lot for whatever you can do.

Regards,

Yannick Richard


Re: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.
After a few days of testing different scenarios, my guess is that TORQUE-84
needs to be completed and tested in order to fully support limit/offset in
SQL Server 2005.

Without the patch, offset seems to be unsupported: PageSize *
MemoryPageLimit has to cover the total number of records, or some of them
are never read.

With the patch applied, LargeSelect.getNextResultsAvailable() never returns
false, so the read loop never ends and some extra data is read. I even tried
calculating the total number of pages from the record count and calling
LargeSelect.getPage(x): same thing...
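
A minimal sketch of the "calculate total pages" idea, written against a hypothetical Pager interface rather than Torque's LargeSelect (the interface, class, and method names here are illustrative assumptions). It only shows how a page cap keeps a loop from running forever when the "more results available" check never turns false. It does not address the extra rows reported above.

```java
import java.util.List;

final class BoundedPageLoop {
    /** Hypothetical minimal view of a pager; this is not Torque's API. */
    interface Pager<T> {
        boolean hasNext();
        List<T> next();
    }

    /** Number of pages needed to cover totalRecords (ceiling division). */
    static int totalPages(int totalRecords, int pageSize) {
        return (totalRecords + pageSize - 1) / pageSize;
    }

    /** Reads at most maxPages pages, even if hasNext() never returns false. */
    static <T> int readBounded(Pager<T> pager, int maxPages) {
        int rows = 0;
        for (int page = 0; page < maxPages && pager.hasNext(); page++) {
            rows += pager.next().size();
        }
        return rows;
    }
}
```

With 185 000 records and a page size of 100, totalPages gives 1850, so the loop stops after 1850 pages regardless of what hasNext() reports.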

For now, I will put that on ice since I have a lot of other tasks to do ;-)
I will keep an eye on this thread and on TORQUE-84 development progress.

Thanks to all for your valuable help,

Regards,
Yannick



-- 
View this message in context: http://www.nabble.com/LargeSelect-example---tf4605414.html#a13510951
Sent from the Apache DB - Torque Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: torque-user-unsubscribe@db.apache.org
For additional commands, e-mail: torque-user-help@db.apache.org


Re: LargeSelect example ?

Posted by Thomas Vandahl <tv...@apache.org>.
Scott Eade wrote:
> LargeSelect exists specifically so that you can read through a large
> number of records in chunks of size memoryPageLimit.

If I understand Yannick correctly, he has an issue with MS-SQL not
working correctly with LargeSelect. I remember some problems with the
native limit/offset support of MS-SQL which should have been fixed some
time ago. If the problem can still be reproduced, it probably isn't fixed. So
I would recommend providing a test case that fails and opening a JIRA
issue.

Bye, Thomas.




Re: LargeSelect example ?

Posted by Scott Eade <se...@backstagetech.com.au>.
LargeSelect exists specifically so that you can read through a large 
number of records in chunks of size memoryPageLimit.

If you want to read all of the records into memory at once you would not 
use LargeSelect.  If all the records will not fit into memory then you 
use LargeSelect to automatically break up the query result using 
offset/limit queries (which is not supported by Torque for all RDBMSs).

If you are presenting data to users you use pageSize to determine the 
number of records at a time that you want to pull from those available 
(memoryPageLimit).  If you are not presenting the data to a user you 
should set pageSize to the same value as memoryPageLimit.

LargeSelect is essentially a buffering mechanism:
- memoryPageLimit is a buffer of records from the overall query result.
  LargeSelect pulls in more data using offset and limit whenever data
  beyond what it has in memory is requested.
- pageSize determines how much of the buffered data you want to access
  in each request.

When data within the memory buffer is requested, it can be provided
without executing a further query.  If data outside of the memory
buffer is requested, the existing buffer is discarded and a new
one that contains the requested data is filled by querying for a subset
of the full query results.

So to run through your 185K records you simply keep retrieving them a 
page at a time using getNextResults() with a memoryPageLimit well below 
185K.  If this is not working for you then I would suggest that the 
offset/limit processing for the RDBMS you are using may not be fully 
implemented, may have a bug, or may be falling back to processing that 
actually retrieves all of the records and discards those outside of the 
offset/limit (this would totally defeat the purpose of using LargeSelect).
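
The buffering described above can be sketched in plain Java. This is a hypothetical stand-in, not Torque's LargeSelect: the class PagedBufferSketch, its fields, and the in-memory "table" are assumptions made for illustration, and the subList call stands in for an offset/limit query. In Torque itself the loop would be driven by the getNextResultsAvailable() and getNextResults() methods mentioned in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of a paged read buffer; not Torque code. */
final class PagedBufferSketch {
    private final List<Integer> table;   // stands in for the full query result
    private final int pageSize;          // rows handed out per request
    private final int bufferRows;        // rows held in memory at once
    private List<Integer> buffer = new ArrayList<>();
    private int bufferStart = 0;         // offset of buffer.get(0) in the table
    private int nextRow = 0;             // offset of the next row to hand out
    int queriesRun = 0;                  // how many "queries" were executed

    PagedBufferSketch(List<Integer> table, int pageSize, int bufferRows) {
        if (pageSize <= 0 || bufferRows < pageSize) {
            throw new IllegalArgumentException("need 0 < pageSize <= bufferRows");
        }
        this.table = table;
        this.pageSize = pageSize;
        this.bufferRows = bufferRows;
    }

    boolean hasNext() {
        return nextRow < table.size();
    }

    List<Integer> nextPage() {
        int end = Math.min(nextRow + pageSize, table.size());
        // Refill only when the requested rows fall outside the buffer.
        if (nextRow < bufferStart || end > bufferStart + buffer.size()) {
            fill(nextRow);
        }
        List<Integer> page =
            new ArrayList<>(buffer.subList(nextRow - bufferStart, end - bufferStart));
        nextRow = end;
        return page;
    }

    /** Discards the old buffer and loads bufferRows rows starting at offset. */
    private void fill(int offset) {
        int to = Math.min(offset + bufferRows, table.size());
        buffer = new ArrayList<>(table.subList(offset, to)); // like LIMIT/OFFSET
        bufferStart = offset;
        queriesRun++;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            rows.add(i);
        }
        PagedBufferSketch reader = new PagedBufferSketch(rows, 100, 300);
        int seen = 0;
        while (reader.hasNext()) {
            seen += reader.nextPage().size();
        }
        System.out.println("rows read: " + seen + ", queries: " + reader.queriesRun);
    }
}
```

At no point does the sketch hold more than bufferRows rows, which is the property that matters for a batch job: memory use is bounded by the buffer size, not by the size of the result.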

Have you run the LargeSelect tests against your RDBMS?  You should do so 
and ensure that the offset/limit is fully implemented.

Scott






Re: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.
Thank you, Scott, for your great explanation of LargeSelect.

I understand that its goal is to view a portion of the data at a time:
(PageSize * MemoryPageLimit) records. We already have a program that uses
Torque objects for synchronization on a thread that runs every 10 seconds
and maps objects from one side to the other. I modified it so that
we can launch it in batch mode, because too much data had accumulated on one
database side (around 185 000 records).

Now I need to read them all with LargeSelect. As I said, it does not read
all records as expected unless I specify PageSize and MemoryPageLimit
values high enough to cover the total number of records. As you suggest, I
would use 435*435 (189 225) in order to read them all successfully... This
would surely cause a heap memory error. What should I do if I do not
know the exact volume of data to process? Do I need to calculate it
dynamically from the total record count?

With your example, if I set a PageSize of 100, MemoryPageLimit will equal 5:
(100 * 5 < 250) ? (250 / 100) : 5
I will then miss most of the data; it will only read 1000 records out of
185 000...

You said that it is possible to configure LargeSelect to pull in just one
page of data at a time.
How can I do that? That would resolve my issue, since it would stop
LargeSelect from reading all records in one shot when it is instantiated.

Thanks for your help! It is appreciated.

Regards,
Yannick







Re: LargeSelect example ?

Posted by Scott Eade <se...@backstagetech.com.au>.
I hadn't been paying close attention to this thread, but it seems that
a couple of points are somehow being missed:

1. If you have a large amount of data, how much of it is the user
actually going to be able to view in practice?  It is not such a good
idea to provide the user with a means of browsing through a million
records - they will never do so.  You need to provide the ability to
filter the data down to a practical number of records that the user can
then view.
2. If you run a query that pulls in one million records you are more
than likely going to run out of memory.  This is in fact the problem
that LargeSelect seeks to address.  Instead of pulling in all of the
records, it pulls in a subset of these that can then be
presented a page at a time.  While you can configure LargeSelect to pull
in just one page of data at a time, this may be at odds with the
complexity of the query and the amount of time it takes to execute.  To
counter this, LargeSelect provides the ability to cache a configurable
number of pages' worth of data - this way the user can at least browse
through a few pages of data without triggering an expensive query for
every hit.  It is up to you to determine how much data will be presented
on any given page and how many pages of data to read ahead - make the
values too large and you will still run out of memory.

I am a heavy user of LargeSelect.  I use a pageSize of between 10 and 
100 (as selected by the user) and a memoryPageLimit of:

	(pageSize * 5 < 250) ? (250 / pageSize ) : 5

And everything works nicely.
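
As a worked example, the formula above can be evaluated for a few page sizes. The class and method names are made up for illustration; the expression itself is the one quoted above. Reading memoryPageLimit as a number of pages, which is how the formula appears to use it, the result keeps roughly 250 rows buffered for small pages and never drops below 5 pages.

```java
final class MemoryPageLimitExample {
    /** The expression quoted above, unchanged. */
    static int memoryPageLimit(int pageSize) {
        return (pageSize * 5 < 250) ? (250 / pageSize) : 5;
    }

    public static void main(String[] args) {
        for (int pageSize : new int[] {10, 25, 50, 100}) {
            int limit = memoryPageLimit(pageSize);
            System.out.println("pageSize=" + pageSize
                + " memoryPageLimit=" + limit
                + " bufferedRows=" + (pageSize * limit));
        }
    }
}
```

For pageSize 10 this gives 25 pages (250 rows), for 25 it gives 10 pages (250 rows), and for 50 or 100 it gives 5 pages (250 and 500 rows respectively).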

I have no idea whether or not TORQUE-84 works, but it is unlikely to be
committed without the addition of test cases, and even then it will
require a committer to take the time to ensure it behaves correctly.
From what Greg is saying, for MS SQL TORQUE-84 should not be required,
but other changes at svn trunk are.

LargeSelect is about presenting data to users - as I said above, a user 
is never going to look at 1 million records.  You on the other hand are 
working on database synchronization, so I assume you are working through 
a large number of records (8000 was mentioned somewhere) that are not 
actually being presented to users.  The first question I would ask is 
whether or not you need to instantiate the data as Torque objects - i.e. 
could you get by with using native SQL (most likely quicker when dealing 
with bulk data like this).  That said, there should be no reason why you 
cannot use LargeSelect for your purposes - i.e. to limit the number of 
records in memory at any given time.  To do this I would set pageSize 
and memoryPageLimit to the same value, a value that maximises throughput 
by balancing the trade-off between memory use and query execution time.

HTH,

Scott





RE: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.
I checked out the current CVS head (without the TORQUE-84 patch) and ran some
tests in order to better explain what is happening. It seems that PageSize *
MemoryPageLimit needs to cover the total number of records. For example, if you
have 8000 records to read and MemoryPageLimit is set to the default of 5,
PageSize must be at least 8000/5/2 = 800. If a value lower than
800 is used, some records won't be read... When you have as many as
185 000 records to read, the limit becomes memory: 185 000/5/2 = 18 500
minimum. That means 92 500 records in memory at one time...
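
The arithmetic above can be checked with a short sketch. The factor of 2 is taken purely from the observation reported in this message; it is not a documented property of LargeSelect, and the class and method names are illustrative assumptions.

```java
final class ObservedCoverageExample {
    /**
     * Smallest PageSize such that PageSize * memoryPageLimit * 2 covers
     * totalRecords, following the behavior observed in this message.
     */
    static int minPageSize(int totalRecords, int memoryPageLimit) {
        int window = memoryPageLimit * 2;
        return (totalRecords + window - 1) / window; // ceiling division
    }

    /** Rows held in memory for a given PageSize and MemoryPageLimit. */
    static int rowsInMemory(int pageSize, int memoryPageLimit) {
        return pageSize * memoryPageLimit;
    }

    public static void main(String[] args) {
        int small = minPageSize(8000, 5);
        int large = minPageSize(185000, 5);
        System.out.println("8000 records: PageSize >= " + small);
        System.out.println("185000 records: PageSize >= " + large
            + ", rows in memory = " + rowsInMemory(large, 5));
    }
}
```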

To reproduce the situation, the LargeSelect unit tests should not size the
Authors test data from PageSize and MemoryPageLimit. Because they do, all
records are covered and the behavior that I just described cannot be reproduced.
Anyway, is 9*9 records a "Large" Select test?

As I already said, when I use the TORQUE-84 patch,
LargeSelect.getNextResultsAvailable() always returns true, so the read
loop never ends ;-(

Could someone clarify, please?



-- 
View this message in context: http://www.nabble.com/LargeSelect-example---tf4605414.html#a13477318
Sent from the Apache DB - Torque Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: torque-user-unsubscribe@db.apache.org
For additional commands, e-mail: torque-user-help@db.apache.org


RE: LargeSelect example ?

Posted by Greg Monroe <Gr...@DukeCE.com>.
As a quick aside, it would be much easier to follow your
messages if your embedded comments were not prefixed
with one or more >'s.  That makes it really hard to see which
comments are new and which are old.

That said, I tested the current CVS head (which is 99.9% 
final release for 3.3) against MS SQL 2000 just last 
week.  In order for this to pass all the Limit / 
LargeSelect tests in the test project, I committed some 
changes to the DBSybase class (which MS SQL extends).

So, try checking out the latest from CVS and using this.
This should work with MS SQL 2005.  The support is generic
across all MS SQL versions, so it is "pseudo" support that
requires more data than requested to be read and then "trimmed"
down.
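
To make the "read more than requested, then trim" idea concrete, here is a minimal Java sketch. It is not the DBSybase code. It only assumes that a database without a native offset can still cap the number of rows returned (for example with a TOP-style clause), so a page of `limit` rows at `offset` is obtained by reading the first `offset + limit` rows and discarding the first `offset`. The names `TrimSketch` and `page` are made up for illustration, and an in-memory iterator stands in for the result set.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Illustration only: emulating limit/offset by over-reading and trimming. */
final class TrimSketch {

    /**
     * Returns up to {@code limit} rows starting at zero-based {@code offset}.
     * The iterator stands in for a result set capped at offset + limit rows.
     */
    static <T> List<T> page(Iterator<T> rows, int offset, int limit) {
        List<T> result = new ArrayList<T>();
        int index = 0;
        // Read at most offset + limit rows. The first `offset` rows are
        // fetched from the source but thrown away (the "trimming").
        while (rows.hasNext() && index < offset + limit) {
            T row = rows.next();
            if (index >= offset) {
                result.add(row);
            }
            index++;
        }
        return result;
    }
}
```

With ten rows numbered 0 to 9, `TrimSketch.page(rows.iterator(), 4, 3)` returns rows 4, 5 and 6, but seven rows are read to get them. Under this scheme later pages cost more than earlier ones.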


> -----Original Message-----
> From: YannickR [mailto:Yannick.Richard@matricis.com] 
> Sent: Friday, October 26, 2007 12:26 PM
> To: torque-user@db.apache.org
> Subject: Re: LargeSelect example ?
> 
> 
> > 
> > Is the patch working or not ? The status on
> > https://issues.apache.org/jira/browse/TORQUE-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> > seems to be unresolved...
> > 
> > Could someone help me on that one ?
> > Can I still use LargeSelect with MSSQL 2005 ?
> > 
> > Regards,
> > Yannick Richard
> > 
DukeCE Privacy Statement:
Please be advised that this e-mail and any files transmitted with
it are confidential communication or may otherwise be privileged or
confidential and are intended solely for the individual or entity
to whom they are addressed. If you are not the intended recipient
you may not rely on the contents of this email or any attachments,
and we ask that you please not read, copy or retransmit this
communication, but reply to the sender and destroy the email, its
contents, and all copies thereof immediately. Any unauthorized
dissemination, distribution or copying of this communication is
strictly prohibited.


Re: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.

YannickR wrote:
> 
> 
> Sorry for the confusion...
> I thought I had succeeded in running LargeSelect with SQL Server 2005, but
> it fails: it continues to read even after there are no more records...
> I did install the TORQUE-84 patch in order to support Limit/Offset.
> 
> My last post on the forum, talking about removing the patch, was wrong...
> Without the patch it only works when you specify a PageSize * MemoryPageLimit
> that covers all your records... If not, it will stop reading before the end
> of the records.
> 
> Is the patch working or not ? The status on
> https://issues.apache.org/jira/browse/TORQUE-84?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
> seems to be unresolved...
> 
> Could someone help me on that one ?
> Can I still use LargeSelect with MSSQL 2005 ?
> 
> Regards,
> Yannick Richard
> 
-- 
View this message in context: http://www.nabble.com/LargeSelect-example---tf4605414.html#a13430608
Sent from the Apache DB - Torque Users mailing list archive at Nabble.com.



Re: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.

YannickR wrote:
> 
> 
> Never mind, I found the solution :jumping:
> Reading the Torque documentation, I thought MSSQL was not supported by the
> Limit and Offset features used by LargeSelect, so I installed the TORQUE-84
> patch from https://issues.apache.org/jira/browse/TORQUE-84 and this was
> causing getNextResultsAvailable() to not function correctly...
> 
> I was also misunderstanding the use of PageSize and MemoryPageLimit. The
> total batch size in memory is not just the number of records in a page but
> PageSize * MemoryPageLimit...
> 
> So without that patch, which is not complete anyway, and after some time
> spent on RTFM... everything is fine now ! Thanks for your help !
>  
> 
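
The PageSize * MemoryPageLimit arithmetic can be checked with a few lines of Java. This is only a sketch of the calculation using the figures quoted in this thread. `BlockMath` and its methods are hypothetical helpers, not part of Torque.

```java
/** Illustration only: how many rows one in-memory block holds. */
final class BlockMath {

    /** Rows held in memory at once: pageSize * memoryPageLimit. */
    static long rowsPerBlock(int pageSize, int memoryPageLimit) {
        return (long) pageSize * memoryPageLimit;
    }

    /** Number of blocks needed to cover totalRows, rounded up. */
    static long blocksNeeded(long totalRows, int pageSize, int memoryPageLimit) {
        long block = rowsPerBlock(pageSize, memoryPageLimit);
        return (totalRows + block - 1) / block;
    }
}
```

With a PageSize of 1000 and a MemoryPageLimit of 20, one block holds 20 000 rows, so 10 000 records fit in a single block. With a MemoryPageLimit of 5, a block holds 5 000 rows and two blocks are needed, so the second block depends on the offset working. With 5000 and 50, a single block holds 250 000 rows, which could plausibly exhaust a small heap depending on row size.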
-- 
View this message in context: http://www.nabble.com/LargeSelect-example---tf4605414.html#a13427019
Sent from the Apache DB - Torque Users mailing list archive at Nabble.com.



Re: LargeSelect example ?

Posted by YannickR <Ya...@matricis.com>.

Yannick Richard wrote:
> 
> 
> With your explanations I succeeded in running LargeSelect. It now works
> when reading 10 000 records with a PageSize of 1000 and a
> MemoryPageLimit of 20. I don't know if it is normal, but when I use a
> MemoryPageLimit of 5 (5 x 1000 records), it reads records 1 - 5000 of 10000
> again and again and never gets out of the loop...
> 
> Next, I tried a PageSize of 5000 and a MemoryPageLimit of 50 in order to
> read fewer than 250 000 records, and I got the following exception:
> Exception in thread "Thread-2" java.lang.OutOfMemoryError: Java heap space
> 
> I then modified the Eclipse shortcut arguments to better manage heap
> memory, etc.:
> -vmargs -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabled
> -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=256M -Xms128M -Xmx1024M
> 
> Unfortunately, it hung on the first call of getNextResults() for a whole
> night...
> When I debug it I can see that the thread is sleeping in getResults(start,
> size) on the following loop:
> while (((start + size - 1) > currentlyFilledTo) && !queryCompleted)
> 
> Is this a memory problem or did I do something wrong ?
> 
> Regards,
> Yannick Richard
> 
> 
> 
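
The loop condition quoted above suggests a consumer that waits until a background loader has filled enough rows or has flagged completion. Below is a minimal Java sketch of that waiting pattern, with a timeout added so the wait cannot last all night. It is not the LargeSelect source. `WaitSketch`, `rowsLoaded`, `complete`, `awaitFilled` and the timeout parameter are all hypothetical, and only the loop condition is taken from the message.

```java
/** Illustration only: waiting for a loader to fill rows, with a timeout. */
final class WaitSketch {
    private int currentlyFilledTo = -1;      // index of the last row loaded so far
    private boolean queryCompleted = false;  // set when the loader has finished

    /** Called by the loader after rows up to index upTo are available. */
    synchronized void rowsLoaded(int upTo) {
        currentlyFilledTo = upTo;
        notifyAll();
    }

    /** Called by the loader when there is nothing more to read. */
    synchronized void complete() {
        queryCompleted = true;
        notifyAll();
    }

    /**
     * Waits until rows start..start+size-1 are loaded or the query completes.
     * Returns false if the timeout expires or the thread is interrupted first.
     */
    synchronized boolean awaitFilled(int start, int size, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (((start + size - 1) > currentlyFilledTo) && !queryCompleted) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;  // give up instead of waiting indefinitely
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

If the loader never reports progress and never calls `complete()`, for instance because it failed, `awaitFilled` returns false after the timeout rather than blocking. Whether that is what happened in the case described here is not something this sketch can confirm.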
-- 
View this message in context: http://www.nabble.com/LargeSelect-example---tf4605414.html#a13407505
Sent from the Apache DB - Torque Users mailing list archive at Nabble.com.



Re: LargeSelect example ?

Posted by Thomas Vandahl <tv...@apache.org>.
Yannick Richard wrote:
> Hi,
> 
>  
> 
> I am currently working on a Torque project that will handle database
> synchronization. 
> 
> The problem we have is an Out of Memory exception while selecting a big
> bunch of data from the database.
> 
>  
> 
> Here is the command we are using :
> 
> List ObjectsFromDB = ObjectPeer.doSelect(criteria, connection);
> 
>  
> 
> I saw the LargeSelect class you worked on but cannot find any Java
> example that could help me go forward.
> 
> Could you help me point to an example or help me understand how to
> integrate LargeSelect ?

Just a few hints, I don't have a complete example at hand:

	LargeSelect ls = new LargeSelect(criteria, pageSize,
				memoryPageLimit,
				ObjectPeer.class.getName());

where the pageSize defines how many records to get with one call and the
memoryPageLimit defines how many of these pages to "read ahead".

With this object you can now loop through the pages and LargeSelect will
load the necessary data as needed, (pageSize * memoryPageLimit) records
at a time. Like:

	while (ls.getNextResultsAvailable())
	{
		List ObjectsFromDB = ls.getNextResults();
		// do what is necessary
	}

See the JavaDoc at
http://db.apache.org/torque/releases/torque-3.3/runtime/apidocs/index.html
for more information.
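
The read-ahead behaviour described above can be pictured with a small self-contained Java sketch. It does not use Torque at all: `PagedReaderSketch`, `hasNextPage` and `nextPage` are invented names, and an in-memory list stands in for the table. It only shows pages of pageSize rows being served from a block of (pageSize * memoryPageLimit) rows that is refilled once it is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustration only: a stand-in for block-wise paging, not Torque code. */
final class PagedReaderSketch {
    private final List<Integer> table;  // stands in for the database table
    private final int pageSize;         // rows handed out per call
    private final int blockSize;        // pageSize * memoryPageLimit
    private List<Integer> block = new ArrayList<Integer>();
    private int blockStart = 0;         // table index of the first row in block
    private int next = 0;               // table index where the next page starts
    int blockLoads = 0;                 // how many times a block was fetched

    PagedReaderSketch(List<Integer> table, int pageSize, int memoryPageLimit) {
        this.table = table;
        this.pageSize = pageSize;
        this.blockSize = pageSize * memoryPageLimit;
    }

    boolean hasNextPage() {
        return next < table.size();
    }

    List<Integer> nextPage() {
        // Refill the in-memory block when the next page lies beyond it.
        if (block.isEmpty() || next >= blockStart + block.size()) {
            int end = Math.min(next + blockSize, table.size());
            block = new ArrayList<Integer>(table.subList(next, end));
            blockStart = next;
            blockLoads++;
        }
        int from = next - blockStart;
        int to = Math.min(from + pageSize, block.size());
        next += to - from;
        return new ArrayList<Integer>(block.subList(from, to));
    }
}
```

For a table of 10 rows with a pageSize of 3 and a memoryPageLimit of 2, the loop `while (r.hasNextPage()) { r.nextPage(); }` hands out four pages and fetches two blocks of at most six rows each.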

Bye, Thomas.
