Posted to torque-dev@db.apache.org by Scott Eade <se...@backstagetech.com.au> on 2003/05/31 03:19:55 UTC
LargeSelect returning different numbers of rows. (Was: [vote] release
torque-3.0.1)
Federico Spinazzi wrote:
> - LargeSelect gives me different results on the total number of object
> retrieved with different parameters;
LargeSelect doesn't know the total number of records until such time as
the buffer of records hits the last record. Prior to this it just
indicates that more records exist than the number that have been
retrieved so far. Is this the behavior you are seeing or is it
something else? If you do believe there is a problem can you provide a
test case?
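To illustrate the behaviour described above, here is a minimal sketch of a paged reader that only learns the total once its buffer reaches the last row. It is not Torque's LargeSelect code: the class PagedCounter, its methods, and the one-row look-ahead are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical paged reader used only for illustration; not Torque's LargeSelect. */
class PagedCounter {
    private final List<Integer> table;  // stands in for the rows the query matches
    private final int bufferSize;       // rows handed back per fetch
    private int fetched = 0;            // rows retrieved so far
    private boolean totalKnown = false; // true once the last row has been buffered

    PagedCounter(List<Integer> table, int bufferSize) {
        this.table = table;
        this.bufferSize = bufferSize;
    }

    /** Fetches the next buffer, peeking one row ahead to see whether more rows exist. */
    List<Integer> fetchNext() {
        int peekEnd = Math.min(fetched + bufferSize + 1, table.size());
        List<Integer> peeked = table.subList(fetched, peekEnd);
        boolean more = peeked.size() > bufferSize;
        int keep = more ? bufferSize : peeked.size();
        List<Integer> rows = new ArrayList<>(peeked.subList(0, keep));
        fetched += rows.size();
        totalKnown = !more;
        return rows;
    }

    boolean isTotalKnown() {
        return totalKnown;
    }

    /** "> N" while only a lower bound is known, plain "N" once the end was reached. */
    String describeTotal() {
        return totalKnown ? String.valueOf(fetched) : "> " + fetched;
    }
}
```

With 7 rows and a buffer of 3, this sketch reports a lower bound after the first two fetches and the exact total only after the third.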
Cheers,
Scott
--
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
seade@backstagetech.com.au
Re: LargeSelect returning different numbers of rows. (Was: [vote]
release torque-3.0.1)
Posted by Scott Eade <se...@backstagetech.com.au>.
Hi Federico,
You touch upon a number of issues, I hope the following information is
helpful.
Firstly, LargeSelect isn't going to work very well (and I have certainly
not tested it) for databases that do not natively support the concepts
of limit and offset in their SQL. Torque 3.0 provides support for MySQL
and PostgreSQL (though I think the syntax has changed for PostgreSQL
7.3). I think there is code for Oracle somewhere (either in cvs head or
attached to an issue in Scarab), but I am reasonably sure that there is
no support for this if DB2 is being used (if DB2's SQL dialect supports
these concepts then the first step in getting things to work will be to
implement this in BasePeer). If the database in use does not support
limit/offset or Torque has not been coded to use the native limit/offset
support provided by the database then implementation of these features
will fall back to the Village API. With 37000 rows this will fall flat
on its face (OutOfMemoryError) because Village will run the entire
query and then discard the unwanted rows (using a big chunk of memory
and many CPU cycles in the process). Is there not some filtering that
can be applied to reduce the number of records that need to be
selected? Not sure about you, but the last time I ran a query that
retrieved 37000 rows I only looked at a couple of hundred of them :-).
For an example test case please see LargeSelectTest in cvs. This can be
executed as part of building Torque with maven.
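The difference between native limit/offset and the fallback described above can be sketched roughly as follows. This is an illustration, not Torque or Village code: the class PagingStrategies, both method names, and the table name in the usage note are assumptions. The SQL form shown is the LIMIT ... OFFSET ... syntax that PostgreSQL and MySQL accept.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper contrasting the two paging strategies; not part of Torque. */
class PagingStrategies {

    /** Native paging: the database itself skips and limits the rows. */
    static String nativePageSql(String baseQuery, int offset, int limit) {
        return baseQuery + " LIMIT " + limit + " OFFSET " + offset;
    }

    /**
     * Fallback paging: every row has already been read into memory (allRows),
     * and all but one page is then discarded. Memory use grows with the size
     * of the whole result, not with the size of the page.
     */
    static <T> List<T> fallbackPage(List<T> allRows, int offset, int limit) {
        int from = Math.min(Math.max(offset, 0), allRows.size());
        int to = Math.min(from + Math.max(limit, 0), allRows.size());
        return new ArrayList<>(allRows.subList(from, to));
    }
}
```

For example, nativePageSql("SELECT * FROM articoli", 6000, 3000) builds a query that returns at most 3000 rows, whereas fallbackPage needs all 37000 rows in memory before it can return the same page.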
I can only guess that the record count issues are a result of relying on
Village scrolling the records - not to say that the problem is with
Village, but rather that LargeSelect has not been tested in this
situation. The fact that it is producing different results at different
times is strange.
The ConcurrentModificationExceptions you are experiencing may be caused
by the fact that the data is still being processed when you attempt to
retrieve more data. This would be a bug, but certainly not one that I
have seen (perhaps because I haven't tested LargeSelect without native
limit/offset support or with a page size of 5000). It would perhaps be
a good idea to generate a more extreme test case to try and root this
problem out.
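As a reminder of how this exception arises in general, the sketch below adds to an ArrayList while it is being iterated, which is the single-threaded equivalent of reading a buffer that is still being filled. It does not reproduce LargeSelect's internals; the class CmeDemo and its methods are made up for the example.

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;

/** Hypothetical demo class; it does not use or mirror any Torque code. */
class CmeDemo {

    private static List<Integer> newBuffer() {
        List<Integer> buffer = new ArrayList<>();
        for (int i = 0; i < 5; i++) {
            buffer.add(i);
        }
        return buffer;
    }

    /** Returns true if appending during iteration raised the exception. */
    static boolean appendWhileIterating() {
        List<Integer> buffer = newBuffer();
        try {
            for (Integer row : buffer) {
                if (row == 0) {
                    buffer.add(99); // structural change while the iterator is live
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    /** Iterating over a snapshot copy avoids the exception; returns the final size. */
    static int appendWhileIteratingSnapshot() {
        List<Integer> buffer = newBuffer();
        for (Integer row : new ArrayList<>(buffer)) {
            if (row == 0) {
                buffer.add(99); // safe: the loop walks the copy, not the buffer
            }
        }
        return buffer.size();
    }
}
```

Copying the buffer before handing it to a reader is only one possible remedy; synchronizing access to the buffer is another.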
If you are interested in contributing to Torque, please see
http://jakarta.apache.org/site/getinvolved.html
Regards,
Scott
--
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
Re: LargeSelect returning different numbers of rows. (Was: [vote]
release torque-3.0.1)
Posted by Federico Spinazzi <f....@masterhouse.it>.
Scott Eade wrote:
> Federico Spinazzi wrote:
>
>> - LargeSelect gives me different results on the total number of
>> object retrieved with different parameters;
>
>
> LargeSelect doesn't know the total number of records until such time
> as the buffer of records hits the last record. Prior to this it just
> indicates that more records exist than the number that have been
> retrieved so far. Is this the behavior you are seeing or is it
> something else? If you do believe there is a problem can you provide
> a test case?
>
> Cheers,
>
> Scott
>
Hmm,
I have tried to retrieve about 37000 records from a DB2 table with
LargeSelect, because otherwise I get an OutOfMemoryError. I also gave up
because it was too slow.
The code is the following:
try {
    Criteria crit = new Criteria();
    LargeSelect ls = new LargeSelect(crit,
            3000, "it.masterhouse.torque.termopoli.ArticoliPeer");
    int total = 0;
    while (ls.getNextResultsAvailable()) {
        List l = ls.getNextResults();
        total += l.size();
    }
    System.out.println("record selected: " + total);
} catch (Throwable t) {
    t.printStackTrace();
}
(I don't claim this code makes sense; I just want to spot the 'bug'.)
When trying to choose the best values for pageSize and memoryPageLimit I
discovered that many combinations gave me the correct result, while a
pageSize of 3000 (as in the code) gives me 69000 records instead of 37187.
I can try to move the data into hsql and see if the problem shows up again,
because I don't know another way to build a test case ...
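One way to narrow down a count like 69000 versus 37187 is to check whether the surplus rows are repeats of rows already seen. The sketch below is a diagnostic idea only, assuming each row can be reduced to an integer key. The class PageAudit is hypothetical, and its pages are plain lists rather than whatever LargeSelect actually returns.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical diagnostic helper; the keys stand in for primary keys of the rows. */
class PageAudit {

    /** Returns {total rows seen, distinct keys seen} across all pages. */
    static int[] audit(List<List<Integer>> pages) {
        int total = 0;
        Set<Integer> distinct = new HashSet<>();
        for (List<Integer> page : pages) {
            total += page.size();
            distinct.addAll(page);
        }
        return new int[] { total, distinct.size() };
    }
}
```

If the first number exceeds the second, some rows were returned on more than one page; if both match but are still too high, the problem lies elsewhere.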
However, I'm just retrying the failing test and I'm getting an
OutOfMemoryError ...
Moreover, if I try with pageSize 10 and memoryPageLimit 5000 I get a
java.util.ConcurrentModificationException ...
I think that LargeSelect, if useful, should be reworked.
As I'll need to use Torque in the future, I volunteer to help.
Can someone help me understand how I can do that?
Thank you very much for your attention.
Federico Spinazzi