You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-user@db.apache.org by Matt Chambers <ch...@imageworks.com> on 2008/05/14 03:11:45 UTC

top N reporting with derby

Hi guys, new to the list.   I'm a new Derby user, just grabbed the 
latest version.

I have a Tomcat application that sends some statistics to a server for 
processing.  The statistics are not mission critical or anything, they 
are just nice to have, so if the statistics server goes down, no big 
deal.  Its actually never gone down but in case it does, I would like to 
spool the updates into a Derby DB.  So I changed the code around a bit 
to always store and forward instead of sending directly to the stats 
server using Derby but I'm stuck.

I have two requirements.
1. that if the stats server is down for days that I should be able to 
process the spool in small chunks over time when it comes back up.
2. each row has to be processed in the order in which i was placed in 
the table or else the stats server could drop them anyway.

Normally, I would do a SELECT * FROM spool_table ORDER BY date_inserted 
ASC LIMIT 100 to grab the oldest entries but Derby can't do limit and 
you can't user ORDER BY in sub selects.  So how do I get my data out of 
the table in small chunks in the same order it went in?

Any ideas?

-Matt



Re: top N reporting with derby

Posted by Six Fried Rice <te...@sixfriedrice.com>.
On May 15, 2008, at 9:02 AM, Matt Chambers wrote:

> Just googling around I would say that the LIMIT/OFFSET feature has  
> to be the #1 requested Derby feature of all time.

Paging is such a necessary part of web development, and MySQL, despite  
its warts, really gets it right. It is a hassle to have to jump  
through hoops (subselects, temp tables, inverted sort orders, ...) to  
do it. It is a shame because apparently the "standard" has no  
provision for this. But when you're putting data on the web, you  
simply must page records. And with MySQL it is a snap to augment any  
query to add paging, and it is fast.

So I am not surprised this is a major FAQ. I have the same problem  
with MSSQL as well. But my vague memory is that I can put ROW_NUMBER  
in a FROM clause in Sybase, and then use it in my WHERE, making it  
easy there as well.

We're pretty much back to square 1 on paging in Derby because we need  
to sort *before* we page (obviously). So now we're trying temporary  
tables (nuisance because our system uses configurable queries) and  
just spinning through the result set in Java.

Geoff

Re: top N reporting with derby

Posted by Bryan Pendleton <bp...@amberpoint.com>.
> Just googling around I would say that the LIMIT/OFFSET feature has to be 
> the #1 requested Derby feature of all time.

It absolutely is! https://issues.apache.org/jira/browse/DERBY-581

And the more people that get involved, the sooner we'll get it built
and operational.

The row_number() changes in DERBY 10.4 are a great step, but there's
still a lot of work left to do, and lots of opportunities for more
people to get involved and help. In addition to feature development
work, we could use help on testing, examples, documentation,
performance analysis, etc.

The effort that the community is putting into investigating the
new 10.4 row_number feature is great! I hope it continues, and I
hope we can incorporate that effort into ongoing improvements to
these features.

thanks,

bryan


Re: top N reporting with derby

Posted by Matt Chambers <ch...@imageworks.com>.
Six Fried Rice wrote:
> On May 14, 2008, at 8:09 PM, Daniel Noll wrote:
>
>> This is quite interesting.  Is there also a way to do this if you 
>> need to join
>> to another table, such that it selects rows 100 to 200 of the first 
>> table
>> before doing the join to the second?
>
> Sorry for the confusion. My post was completely bogus. That's what I 
> get for going from memory. Matt is completely correct: it seems you 
> need a sub select to get a row number in a where clause, and you can't 
> use order by in a sub select. Bummer really. I'm not sure *why* my 
> query doesn't work, but it doesn't. I should say, though, that I'm a 
> total derby newbie so it probably makes great sense.
>
> Geoff
>
Just googling around I would say that the LIMIT/OFFSET feature has to be 
the #1 requested Derby feature of all time.

-- 
-Matt


<http://twiki.spimageworks.com/twiki/bin/view/Software/CueDevelopment> 
<http://twiki.spimageworks.com/twiki/bin/view/Software/MiddleTier>

RE: top N reporting with derby

Posted by Derby Discussion <de...@db.apache.org>.
Silly question... How does Derby handle temp tables?

As an example... IDS (Informix) would allow you to create temp tables on the
fly, index them so you could break the query down in to simpler queries.

You can't do this easily in Oracle because of how they create and maintain
temporary tables. (They really aren't temporary tables.) DB2? Something
similar to Oracle ...

So if you can create the temp table on the fly, just store your inner query
results to a temp table.

HTH

-Mike

> -----Original Message-----
> From: Six Fried Rice [mailto:technet@sixfriedrice.com]
> Sent: Wednesday, May 14, 2008 10:42 PM
> To: Derby Discussion
> Subject: Re: top N reporting with derby
> 
> On May 14, 2008, at 8:09 PM, Daniel Noll wrote:
> 
> > This is quite interesting.  Is there also a way to do this if you
> > need to join
> > to another table, such that it selects rows 100 to 200 of the first
> > table
> > before doing the join to the second?
> 
> Sorry for the confusion. My post was completely bogus. That's what I
> get for going from memory. Matt is completely correct: it seems you
> need a sub select to get a row number in a where clause, and you can't
> use order by in a sub select. Bummer really. I'm not sure *why* my
> query doesn't work, but it doesn't. I should say, though, that I'm a
> total derby newbie so it probably makes great sense.
> 
> Geoff



Re: top N reporting with derby

Posted by Six Fried Rice <te...@sixfriedrice.com>.
On May 14, 2008, at 8:09 PM, Daniel Noll wrote:

> This is quite interesting.  Is there also a way to do this if you  
> need to join
> to another table, such that it selects rows 100 to 200 of the first  
> table
> before doing the join to the second?

Sorry for the confusion. My post was completely bogus. That's what I  
get for going from memory. Matt is completely correct: it seems you  
need a sub select to get a row number in a where clause, and you can't  
use order by in a sub select. Bummer really. I'm not sure *why* my  
query doesn't work, but it doesn't. I should say, though, that I'm a  
total derby newbie so it probably makes great sense.

Geoff

Re: top N reporting with derby

Posted by Matt Chambers <ch...@imageworks.com>.
Daniel Noll wrote:
> On Thursday 15 May 2008 01:12:21 Six Fried Rice wrote:
>   
>> You can write your query like this:
>>
>> SELECT ROW_NUMBER() OVER() as rownum, spool_table.* FROM spool_table
>> ORDER BY date_inserted WHERE rownum <= 100
>>
>> And it should do the trick. This requires Derby 10.4.
>>     
>
> This is quite interesting.  Is there also a way to do this if you need to join 
> to another table, such that it selects rows 100 to 200 of the first table 
> before doing the join to the second?
>
> Daniel
>
>   
I wasn't able to get that one to run with 10.4, it complained about the 
placement of the WHERE clause.

-- 
-Matt

Cue3 Development Progress 
<http://twiki.spimageworks.com/twiki/bin/view/Software/CueDevelopment> 
Middle-Tier Homepage 
<http://twiki.spimageworks.com/twiki/bin/view/Software/MiddleTier>

Re: top N reporting with derby

Posted by Daniel Noll <da...@nuix.com>.
On Thursday 15 May 2008 01:12:21 Six Fried Rice wrote:
> You can write your query like this:
>
> SELECT ROW_NUMBER() OVER() as rownum, spool_table.* FROM spool_table
> ORDER BY date_inserted WHERE rownum <= 100
>
> And it should do the trick. This requires Derby 10.4.

This is quite interesting.  Is there also a way to do this if you need to join 
to another table, such that it selects rows 100 to 200 of the first table 
before doing the join to the second?

Daniel

Re: top N reporting with derby

Posted by Matt Chambers <ch...@imageworks.com>.
Six Fried Rice wrote:
> On May 13, 2008, at 6:11 PM, Matt Chambers wrote:
>
>> Normally, I would do a SELECT * FROM spool_table ORDER BY 
>> date_inserted ASC LIMIT 100 to grab the oldest entries but Derby 
>> can't do limit and you can't user ORDER BY in sub selects.  So how do 
>> I get my data out of the table in small chunks in the same order it 
>> went in?
>
> Matt:
>
> You can write your query like this:
>
> SELECT ROW_NUMBER() OVER() as rownum, spool_table.* FROM spool_table 
> ORDER BY date_inserted WHERE rownum <= 100
>
> And it should do the trick. This requires Derby 10.4.
>
> Hope it helps :)
>
> Geoff
>
Thanks for response.  I tried that one, was a no go.

-bash-3.1$ ./ij
ij version 10.4

That query gives me:
ERROR 42X01: Syntax error: Encountered "WHERE" at line 1, column 75.


-- 
-Matt

Cue3 Development Progress 
<http://twiki.spimageworks.com/twiki/bin/view/Software/CueDevelopment> 
Middle-Tier Homepage 
<http://twiki.spimageworks.com/twiki/bin/view/Software/MiddleTier>

Re: top N reporting with derby

Posted by Six Fried Rice <te...@sixfriedrice.com>.
On May 13, 2008, at 6:11 PM, Matt Chambers wrote:

> Normally, I would do a SELECT * FROM spool_table ORDER BY  
> date_inserted ASC LIMIT 100 to grab the oldest entries but Derby  
> can't do limit and you can't user ORDER BY in sub selects.  So how  
> do I get my data out of the table in small chunks in the same order  
> it went in?

Matt:

You can write your query like this:

SELECT ROW_NUMBER() OVER() as rownum, spool_table.* FROM spool_table  
ORDER BY date_inserted WHERE rownum <= 100

And it should do the trick. This requires Derby 10.4.

Hope it helps :)

Geoff