You are viewing a plain text version of this content. The canonical link for it is here.
Posted to derby-dev@db.apache.org by Army <qo...@sbcglobal.net> on 2005/05/19 23:53:51 UTC

Question with binding...

Hi,

This message is asking for help debugging some specific changes that I've 
written for Derby, so if you're not interested in my own personal coding 
problems, feel free to stop reading now ;)

I'm trying to add language support to Derby so that it allows aggregate 
functions to be used in the WHERE iff they are a) in a subquery of a HAVING 
clause and b) they correlate to an outer query block.

As a first step toward part "a", I made what seem to me like reasonable changes 
to the SelectNode class so that Derby will allow aggregates in the WHERE of a 
HAVING subquery.  At a high level, what my changes do (or at least, what I want 
them to do) is as follows.

Assume I have a query of the form:

<OUTER_SELECT> <GROUP_BY> having <outer_col_a> >= <SUBSELECT>

and <SUBSELECT> is something like

select x from t1 where min(<outer_col_b>) > 0

and <outer_col_b> is a column that is both in the <OUTER_SELECT> list and in the 
GROUP_BY list.  That said, the changes I've made would rewrite the <SUBSELECT> 
to represent the following:

select x, min(<outer_col_b>) SQLCol1 from t1 where SQLCol1 > 0

(Note that this is what the rewritten query _represents_--this in itself isn't 
necessary valid SQL).

With this new query I bind SQLCol1 without any problems, but at execution time, 
the result is an NPE from within the generated code for the query.  I poked 
through the derby.log file and rewrote the query in a bunch of different ways, 
and I now _think_ problem is that the DataValueDescriptor for SQLCol1 in the 
WHERE clause is null at row retrieval time, and that's causing the NPE.

Are there any language experts out there who can look at the attached patch and 
offer input as to why the DataValueDescriptor for the aggregate result column 
would end up null on a call to retrieve the results?  I've made sure that the 
aggregate is bound at compile time (and I've tried binding it in lots of 
different ways) but it looks like I'm missing something somewhere.

I've attached a small patch with minimal changes to reproduce the problem.  I've 
re-written this patch too many times to count, some times in ways that were very 
simple (like the one attached to this patch) and some times in ways that were 
significantly more complex--but in all cases, I keep coming back to this NPE 
problem.

Using the attached patch, the NPE can be reproduced with either of the following 
two scenarios:

----

1) A "realistic" scenario that actually has some meaning (and thus is more 
complicated);

ij> create table team (teamid int not null primary key, budget double);
ij> create table players (id int not null unique, teamid int references team, 
salary double);
ij> insert into team values (1, 100), (2, 200), (3, 300), (4, 400);
ij> insert into players values (1, 1, 50), (2, 1, 75), (3, 1, 60);
3 rows inserted/updated/deleted
ij> insert into players values (4, 2, 10), (5, 2, 25), (6, 2, 30);
3 rows inserted/updated/deleted
ij> insert into players values (7, 3, 25), (8, 3, 30), (9, 3, 20), (10, 3, 45);

-- Now, look for all players such that: 1) the player is on a team with an 
average salary that is less than half of the team's budget, AND 2) the player is 
the lowest-paid player on his/her team, AND 3) the player is making less than 
"20" (i.e. maybe these player should get a raise! ;).  In such cases, retrieve 
the player's team and his/her salary.

ij> select t.teamid,p.salary from team t, players p where (p.teamid = t.teamid) 
group by t.teamid,p.salary having avg(p.salary) < (select (t2.budget / 2) from 
team t2 where t2.teamid = t.teamid and 20 > min(p.salary));

----

2) A simpler query that doesn't make any logical sense but still reproduces the 
problem:

ij> create table t1 (i int);
ij> create table t2 (j int);
ij> create table t3 (k int);
ij> insert into t1 values 2, 4, 6;
ij> insert into t2 values 1, 3, 7;
ij> insert into t3 values 1, 2, 3, 4, 5;
ij> select j,k from t2 t2_alias, t3 t3_alias group by j,k having j > (select 
min(2) from t1 where min(t3_alias.k) > 0);

It's the derby.log file for this second query that makes me think the 
DataValueDescriptor is null and thus the NPE.

Many thanks to any who has the time to point out what I'm missing,
Army