You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@calcite.apache.org by "Paul Jackson (JIRA)" <ji...@apache.org> on 2018/01/10 17:57:00 UTC
[jira] [Created] (CALCITE-2129) Aggregating function from subquery
not projected properly
Paul Jackson created CALCITE-2129:
-------------------------------------
Summary: Aggregating function from subquery not projected properly
Key: CALCITE-2129
URL: https://issues.apache.org/jira/browse/CALCITE-2129
Project: Calcite
Issue Type: Bug
Components: core
Affects Versions: 1.15.0
Reporter: Paul Jackson
Assignee: Julian Hyde
Starting with this DDL:
{code:SQL}
create table friends (
id int not null,
friend_id int,
primary key (id),
foreign key (friend_id) references friends (id)
);
insert into friends values
(210, null),
(209, 210),
(202, 209),
(208, 202),
(207, 209),
(203, 207),
(201, null),
(204, null),
(205, null),
(206, 209);
{code}
This query gives the highest id of friends that share friend_id for each friend:
{code:SQL}
SELECT id, friend_id,
(SELECT max(f2.id)
FROM friends AS f2
WHERE f2.friend_id = f1.friend_id) AS foaf_id
FROM friends AS f1
id friend foaf
201
202 209 207
203 207 203
204
205
206 209 207
207 209 207
208 202 208
209 210 209
210
{code}
I convert this query to a RelNode and then converted it back to SQL:
{code:Java}
Planner aPlanner = Frameworks.getPlanner(aConfig);
SqlNode aQuery = aPlanner.parse(theSql);
aQuery = aPlanner.validate(aQuery);
RelNode aRelNode = aPlanner.rel(aQuery).project();
RelToSqlConverter aSqlConverter = new RelToSqlConverter(aSqlDialect);
SqlNode aSqlNode = aSqlConverter.visitChild(0, aRelNode).asStatement();
{code}
This gives the following plan and SQL:
{code}
LogicalProject(id=[$0], friend_id=[$1], foaf_id=[$2])
LogicalProject(id=[$0], friend_id=[$1], EXPR$0=[$3])
LogicalJoin(condition=[=($1, $2)], joinType=[left])
JdbcTableScan(table=[[stardog, friends]])
LogicalAggregate(group=[{0}], EXPR$0=[MAX($1)])
LogicalProject(friend_id=[$1], id=[$0])
LogicalProject(id=[$0], friend_id=[$1])
LogicalFilter(condition=[IS NOT NULL($1)])
JdbcTableScan(table=[[stardog, friends]])
SELECT `friends`.`id`, `friends`.`friend_id`, MAX(`id`)
FROM `stardog`.`friends`
LEFT JOIN (SELECT `friend_id`, MAX(`id`)
FROM `stardog`.`friends`
WHERE `friend_id` IS NOT NULL
GROUP BY `friend_id`) AS `t1` ON `friends`.`friend_id` = `t1`.`friend_id`
{code}
This is a bad conversion. The {{MAX(`id`)}} should not be repeated in the outer select. PostgreSQL will complain that the aggregating function requires a group by. MySQL returns the max id that has a non-null friend (208), that id's friend (202), and the max id of all rows (210):
{code}
id friend MAX(`id`)
208 202 210
{code}
I think the correct SQL should be:
{code:SQL}
SELECT `friends`.`id`, `friends`.`friend_id`, foaf_id
FROM `stardog`.`friends`
LEFT JOIN (SELECT `friend_id`, MAX(`id`) AS foaf_id
FROM `stardog`.`friends`
WHERE `friend_id` IS NOT NULL
GROUP BY `friend_id`) AS `t1` ON `friends`.`friend_id` = `t1`.`friend_id`
{code}
There is code in {{SqlImplementor}} and {{RelToSqlConverter}} that uses an {{ordinalMap}} to track what functions are aliases as what identifiers. When the SQL is generated, the identifier is replaced with the function. If all that code is removed, this works as expected (using {{EXPR$0}} as the alias rather than {{foaf_id}}).
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)