You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@ignite.apache.org by "Dmitriy Setrakyan (JIRA)" <ji...@apache.org> on 2017/11/13 08:47:00 UTC

[jira] [Assigned] (IGNITE-6865) Wrong map query build when using group by in both the outer and inner queries

     [ https://issues.apache.org/jira/browse/IGNITE-6865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy Setrakyan reassigned IGNITE-6865:
-----------------------------------------

    Assignee: Vladimir Ozerov

> Wrong map query build when using group by in both the outer and inner queries
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-6865
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6865
>             Project: Ignite
>          Issue Type: Bug
>      Security Level: Public(Viewable by anyone) 
>          Components: sql
>    Affects Versions: 2.3
>            Reporter: Alin Andrei Corodescu
>            Assignee: Vladimir Ozerov
>             Fix For: 2.4
>
>
> The Ignite SQL engine builds an incorrect SQL query for the "map" part of a distributed query when the select statements contains 2 nested group by's. I initiated a discussion on the ignite users mailing list : [http://apache-ignite-users.70518.x6.nabble.com/Incorrect-map-query-built-when-joining-with-a-subquery-with-group-by-statement-td17784.html]
> To reproduce the error:
> Consider a simple table with only one column: Persons(name), and two Ignite nodes, each containing 2 rows for this table, for example :
> Node 1 :
> {code}
> { p1 : name = "A"; p2 : name = "B"}
> {code}
> Node 2 :
> {code}
> { p3 : name = "A"; p4 : name = "B"}
> {code}
> Given the query :
> {code}
> SELECT t1.name, count(1)
> FROM "default".Persons t1
> JOIN (SELECT name from "default".Persons group by name) t2
> on t1.name = t2.name
> group by t1.name
> {code}
> The query won't be executed because a wrong map query is being built :
> {code}
> SELECT
> T1__Z0.NAME AS __C0_0,
> COUNT(1) AS __C0_1
> FROM "default".PERSONS T1__Z0
> /* "default".PERSONS._SCAN */
> {code}
> The map query uses an aggregate function even though the group by has been dropped, thus the query can't be run on any of the nodes.
> I tested with the distributedJoins=true flag, and the behaviour is still the same. When running with collocated=true however, it works (as it is expected since the query is passed directly to the underlying H2 engine), but only retrieves data from the current node (as it is expected) (result = (A,1 ; B,1).
> The workaround I found for this problem is to re-write the query as follows:
> {code}
> SELECT t1.name, count(1)
> FROM (select * from "default".Persons) t1
> JOIN (SELECT name from "default".Persons group by name) t2
> on t1.name = t2.name
> group by t1.name
> {code}
> This form is completely equivalent with the previous one and works as expected, and the data returned is correctly calculated (A, 2 ; B, 2).
> I also found out that the problem doesn't arise when Ignite decides to use indexes instead of a table scan, so the problem seems to be related with table scans.
> Please mind the example given is simply to reproduce the error. It was identified using real production queries with much more complicated structure, but I was able to reproduce it using this simple table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)