You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@asterixdb.apache.org by "Till (JIRA)" <ji...@apache.org> on 2018/03/14 23:47:00 UTC
[jira] [Updated] (ASTERIXDB-2212) Variable binding/tracking bug in complex GROUP BY query

     [ https://issues.apache.org/jira/browse/ASTERIXDB-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till updated ASTERIXDB-2212:
----------------------------
    Component/s: COMP - Compiler

> Variable binding/tracking bug in complex GROUP BY query
> -------------------------------------------------------
>
>                 Key: ASTERIXDB-2212
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-2212
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: COMP - Compiler
>            Reporter: Michael J. Carey
>            Assignee: Dmitry Lychagin
>            Priority: Major
>         Attachments: mondial.adm
>
>
> Try the following sequence of steps (after getting the 'mondial.adm' data file and fixing its path for your machine) and look at the query result - then un-comment out the GROUP BY near the end and look again. You will see that the SQL++ compiler apparently gets confused about which variable/field is which?  Weird bug!  (Reported by U-Wash.)
> DROP DATAVERSE hw5 IF EXISTS;
> CREATE DATAVERSE hw5;
> USE hw5;
> CREATE TYPE worldType AS {auto_id:uuid };
> CREATE DATASET world(worldType)  PRIMARY KEY auto_id AUTOGENERATED;
> LOAD DATASET world USING localfs
>         (("path"="127.0.0.1:///Users/mikejcarey/uwash/mondial.adm"),("format"="adm"));
> WITH instr AS (select distinct z.`#text` AS ethnic_group, floor(sum((float(z.`-percentage`)/ 100 * float(y.population)))) AS total_population, count(*) AS num_countries
> FROM world x, x.mondial.country y, 
>      (CASE WHEN is_array(y.ethnicgroups) THEN y.ethnicgroups ELSE [y.ethnicgroups] END) z
> WHERE y.ethnicgroups IS NOT MISSING
> GROUP BY ethnic_group
> ORDER BY total_population DESC
> ),
> stud AS (WITH g AS
> (SELECT a.`#text` as ethnicgroup, SUM((float(a.`-percentage`)/100 * float(y.population))) as numEth
> FROM world x, x.mondial.country y, 
> CASE 
> WHEN y.ethnicgroups is missing then [[]] 
> WHEN is_object(y.ethnicgroups) then [[y.ethnicgroups]]
> ELSE [y.ethnicgroups] END z
> UNNEST z a
> GROUP BY ethnicgroup),
> h as (SELECT a.`#text` as ethnicgroup, COUNT(y.name) as count
> FROM world x, x.mondial.country y, 
> CASE 
> WHEN y.ethnicgroups is missing then [[]] 
> WHEN is_object(y.ethnicgroups) then [[y.ethnicgroups]]
> ELSE [y.ethnicgroups] END z
> UNNEST z a
> GROUP BY ethnicgroup)
> SELECT g.ethnicgroup AS ethnic_group, h.count AS num_countries, floor(g.numEth) AS total_population
> FROM g, h
> WHERE g.ethnicgroup = h.ethnicgroup
> ORDER BY total_population DESC
> )
> SELECT subq.ethnic_group, subq.num_countries, subq.total_population
> FROM (select total_population, 
>    num_countries, ethnic_group FROM instr
> UNION ALL
>    select total_population, num_countries, ethnic_group FROM stud) AS subq
> -- GROUP BY subq.ethnic_group, subq.num_countries, subq.total_population
> ORDER BY subq.ethnic_group;



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)