Posted to issues@drill.apache.org by "Jinfeng Ni (JIRA)" <ji...@apache.org> on 2014/07/24 20:48:38 UTC

[jira] [Commented] (DRILL-1187) Select * from join does not return required columns

    [ https://issues.apache.org/jira/browse/DRILL-1187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14073513#comment-14073513 ] 

Jinfeng Ni commented on DRILL-1187:
-----------------------------------

Drill is a schema-less execution engine, which makes the * column a challenge to handle. In a schema-based engine, the * column is expanded into a list of regular columns at query planning time, using a metadata store. In a schema-less engine, however, the expansion into regular columns has to be deferred to execution time.

To fix DRILL-1187 and DRILL-1189, we have to figure out a way to expand the * column properly at execution time.
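
As a rough illustration of the difference described above, the following Python sketch contrasts the two expansion strategies. It is not Drill's implementation: the function names, the dictionary-based "catalog", and the (alias, column) row layout are all assumptions made up for this example. The point is only that plan-time expansion needs a metadata store, while execution-time expansion has to discover the columns from the rows produced by the join.

```python
# Illustrative sketch only. None of these names come from Drill's code base;
# the catalog dict and the (alias, column) row keys are hypothetical.

def expand_star_at_plan_time(select_list, catalog, tables):
    """Schema-based engine: a metadata store (catalog) lists each table's columns,
    so * and alias.* can be rewritten before any data is read."""
    expanded = []
    for item in select_list:
        if item == "*":
            # Expand to every column of every table in the FROM clause.
            for alias, table in tables.items():
                expanded.extend(f"{alias}.{col}" for col in catalog[table])
        elif item.endswith(".*"):
            # Expand to the columns of the one table behind this alias.
            alias = item[:-2]
            expanded.extend(f"{alias}.{col}" for col in catalog[tables[alias]])
        else:
            expanded.append(item)
    return expanded


def expand_star_at_execution_time(select_list, joined_row):
    """Schema-less engine: columns are only known once a row arrives.

    joined_row maps (alias, column) -> value for one row coming out of the join.
    """
    out = {}
    for item in select_list:
        if item == "*":
            aliases = None              # None means: keep columns from every input
        elif item.endswith(".*"):
            aliases = {item[:-2]}       # keep only columns from this input
        else:
            alias, col = item.split(".")
            out[item] = joined_row[(alias, col)]
            continue
        for (alias, col), value in joined_row.items():
            if aliases is None or alias in aliases:
                out[f"{alias}.{col}"] = value
    return out
```

With such a scheme, a query selecting r.* over the join would keep only the columns whose alias is r, so the join key from the other side (n_regionkey) would not leak into the output.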


> Select * from join does not return required columns
> ---------------------------------------------------
>
>                 Key: DRILL-1187
>                 URL: https://issues.apache.org/jira/browse/DRILL-1187
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Jinfeng Ni
>
> 1.  Select * from join does not return required columns   
>  select * from cp.`tpch/nation.parquet` n, cp.`tpch/region.parquet` r where n.n_regionkey = r.r_regionkey limit 2;
> +------------+-------------+-------------+------------+-------------+------------+
> |     *0     | r_regionkey | N_NATIONKEY |   N_NAME   | N_REGIONKEY | N_COMMENT  |
> +------------+-------------+-------------+------------+-------------+------------+
> | null       | 0           | 0           | [B@6c7e90b9 | 0           | [B@694f9954 |
> | null       | 1           | 1           | [B@68db8fcf | 1           | [B@26be94d1 |
> +------------+-------------+-------------+------------+-------------+------------+
> In this case, only the join key column from the RHS appears in the result. Also, the output contains a *0 column.
> 2.  Select T1.* from join returns unwanted columns
> select n.* from cp.`tpch/nation.parquet` n, cp.`tpch/region.parquet` r where n.n_regionkey = r.r_regionkey limit 2;
> +-------------+-------------+------------+-------------+------------+
> | R_REGIONKEY | N_NATIONKEY |   N_NAME   | N_REGIONKEY | N_COMMENT  |
> +-------------+-------------+------------+-------------+------------+
> | 0           | 0           | [B@3328c286 | 0           | [B@38fb05a7 |
> | 1           | 1           | [B@6830342a | 1           | [B@34b0e6d6 |
> +-------------+-------------+------------+-------------+------------+
> select r.* from cp.`tpch/nation.parquet` n, cp.`tpch/region.parquet` r where n.n_regionkey = r.r_regionkey limit 2;
> +-------------+------------+------------+-------------+
> | R_REGIONKEY |   R_NAME   | R_COMMENT  | N_REGIONKEY |
> +-------------+------------+------------+-------------+
> | 0           | [B@997b44c | [B@46bdee7f | 0           |
> | 1           | [B@5f74f821 | [B@784e6f7c | 1           |
> +-------------+------------+------------+-------------+
>  
> In this case, the join key column from the other table appears in the output, which is not correct. 


