Posted to issues@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/03/02 06:37:04 UTC
[jira] [Commented] (DRILL-2342) Nullability property of the view created from parquet file is not correct

    [ https://issues.apache.org/jira/browse/DRILL-2342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14342787#comment-14342787 ] 

Victoria Markman commented on DRILL-2342:
-----------------------------------------

It seems to me that bug 2103 illustrates the incorrect nullability property in the view.

> Nullability property of the view created from parquet file is not correct
> -------------------------------------------------------------------------
>
>                 Key: DRILL-2342
>                 URL: https://issues.apache.org/jira/browse/DRILL-2342
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Metadata
>    Affects Versions: 0.8.0
>            Reporter: Victoria Markman
>            Assignee: Steven Phillips
>
> Here is my t1 table definition:
> {code}
> message root {
>   optional int32 a1;
>   optional binary b1 (UTF8);
>   optional int32 c1 (DATE);
> }
> {code}
> I created a view on top of it:
> {code}
> 0: jdbc:drill:schema=dfs> create view v1 as select cast(a1 as int), cast(b1 as varchar(10)), cast(c1 as date) from t1;
> +------------+------------+
> |     ok     |  summary   |
> +------------+------------+
> | true       | View 'v1' created successfully in 'dfs.aggregation' schema |
> +------------+------------+
> 1 row selected (0.096 seconds)
> {code}
> IS_NULLABLE says 'NO', which is incorrect.
> {code}
> 0: jdbc:drill:schema=dfs> describe v1;
> +-------------+------------+-------------+
> | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
> +-------------+------------+-------------+
> | EXPR$0      | INTEGER    | NO          |
> | EXPR$1      | VARCHAR    | NO          |
> | EXPR$2      | DATE       | NO          |
> +-------------+------------+-------------+
> 3 rows selected (0.067 seconds)
> {code}
> This is potentially dangerous: if Calcite later added an optimization that drops an "is null" predicate whenever the column is reported as not nullable, the query "select * from v1 where x is null" would return an incorrect result.
> {code}
> 0: jdbc:drill:schema=dfs> explain plan for select * from v1 where z is null;
> +------------+------------+
> |    text    |    json    |
> +------------+------------+
> | 00-00    Screen
> 00-01      Project(x=[$0], y=[$1], z=[$2])
> 00-02        SelectionVectorRemover
> 00-03          Filter(condition=[IS NULL($2)])
> 00-04            Project(x=[CAST($2):ANY NOT NULL], y=[CAST($1):ANY NOT NULL], z=[CAST($0):ANY NOT NULL])
> 00-05              Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t1]], selectionRoot=/aggregation/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
> {code}
> It seems to me that column properties in views should always be nullable.
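
The risk described in the issue can be sketched with a small toy model. This is not Drill or Calcite code: `Column`, `is_null_filter`, and the `trust_metadata` flag are hypothetical names invented for illustration, and the optimization shown is the one the issue speculates about, not one known to exist. The sketch only shows why folding "is null" to false based on a wrong NOT NULL property would silently drop rows.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

Row = Dict[str, Optional[int]]


@dataclass(frozen=True)
class Column:
    """Hypothetical column metadata, standing in for IS_NULLABLE."""
    name: str
    nullable: bool


def is_null_filter(rows: List[Row], column: Column, trust_metadata: bool) -> List[Row]:
    """Evaluate `WHERE <column> IS NULL` over the given rows.

    With trust_metadata=True, the predicate is folded to FALSE whenever the
    metadata claims the column is NOT NULL, so the data is never examined.
    """
    if trust_metadata and not column.nullable:
        return []
    return [row for row in rows if row[column.name] is None]


# The underlying data contains a NULL, as an `optional` parquet field may.
rows: List[Row] = [{"z": 1}, {"z": None}, {"z": 3}]

# Metadata that wrongly reports the view column as NOT NULL.
wrong_metadata = Column("z", nullable=False)

evaluated = is_null_filter(rows, wrong_metadata, trust_metadata=False)
folded = is_null_filter(rows, wrong_metadata, trust_metadata=True)
```

In this sketch `evaluated` holds the row whose `z` is null, while `folded` is empty because the predicate was removed on the strength of the incorrect metadata. Reporting the column as nullable makes both paths agree.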
