Posted to dev@drill.apache.org by "Victoria Markman (JIRA)" <ji...@apache.org> on 2015/02/28 00:25:05 UTC
[jira] [Created] (DRILL-2342) Nullability property of the view created from parquet file is not correct
Victoria Markman created DRILL-2342:
---------------------------------------
Summary: Nullability property of the view created from parquet file is not correct
Key: DRILL-2342
URL: https://issues.apache.org/jira/browse/DRILL-2342
Project: Apache Drill
Issue Type: Bug
Components: Metadata
Affects Versions: 0.8.0
Reporter: Victoria Markman
Assignee: Steven Phillips
Here is the Parquet schema of my table t1:
{code}
message root {
optional int32 a1;
optional binary b1 (UTF8);
optional int32 c1 (DATE);
}
{code}
I created a view on top of it:
{code}
0: jdbc:drill:schema=dfs> create view v1 as select cast(a1 as int), cast(b1 as varchar(10)), cast(c1 as date) from t1;
+------------+------------+
| ok | summary |
+------------+------------+
| true | View 'v1' created successfully in 'dfs.aggregation' schema |
+------------+------------+
1 row selected (0.096 seconds)
{code}
DESCRIBE reports IS_NULLABLE as 'NO' for every column of the view, which is incorrect because all three fields in t1 are optional:
{code}
0: jdbc:drill:schema=dfs> describe v1;
+-------------+------------+-------------+
| COLUMN_NAME | DATA_TYPE | IS_NULLABLE |
+-------------+------------+-------------+
| EXPR$0 | INTEGER | NO |
| EXPR$1 | VARCHAR | NO |
| EXPR$2 | DATE | NO |
+-------------+------------+-------------+
3 rows selected (0.067 seconds)
{code}
This is potentially dangerous. If Calcite were to start relying on this property, for example with an optimization that drops an "is null" predicate whenever the column is declared not nullable, a query such as "select * from v1 where x is null" would return incorrect results. The plan already types the casts as NOT NULL:
{code}
0: jdbc:drill:schema=dfs> explain plan for select * from v1 where z is null;
+------------+------------+
| text | json |
+------------+------------+
| 00-00 Screen
00-01 Project(x=[$0], y=[$1], z=[$2])
00-02 SelectionVectorRemover
00-03 Filter(condition=[IS NULL($2)])
00-04 Project(x=[CAST($2):ANY NOT NULL], y=[CAST($1):ANY NOT NULL], z=[CAST($0):ANY NOT NULL])
00-05 Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:/aggregation/t1]], selectionRoot=/aggregation/t1, numFiles=1, columns=[`a1`, `b1`, `c1`]]])
{code}
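The hazard described above can be illustrated with a toy model. This is only a sketch and not Calcite or Drill code: the sample rows, the two metadata dictionaries, and the helper `is_null_filter` are all hypothetical, and the "optimization" is the imagined one from this report (folding an "is null" predicate to false when the column is declared not nullable).

```python
# Toy model of the hazard; this is NOT Calcite or Drill code.
# Made-up rows of the view, where None stands for SQL NULL.
rows = [
    {"x": 1, "y": "a", "z": None},
    {"x": None, "y": "b", "z": "2015-02-27"},
    {"x": 3, "y": None, "z": "2015-02-28"},
]

# Metadata as DESCRIBE reports it (every column NOT NULL) versus what
# the optional Parquet fields actually allow.
reported_nullable = {"x": False, "y": False, "z": False}
actual_nullable = {"x": True, "y": True, "z": True}


def is_null_filter(rows, column, nullable):
    """Evaluate `WHERE column IS NULL`, folding the predicate to FALSE
    when the metadata claims the column can never be NULL."""
    if not nullable[column]:
        return []  # hypothetical optimization: predicate is always false
    return [r for r in rows if r[column] is None]


correct = is_null_filter(rows, "z", actual_nullable)
wrong = is_null_filter(rows, "z", reported_nullable)
print(len(correct), len(wrong))
```

With the correct metadata the filter finds the row whose z is NULL. With the metadata that DESCRIBE reports, the same query silently returns nothing.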
It seems to me that columns in views should always be reported as nullable.