Posted to dev@drill.apache.org by "Stefán Baxter (JIRA)" <ji...@apache.org> on 2015/07/21 20:20:04 UTC
[jira] [Created] (DRILL-3533) null values in a sub-structure in Parquet returns unexpected/misleading results
Stefán Baxter created DRILL-3533:
------------------------------------
Summary: null values in a sub-structure in Parquet returns unexpected/misleading results
Key: DRILL-3533
URL: https://issues.apache.org/jira/browse/DRILL-3533
Project: Apache Drill
Issue Type: Bug
Components: Query Planning & Optimization
Affects Versions: 1.1.0
Reporter: Stefán Baxter
Assignee: Jinfeng Ni
Priority: Critical
With this minimal dataset as /tmp/test.json:
{"dimensions":{"adults":"A"}}
select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test.json` as p;
Returns this:
+---------+---------+
| field1 | field2 |
+---------+---------+
| null | a |
+---------+---------+
With the same data written as a Parquet file:
CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;
The same query:
select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
Returns this:
+---------+---------+
| field1 | field2 |
+---------+---------+
| a | null |
+---------+---------+
After some more testing, it appears that this has nothing to do with trim or lower. Any non-existent nested value is pushed aside: its column receives the value of the existing field, and that field's own column is null.
select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;
also returns:
+---------+---------+
| field1 | field2 |
+---------+---------+
| a | null |
+---------+---------+
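For reference, the expected behaviour (the one the JSON query shows) can be modelled outside Drill. The following is only a minimal Python sketch of name-based nested field lookup, not Drill's implementation; the helper names get_path and lower_or_none are hypothetical and exist only for this illustration.

```python
import json


def get_path(record, path):
    """Resolve a dotted path by field name; return None if any step is missing."""
    current = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def lower_or_none(value):
    """Lowercase a string value; propagate null (None) for anything else."""
    return value.lower() if isinstance(value, str) else None


# The same minimal dataset as /tmp/test.json in this report.
record = json.loads('{"dimensions":{"adults":"A"}}')

# Each output column is resolved by its own name, so a missing field
# yields null and does not shift the value of a neighbouring field.
row = {
    "field1": lower_or_none(get_path(record, "dimensions.budgetLevel")),
    "field2": lower_or_none(get_path(record, "dimensions.adults")),
}
print(row)  # {'field1': None, 'field2': 'a'}
```

Under this model, field1 is null and field2 is a for both storage formats, which is what the Parquet query would be expected to return as well.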