Posted to dev@drill.apache.org by "Stefán Baxter (JIRA)" <ji...@apache.org> on 2015/07/21 20:20:04 UTC

[jira] [Created] (DRILL-3533) null values in a sub-structure in Parquet returns unexpected/misleading results

Stefán Baxter created DRILL-3533:
------------------------------------

             Summary: null values in a sub-structure in Parquet returns unexpected/misleading results
                 Key: DRILL-3533
                 URL: https://issues.apache.org/jira/browse/DRILL-3533
             Project: Apache Drill
          Issue Type: Bug
          Components: Query Planning & Optimization
    Affects Versions: 1.1.0
            Reporter: Stefán Baxter
            Assignee: Jinfeng Ni
            Priority: Critical


With this minimal dataset as /tmp/test.json:
{"dimensions":{"adults":"A"}}

select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test.json` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| null    | a       |
+---------+---------+

With the same data written as a Parquet file:
CREATE TABLE dfs.tmp.`/test` AS SELECT * FROM dfs.tmp.`/test.json`;

The same query:
select lower(p.dimensions.budgetLevel) as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;

Returns this:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+

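After some more testing, it appears that this has nothing to do with the string function (trim, lower). Any non-existent nested value is pushed aside: the value of the existing field shows up in the missing field's column, and the existing field's own column becomes null.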

select p.dimensions.budgetLevel as `field1`, lower(p.dimensions.adults) as `field2` from dfs.tmp.`/test/0_0_0.parquet` as p;

also returns:
+---------+---------+
| field1  | field2  |
+---------+---------+
| a       | null    |
+---------+---------+
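
For reference, the per-column behavior the JSON query shows (a missing nested field yields null and does not disturb the neighboring column) can be sketched outside Drill. This is only an illustration in Python, not Drill code; the helper names get_path, lower_or_null and project are hypothetical and do not correspond to anything in Drill.

```python
from typing import Any, Optional, Tuple


def get_path(record: dict, path: str) -> Optional[Any]:
    """Resolve a dotted path; return None when any segment is missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def lower_or_null(value: Optional[Any]) -> Optional[str]:
    """Mimic SQL lower(): null in, null out."""
    return value.lower() if isinstance(value, str) else None


def project(record: dict) -> Tuple[Optional[str], Optional[str]]:
    """Evaluate field1 and field2 independently of each other."""
    field1 = lower_or_null(get_path(record, "dimensions.budgetLevel"))
    field2 = lower_or_null(get_path(record, "dimensions.adults"))
    return field1, field2


# Same single record as /tmp/test.json
row = {"dimensions": {"adults": "A"}}
print(project(row))  # (None, 'a')
```

Each column is resolved on its own, so field1 is null and field2 is "a", matching the JSON result rather than the Parquet one.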



