Posted to issues@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/11/03 18:32:58 UTC

[jira] [Created] (DRILL-4996) Parquet Date auto-correction is not working in auto-partitioned parquet files generated by drill-1.6

Rahul Challapalli created DRILL-4996:
----------------------------------------

             Summary: Parquet Date auto-correction is not working in auto-partitioned parquet files generated by drill-1.6
                 Key: DRILL-4996
                 URL: https://issues.apache.org/jira/browse/DRILL-4996
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Parquet
            Reporter: Rahul Challapalli
            Priority: Critical


git.commit.id.abbrev=4ee1d4c

I followed these steps to generate the data:
{code}
1. Generate a parquet file with a date column using Hive 1.2
2. Use Drill 1.6 to create auto-partitioned parquet files, partitioned on the date column
{code}
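
For reference, step 2 could look roughly like the CTAS statement below. This is a sketch under assumptions, not the exact statement that was run: the `dfs.tmp` workspace, the table name `item_autopart`, and the source path are hypothetical placeholders. Only the column names come from the query in this report.
{code}
-- Sketch only: workspace, table name, and source path are hypothetical.
-- Intended to be run from Drill 1.6 against the Hive-generated parquet file.
ALTER SESSION SET `store.format` = 'parquet';

-- PARTITION BY makes Drill write auto-partitioned parquet files,
-- with the output split on the values of the date column.
CREATE TABLE dfs.tmp.`item_autopart`
PARTITION BY (i_rec_start_date)
AS
SELECT i_rec_start_date, i_size
FROM dfs.`/path/to/hive_generated/item`;
{code}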

The following query now returns wrong results:
{code}
select i_rec_start_date, i_size from dfs.`/drill/testdata/parquet_date/auto_partition/item_multipart_autorefresh` group by i_rec_start_date, i_size;
+-------------------+--------------+
| i_rec_start_date  |    i_size    |
+-------------------+--------------+
| null              | large        |
| 366-11-08        | extra large  |
| 366-11-08        | medium       |
| null              | medium       |
| 366-11-08        | petite       |
| 364-11-07        | medium       |
| null              | petite       |
| 365-11-07        | medium       |
| 368-11-07        | economy      |
| 365-11-07        | large        |
| 365-11-07        | small        |
| 366-11-08        | small        |
| 365-11-07        | extra large  |
| 364-11-07        | N/A          |
| 366-11-08        | economy      |
| 366-11-08        | large        |
| 364-11-07        | small        |
| null              | small        |
| 364-11-07        | large        |
| 364-11-07        | extra large  |
| 368-11-07        | N/A          |
| 368-11-07        | extra large  |
| 368-11-07        | large        |
| 365-11-07        | petite       |
| null              | N/A          |
| 365-11-07        | economy      |
| 364-11-07        | economy      |
| 364-11-07        | petite       |
| 365-11-07        | N/A          |
| 368-11-07        | medium       |
| null              | extra large  |
| 368-11-07        | small        |
| 368-11-07        | petite       |
| 366-11-08        | N/A          |
+-------------------+--------------+
34 rows selected (0.691 seconds)
{code}

However, when I generated the auto-partitioned parquet files using Drill 1.2 instead, the same query returned the correct results.

I have attached the required data sets.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)