Posted to issues@drill.apache.org by "Rahul Challapalli (JIRA)" <ji...@apache.org> on 2016/11/03 18:32:58 UTC
[jira] [Created] (DRILL-4996) Parquet Date auto-correction is not
working in auto-partitioned parquet files generated by drill-1.6
Rahul Challapalli created DRILL-4996:
----------------------------------------
Summary: Parquet Date auto-correction is not working in auto-partitioned parquet files generated by drill-1.6
Key: DRILL-4996
URL: https://issues.apache.org/jira/browse/DRILL-4996
Project: Apache Drill
Issue Type: Bug
Components: Storage - Parquet
Reporter: Rahul Challapalli
Priority: Critical
git.commit.id.abbrev=4ee1d4c
Below are the steps I followed to generate the data:
{code}
1. Generate a parquet file with a date column using Hive 1.2
2. Use Drill 1.6 to create auto-partitioned parquet files partitioned on the date column
{code}
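For reference, step 2 can be sketched as a Drill CTAS statement with a PARTITION BY clause. This is only an illustration of the kind of statement involved: the source path and the target table name below are hypothetical placeholders, not the exact ones used to produce the attached data.

```sql
-- Run from Drill 1.6. Parquet is Drill's default CTAS output format;
-- it is set explicitly here only for clarity.
alter session set `store.format` = 'parquet';

-- Hypothetical names: `/path/to/hive_generated_item` stands for the file
-- written by Hive 1.2, and `item_auto_partition` for the new table.
create table dfs.tmp.`item_auto_partition`
partition by (i_rec_start_date)
as select * from dfs.`/path/to/hive_generated_item`;
```

The partition column has to appear in the select list, which `select *` takes care of here.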
The query below now returns wrong results:
{code}
select i_rec_start_date, i_size from dfs.`/drill/testdata/parquet_date/auto_partition/item_multipart_autorefresh` group by i_rec_start_date, i_size;
+-------------------+--------------+
| i_rec_start_date | i_size |
+-------------------+--------------+
| null | large |
| 366-11-08 | extra large |
| 366-11-08 | medium |
| null | medium |
| 366-11-08 | petite |
| 364-11-07 | medium |
| null | petite |
| 365-11-07 | medium |
| 368-11-07 | economy |
| 365-11-07 | large |
| 365-11-07 | small |
| 366-11-08 | small |
| 365-11-07 | extra large |
| 364-11-07 | N/A |
| 366-11-08 | economy |
| 366-11-08 | large |
| 364-11-07 | small |
| null | small |
| 364-11-07 | large |
| 364-11-07 | extra large |
| 368-11-07 | N/A |
| 368-11-07 | extra large |
| 368-11-07 | large |
| 365-11-07 | petite |
| null | N/A |
| 365-11-07 | economy |
| 364-11-07 | economy |
| 364-11-07 | petite |
| 365-11-07 | N/A |
| 368-11-07 | medium |
| null | extra large |
| 368-11-07 | small |
| 368-11-07 | petite |
| 366-11-08 | N/A |
+-------------------+--------------+
34 rows selected (0.691 seconds)
{code}
However, when I generated the auto-partitioned parquet files using Drill 1.2 instead, the above query returned the correct results.
I attached the required data sets.
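One possible way to cross-check the results is to run the same aggregation against the Hive-generated source file and compare the dates side by side. The path below is a hypothetical placeholder for wherever that file lives; only the column names are taken from the query above.

```sql
-- Hypothetical path: replace with the location of the Hive 1.2 parquet file.
select i_rec_start_date, i_size
from dfs.`/path/to/hive_generated_item`
group by i_rec_start_date, i_size
order by i_rec_start_date, i_size;
```

Adding the same `order by` to the query on the auto-partitioned files would make the two result sets easier to compare.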