Posted to issues@drill.apache.org by "James Turton (Jira)" <ji...@apache.org> on 2022/07/19 07:29:00 UTC

[jira] [Closed] (DRILL-7399) Querying parquet file with boolean data type return wrong results

     [ https://issues.apache.org/jira/browse/DRILL-7399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Turton closed DRILL-7399.
-------------------------------
      Assignee: James Turton
    Resolution: Fixed

Based on my testing with the attached file, 1.19.0 still reproduces the bug and 1.20.1 does not, so this was fixed in either 1.20.0 or 1.20.1. My transcript from 1.20.1 follows; both Parquet readers return the same counts.

 
{code:java}
Apache Drill 1.20.1
"Say hello to my little Drill."
apache drill> alter session set `store.parquet.use_new_reader` = true;
ok       true
summary  store.parquet.use_new_reader updated.
1 row selected (0.51 seconds)
apache drill> SELECT press_run_1, count() FROM dfs.tmp.`newrule22_3_1.parquet` group by press_run_1;
press_run_1  false
EXPR$1       11032
press_run_1  true
EXPR$1       9421
2 rows selected (2.488 seconds)
apache drill> alter session set `store.parquet.use_new_reader` = false;
ok       true
summary  store.parquet.use_new_reader updated.
1 row selected (0.083 seconds)
apache drill> SELECT press_run_1, count() FROM dfs.tmp.`newrule22_3_1.parquet` group by press_run_1;
press_run_1  false
EXPR$1       11032
press_run_1  true
EXPR$1       9421
2 rows selected (0.424 seconds)
{code}
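
The transcript above compares aggregate counts only. To re-check the single record named in the original report, a query along the following lines could be used. This is a sketch, not part of the recorded session: it assumes the attachment sits under dfs.tmp as in the transcript, and no output is shown because none was captured. Per the report, the expected value of press_run_1 for this record is true.

{code:sql}
-- Sketch only: re-check the record from the original report.
-- Assumes the attached file is located under dfs.tmp, as in the transcript above.
SELECT cycle_id, press_run_1
FROM dfs.tmp.`newrule22_3_1.parquet`
WHERE cycle_id = 23435119;
{code}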
 

> Querying parquet file with boolean data type return wrong results
> -----------------------------------------------------------------
>
>                 Key: DRILL-7399
>                 URL: https://issues.apache.org/jira/browse/DRILL-7399
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Storage - Parquet
>    Affects Versions: 1.16.0
>            Reporter: Fabian Barreiro
>            Assignee: James Turton
>            Priority: Critical
>             Fix For: 1.20.1
>
>         Attachments: newrule22_3_1.parquet
>
>
> The following query returns a wrong value for the boolean column press_run_1:
>  SELECT * FROM dfs.root.`/tmp/newrule22_3_1.parquet` WHERE cycle_id=23435119
> The query returns press_run_1 = 'false',
> but the Parquet file contains press_run_1 = 'true' for this record.
> Many more records with this problem turn up when trying different selects.
> ATTACHED:  newrule22_3_1.parquet file.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)