Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2022/07/19 03:42:00 UTC

[jira] [Created] (IMPALA-11444) Wrong results in reading wide rows from ORC

Quanlong Huang created IMPALA-11444:
---------------------------------------

             Summary: Wrong results in reading wide rows from ORC
                 Key: IMPALA-11444
                 URL: https://issues.apache.org/jira/browse/IMPALA-11444
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 3.4.1, Impala 3.4.0
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


The bug only exists in the 3.4 branches, which have IMPALA-9228 but are missing IMPALA-9469.

When reading from a wide table with tuple size larger than , the ORC scanner produces wrong results. The issue can be reproduced using the attached CREATE TABLE statement and the ORC file.

{code:bash}
$ bin/impala-shell.sh --quiet -f create-table-512cols.sql
$ bin/impala-shell.sh -B --quiet -q 'show table stats orc_tbl_512cols'
-1	0	0B	NOT CACHED	NOT CACHED	ORC	false	hdfs://localhost:20500/test-warehouse/orc_tbl_512cols
$ hdfs dfs -put widerow_512cols.orc hdfs://localhost:20500/test-warehouse/orc_tbl_512cols
$ bin/impala-shell.sh -q 'refresh orc_tbl_512cols'
{code}

Then run the following query:
{code:bash}
$ bin/impala-shell.sh -B -q "select * from orc_tbl_512cols where col0 = '1'"
{code}
The result should be exactly one row with all values as '1'. However, we get one row with all values as '1024'.
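
With 512 columns the row is hard to verify by eye, so the expectation above can be checked mechanically. The following is a minimal sketch, assuming the tab-delimited output that impala-shell produces with -B; check_row is a hypothetical helper written for this report, not part of Impala.

```shell
# Hypothetical helper (not part of Impala): reads tab-separated rows on stdin
# and succeeds only if there is exactly one row and every field equals $1.
check_row() {
  awk -F'\t' -v want="$1" '
    { rows++; for (i = 1; i <= NF; i++) if ($i != want) bad++ }
    END { exit !(rows == 1 && bad == 0) }'
}
```

It could be used as: bin/impala-shell.sh -B --quiet -q "select * from orc_tbl_512cols where col0 = '1'" | check_row 1 && echo OK || echo WRONG. On an affected build this is expected to report WRONG, since the returned values are '1024'.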


