Posted to issues-all@impala.apache.org by "Gabor Kaszab (Jira)" <ji...@apache.org> on 2021/08/04 13:50:00 UTC

[jira] [Created] (IMPALA-10839) NULL values are displayed on a wrong level for nested structs (ORC)

Gabor Kaszab created IMPALA-10839:
-------------------------------------

             Summary: NULL values are displayed on a wrong level for nested structs (ORC)
                 Key: IMPALA-10839
                 URL: https://issues.apache.org/jira/browse/IMPALA-10839
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
            Reporter: Gabor Kaszab


When querying a nested struct that is not at the top level, the NULL values are displayed at the wrong nesting level. E.g.:
{code:sql}
select id, outer_struct.inner_struct3 from functional_orc_def.complextypes_nested_structs where id >= 4;
{code}
{code:java}
+----+----------------------------+
| id | outer_struct.inner_struct3 |
+----+----------------------------+
| 4  | {"s":{"i":null,"s":null}}  |
| 5  | {"s":null}                 |
+----+----------------------------+
{code}

However, in the first row the expected result is that 's' itself is null rather than its members, i.e. {"s":null}, and in the second row the whole result should be 'NULL'.
For reference, this is what is returned when querying 'outer_struct' instead of 'outer_struct.inner_struct3':
{code:java}
+----+-------------------------------------------------------------------------------------------------------------------------------+
| 4  | {"str":"","inner_struct1":{"str":"somestr2","de":12345.12},"inner_struct2":{"i":1,"str":"string"},"inner_struct3":{"s":null}} |
| 5  | {"str":null,"inner_struct1":null,"inner_struct2":null,"inner_struct3":null}                                                   |
+----+-------------------------------------------------------------------------------------------------------------------------------+
{code}
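
To make the expected output concrete, here is a minimal Python sketch (not Impala code) that takes the 'outer_struct' values for rows 4 and 5 from the output above and renders only the 'inner_struct3' member. The names OUTER_STRUCT and render_top_level are hypothetical, the decimal 'de' is modelled as a float for illustration only, and the formatting merely approximates Impala's JSON-like output.

```python
import json

# outer_struct for rows 4 and 5, copied from the reference output above.
# Assumption: the decimal 'de' is modelled as a float purely for illustration.
OUTER_STRUCT = {
    4: {
        "str": "",
        "inner_struct1": {"str": "somestr2", "de": 12345.12},
        "inner_struct2": {"i": 1, "str": "string"},
        "inner_struct3": {"s": None},
    },
    5: {
        "str": None,
        "inner_struct1": None,
        "inner_struct2": None,
        "inner_struct3": None,
    },
}


def render_top_level(value):
    """Render a selected column: a null at the top level prints as NULL,
    nulls nested inside a struct print as JSON null."""
    if value is None:
        return "NULL"
    return json.dumps(value, separators=(",", ":"))


for row_id, outer in OUTER_STRUCT.items():
    print(row_id, render_top_level(outer["inner_struct3"]))
# 4 {"s":null}
# 5 NULL
```

Under these assumptions, selecting 'inner_struct3' keeps each null at the level where it appears in the 'outer_struct' output, which is the behaviour expected from the query above.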

Note: this issue occurs with the ORC format.
After some digging I found that these incorrect null values are already present in the ORC scanner, where OrcStructReader reads the rows in its ReadValue() and ReadValueBatch() functions.
As a first step, it would be nice to verify that the external ORC reader we use for reading the actual values from the files gives correct results.
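
One way to frame that verification is to check, for each row, at which struct level the null flag is first set. The following Python sketch uses a toy model with one not-null flag per row and per struct level. StructColumn and first_null_level are hypothetical names, and this is neither the ORC library API nor Impala's OrcStructReader. The flags are filled in by hand from the 'outer_struct' output above.

```python
from dataclasses import dataclass, field


@dataclass
class StructColumn:
    """Toy struct column: one not-null flag per row plus named children."""
    name: str
    not_null: list
    children: dict = field(default_factory=dict)


def first_null_level(root, path, row):
    """Return the name of the shallowest struct along `path` that is null
    in `row`, or None if every level on the path is non-null."""
    node = root
    if not node.not_null[row]:
        return node.name
    for name in path:
        node = node.children[name]
        if not node.not_null[row]:
            return node.name
    return None


# Index 0 is the row with id 4, index 1 is the row with id 5.
# Assumption: a child of a null struct is flagged as null as well.
s = StructColumn("s", [False, False])
inner_struct3 = StructColumn("inner_struct3", [True, False], {"s": s})
outer_struct = StructColumn("outer_struct", [True, True],
                            {"inner_struct3": inner_struct3})

path = ["inner_struct3", "s"]
print(first_null_level(outer_struct, path, 0))  # s
print(first_null_level(outer_struct, path, 1))  # inner_struct3
```

If the flags coming from the external reader match this picture, the NULL would need to be reported at 's' for id 4 and at 'inner_struct3' for id 5, and the shift to a deeper level would have to happen later in the scanner.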



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
