Posted to issues@impala.apache.org by "Csaba Ringhofer (Jira)" <ji...@apache.org> on 2022/02/17 19:49:00 UTC

[jira] [Resolved] (IMPALA-2272) Parquet scanner always materializes NULL for empty collections

     [ https://issues.apache.org/jira/browse/IMPALA-2272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Csaba Ringhofer resolved IMPALA-2272.
-------------------------------------
    Fix Version/s: Impala 4.1.0
       Resolution: Fixed

Resolved by IMPALA-9498

> Parquet scanner always materializes NULL for empty collections
> --------------------------------------------------------------
>
>                 Key: IMPALA-2272
>                 URL: https://issues.apache.org/jira/browse/IMPALA-2272
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>    Affects Versions: Impala 2.3.0
>            Reporter: Skye Wanderman-Milne
>            Priority: Minor
>              Labels: complextype, nested_types
>             Fix For: Impala 4.1.0
>
>
> Currently the Parquet scanner will always materialize a NULL slot for an empty collection, rather than an empty ArrayValue/CollectionValue. It is not currently possible to write a query that exposes this bug (i.e. it's not possible to write a query that distinguishes between an empty and NULL collection), but it will be once we add expressions that take collections as input (e.g. "select array_column is null from tbl").
> We have this bug because the Parquet scanner only looks at the repeated field of an array, not the containing group field. To fix it, the scanner will have to consider the definition and repetition (def/rep) levels of both.
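
To illustrate the distinction described above, here is a minimal sketch of how definition levels separate a NULL collection from an empty one. It is not Impala's scanner code. It assumes the standard Parquet three-level list layout (an optional group containing a repeated group containing an optional element), and the name assemble and the constants are hypothetical, chosen only for this example.

```python
from typing import List, Optional

# Assumed schema (standard Parquet 3-level list encoding):
#   optional group arr (LIST) {      <- def level >= 1: arr itself is not NULL
#     repeated group list {          <- def level >= 2: an element slot exists
#       optional int32 element;      <- def level == 3: the element is not NULL
#     }
#   }
ARR_DEFINED = 1      # hypothetical names for the three thresholds
ELEMENT_SLOT = 2
ELEMENT_DEFINED = 3


def assemble(def_levels: List[int],
             rep_levels: List[int],
             values: List[int]) -> List[Optional[List[Optional[int]]]]:
    """Rebuild one array column from its def/rep levels and non-NULL values."""
    rows: List[Optional[List[Optional[int]]]] = []
    value_iter = iter(values)
    for d, r in zip(def_levels, rep_levels):
        if r == 0:
            # Repetition level 0 starts a new row.
            if d < ARR_DEFINED:
                rows.append(None)   # containing group is NULL
                continue
            rows.append([])         # containing group exists
            if d < ELEMENT_SLOT:
                continue            # ...but holds no elements: empty collection
        # An element slot exists; it carries a value only at the max def level.
        rows[-1].append(next(value_iter) if d == ELEMENT_DEFINED else None)
    return rows


# Rows encoded below: NULL, [], [1, NULL], [2]
result = assemble(def_levels=[0, 1, 3, 2, 3],
                  rep_levels=[0, 0, 0, 1, 0],
                  values=[1, 2])
print(result)
```

A reader that only checks whether the repeated field is present (def level >= 2 in this layout) cannot tell the first two rows apart, which matches the behaviour reported in this issue.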



--
This message was sent by Atlassian Jira
(v8.20.1#820001)