Posted to notifications@asterixdb.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2023/08/07 02:18:00 UTC

[jira] [Commented] (ASTERIXDB-3234) Incorrect handling of empty arrays in columnar collections

    [ https://issues.apache.org/jira/browse/ASTERIXDB-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17751430#comment-17751430 ] 

ASF subversion and git services commented on ASTERIXDB-3234:
------------------------------------------------------------

Commit 5e195e23a47097a5e1979ed17a49928f6fa6547b in asterixdb's branch refs/heads/master from Wail Alkowaileet
[ https://gitbox.apache.org/repos/asf?p=asterixdb.git;h=5e195e23a4 ]

[ASTERIXDB-3234][STO] Fix handling empty arrays in columnar datasets

- user model changes: no
- storage format changes: yes
- interface changes: no

Details:
Fix the issue of handling empty arrays in columnar datasets

Storage format changes:
- Repeated values will always end with a MISSING value.
  The last MISSING value will be used as an indicator that the
  array itself is present and it will be consumed by the assembler
  and won't be included in the assembled array. In case of an
  empty array, the last MISSING value will be consumed and
  an empty array will be returned.

Change-Id: I220e9e8ede45530ef61656530309c79321dc189c
Reviewed-on: https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/17693
Integration-Tests: Jenkins <je...@fulliautomatix.ics.uci.edu>
Reviewed-by: Murtadha Hubail <mh...@apache.org>
Tested-by: Jenkins <je...@fulliautomatix.ics.uci.edu>
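
The storage format change described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not AsterixDB's actual column encoding or assembler: the names MISSING, ABSENT, encode, and assemble are hypothetical, and a real columnar format would track presence differently (for example, with definition levels). It shows only why a trailing MISSING delimiter lets an empty array be told apart from an absent one.

```python
# Toy model of "repeated values always end with a MISSING value".
# All names here are hypothetical and chosen for illustration only.

MISSING = object()  # delimiter written after the last item of every present array
ABSENT = object()   # marker meaning the array field is not in the record at all


def encode(records, field):
    """Flatten one array-valued field of each record into a single column."""
    column = []
    for record in records:
        if field not in record:
            # The array itself is missing from this record.
            column.append(ABSENT)
        else:
            # Write the items, then the trailing delimiter. An empty array
            # contributes only the delimiter.
            column.extend(record[field])
            column.append(MISSING)
    return column


def assemble(column, field):
    """Rebuild the records from the column, consuming each delimiter."""
    records = []
    items = []
    for entry in column:
        if entry is ABSENT:
            records.append({})
        elif entry is MISSING:
            # The delimiter proves the array is present. It is consumed here
            # and never added to the assembled array.
            records.append({field: items})
            items = []
        else:
            items.append(entry)
    return records


if __name__ == "__main__":
    original = [{"a": [1, 2]}, {"a": []}, {}]
    print(assemble(encode(original, "a"), "a") == original)
```

In this model, an empty array round-trips as an empty array rather than as a record without the field, which is the behavior the issue below reports as broken.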


> Incorrect handling of empty arrays in columnar collections
> ----------------------------------------------------------
>
>                 Key: ASTERIXDB-3234
>                 URL: https://issues.apache.org/jira/browse/ASTERIXDB-3234
>             Project: Apache AsterixDB
>          Issue Type: Bug
>          Components: STO - Storage
>    Affects Versions: 0.9.9
>            Reporter: Wail Y. Alkowaileet
>            Assignee: Wail Y. Alkowaileet
>            Priority: Critical
>             Fix For: 0.9.9
>
>
> Currently, columnar collections do not handle empty arrays correctly.
>  * If an array was always empty and a flush occurred, a NullPointerException (NPE) is thrown
> {noformat}
> java.lang.NullPointerException: Cannot invoke "org.apache.asterix.column.metadata.schema.AbstractSchemaNode.serialize(java.io.DataOutput, org.apache.asterix.column.metadata.PathInfoSerializer)" because "this.item" is null {noformat}
>  * If the array item type is known but the array itself is empty, the array is treated as if it were missing. This is incorrect and produces a result that differs from the row format. Below is the current output for an empty array:
> {noformat}
> Input: {"a": []}
> Output: {} {noformat}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)