Posted to dev@parquet.apache.org by "Tianshuo Deng (JIRA)" <ji...@apache.org> on 2015/07/22 00:35:05 UTC

[jira] [Created] (PARQUET-341) Improve write performance with wide schema sparse data

Tianshuo Deng created PARQUET-341:
-------------------------------------

             Summary: Improve write performance with wide schema sparse data
                 Key: PARQUET-341
                 URL: https://issues.apache.org/jira/browse/PARQUET-341
             Project: Parquet
          Issue Type: Improvement
            Reporter: Tianshuo Deng
            Assignee: Tianshuo Deng


In the write path, when the data is very sparse, most of the time is spent writing nulls.

Currently, writing nulls follows the same code path as writing values: when a group is null, the writer recursively traverses all the leaves beneath it.

When a group is null, every leaf beneath it must be written as null with the same repetition level and definition level, so we can eliminate the recursive calls used to reach the leaves.
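
To illustrate the idea, here is a minimal sketch of one way the recursion could be avoided: collect the leaves under each group once, then loop over that flat list whenever the group is null. This is a toy model, not the parquet-mr code. The class NullWriteSketch, the Node type, and the methods cacheLeaves, writeNullRecursive and writeNullCached are all hypothetical names, and caching the leaf list per group is an assumption about how the improvement might be done rather than something stated in this issue.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model only; these are NOT parquet-mr classes. */
final class NullWriteSketch {

    /** A schema node: a leaf column (no children) or a group (with children). */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        // Flat list of leaves beneath this node, filled once by cacheLeaves().
        final List<Node> leaves = new ArrayList<>();
        // Leaves only: the {repetitionLevel, definitionLevel} pair of each null written.
        final List<int[]> nulls = new ArrayList<>();

        Node(String name, Node... kids) {
            this.name = name;
            for (Node kid : kids) {
                children.add(kid);
            }
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    /** One-time pass over the schema: store on every node the leaves beneath it. */
    static void cacheLeaves(Node node) {
        node.leaves.clear();
        if (node.isLeaf()) {
            node.leaves.add(node);
            return;
        }
        for (Node child : node.children) {
            cacheLeaves(child);
            node.leaves.addAll(child.leaves);
        }
    }

    /** Baseline: walk the whole subtree every time a group is null. */
    static void writeNullRecursive(Node node, int repetitionLevel, int definitionLevel) {
        if (node.isLeaf()) {
            node.nulls.add(new int[] {repetitionLevel, definitionLevel});
            return;
        }
        for (Node child : node.children) {
            writeNullRecursive(child, repetitionLevel, definitionLevel);
        }
    }

    /** Sketch of the improvement: one flat loop, same levels for every leaf. */
    static void writeNullCached(Node group, int repetitionLevel, int definitionLevel) {
        for (Node leaf : group.leaves) {
            leaf.nulls.add(new int[] {repetitionLevel, definitionLevel});
        }
    }
}
```

In this sketch cacheLeaves would run once when the schema is set up, and writeNullCached would then replace the recursive walk for each null group. Both methods record the same levels on the same leaves; only the traversal cost per null group differs.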



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)