Posted to dev@parquet.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/04/23 11:34:00 UTC

[jira] [Commented] (PARQUET-852) Slowly ramp up sizes of byte[] in ByteBasedBitPackingEncoder

    [ https://issues.apache.org/jira/browse/PARQUET-852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16447997#comment-16447997 ] 

ASF GitHub Bot commented on PARQUET-852:
----------------------------------------

gszadovszky opened a new pull request #467: Revert "PARQUET-852: Slowly ramp up sizes of byte[] in ByteBasedBitPackingEncoder"
URL: https://github.com/apache/parquet-mr/pull/467

   This reverts commit d59b32a9120ad40e2a9f6651b680e84dae1747a6.



> Slowly ramp up sizes of byte[] in ByteBasedBitPackingEncoder
> ------------------------------------------------------------
>
>                 Key: PARQUET-852
>                 URL: https://issues.apache.org/jira/browse/PARQUET-852
>             Project: Parquet
>          Issue Type: Improvement
>            Reporter: John Jenkins
>            Priority: Minor
>             Fix For: 1.10.0
>
>
> The current allocation policy for ByteBasedBitPackingEncoder is to allocate 64KB * #bits up-front. As similarly observed in [PARQUET-580], this can lead to significant memory overheads for high-fanout scenarios (many columns and/or open files, in my case using BooleanPlainValuesWriter).
> As done in [PARQUET-585], I'll follow up with a PR that starts with a smaller buffer and works its way up to a max.
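
For readers unfamiliar with the idea, the ramp-up policy described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the parquet-mr code: the class name RampUpByteBuffer, its methods, and the doubling growth factor are all hypothetical. It starts with a small slab and allocates progressively larger slabs, capped at a maximum, only when the current slab is full.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of a "slow ramp-up" allocation policy.
 * Not the actual ByteBasedBitPackingEncoder implementation.
 */
final class RampUpByteBuffer {
    private final int maxSlabSize;
    private final List<byte[]> slabs = new ArrayList<>();
    private byte[] current;      // slab currently being filled
    private int position;        // next free index in the current slab
    private long allocatedBytes; // total bytes allocated across all slabs

    RampUpByteBuffer(int initialSlabSize, int maxSlabSize) {
        if (initialSlabSize <= 0 || maxSlabSize < initialSlabSize) {
            throw new IllegalArgumentException("need 0 < initialSlabSize <= maxSlabSize");
        }
        this.maxSlabSize = maxSlabSize;
        this.current = new byte[initialSlabSize];
        this.slabs.add(current);
        this.allocatedBytes = initialSlabSize;
    }

    /** Appends one byte, allocating a larger slab only when the current one is full. */
    void write(byte b) {
        if (position == current.length) {
            // Assumed growth rule: double the slab size, but never exceed the maximum.
            int nextSize = (int) Math.min((long) current.length * 2, (long) maxSlabSize);
            current = new byte[nextSize];
            slabs.add(current);
            allocatedBytes += nextSize;
            position = 0;
        }
        current[position++] = b;
    }

    int currentSlabSize() { return current.length; }

    int slabCount() { return slabs.size(); }

    long allocatedBytes() { return allocatedBytes; }
}
```

With a policy like this, a writer that only ever sees a few values pays for the small initial slab rather than the full up-front allocation, which is where the savings in high-fanout scenarios would come from. The initial size, maximum size, and growth factor are tuning choices that the description above does not specify.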


