Posted to issues@arrow.apache.org by "Raphael Taylor-Davies (Jira)" <ji...@apache.org> on 2020/04/19 13:05:00 UTC

[jira] [Updated] (ARROW-8516) Slow BufferBuilder inserts

     [ https://issues.apache.org/jira/browse/ARROW-8516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raphael Taylor-Davies updated ARROW-8516:
-----------------------------------------
    Description: 
BufferBuilder<BooleanType>::append_slice is called by PrimitiveBuilder::append_slice with a constructed vector of true values.

Even in release builds the associated allocations and iterations are not optimised out, resulting in a third of the time to parse a parquet file containing single integers being spent in PrimitiveBuilder::append_slice.

This PR adds an append_n method to the BufferBuilderTrait that allows this to be handled more efficiently. My rather unscientific testing shows it to halve the amount of time spent in this method yielding an ~20% speedup for my particular workload.
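
For readers unfamiliar with the idea, here is a minimal sketch of why an append_n-style method can avoid the temporary vector. It does not use the Arrow crate: BoolBufferBuilder, its fields, and the method signatures below are hypothetical stand-ins chosen for illustration, and the actual change in the PR may look quite different.

```rust
/// Hypothetical stand-in for a bit-packed boolean buffer builder.
/// This is NOT the Arrow type; names and layout are assumptions.
struct BoolBufferBuilder {
    bytes: Vec<u8>, // packed bits, least-significant bit first
    len: usize,     // number of bits appended so far
}

impl BoolBufferBuilder {
    fn new() -> Self {
        Self { bytes: Vec::new(), len: 0 }
    }

    /// Append a single bit, growing the byte buffer when a new byte starts.
    fn append(&mut self, v: bool) {
        if self.len % 8 == 0 {
            self.bytes.push(0);
        }
        if v {
            self.bytes[self.len / 8] |= 1 << (self.len % 8);
        }
        self.len += 1;
    }

    /// Slow path: the caller must first build a slice of values,
    /// which is then walked one bit at a time.
    fn append_slice(&mut self, values: &[bool]) {
        for &v in values {
            self.append(v);
        }
    }

    /// Sketch of an append_n: append `n` copies of `v` with no temporary
    /// vector, writing whole bytes where the range is byte-aligned.
    fn append_n(&mut self, n: usize, v: bool) {
        let new_len = self.len + n;
        // New bytes are zero-filled, which already covers `v == false`.
        self.bytes.resize((new_len + 7) / 8, 0);
        if v {
            let mut i = self.len;
            // Set bits one at a time up to the next byte boundary.
            while i < new_len && i % 8 != 0 {
                self.bytes[i / 8] |= 1 << (i % 8);
                i += 1;
            }
            if i < new_len {
                // `i` is byte-aligned here: fill complete bytes in one pass.
                let full_end = new_len / 8;
                for b in &mut self.bytes[i / 8..full_end] {
                    *b = 0xFF;
                }
                // Set any remaining bits in the final partial byte.
                for j in (full_end * 8).max(i)..new_len {
                    self.bytes[j / 8] |= 1 << (j % 8);
                }
            }
        }
        self.len = new_len;
    }

    /// Read back bit `i` (panics if `i` is outside the allocated bytes).
    fn get(&self, i: usize) -> bool {
        (self.bytes[i / 8] & (1 << (i % 8))) != 0
    }
}
```

With this sketch, a call such as builder.append_slice(&vec![true; n]) can be replaced by builder.append_n(n, true), which produces the same bits without allocating the intermediate vector.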

 

  was:
BufferBuilder<BooleanType>::append_slice is called by ArrayBuilder::append_slice with a constructed vector of true values.

Even in release builds the associated allocations and iterations are not optimised out, resulting in a third of the time to parse a parquet file containing single integers being spent in PrimitiveBuilder::append_slice.

This PR adds an append_n method to the BufferBuilderTrait that allows this to be handled more efficiently. My rather unscientific testing shows it to halve the amount of time spent in this method yielding an ~20% speedup for my particular workload.

 


> Slow BufferBuilder<BooleanType> inserts
> ---------------------------------------
>
>                 Key: ARROW-8516
>                 URL: https://issues.apache.org/jira/browse/ARROW-8516
>             Project: Apache Arrow
>          Issue Type: Improvement
>            Reporter: Raphael Taylor-Davies
>            Priority: Trivial
>
> BufferBuilder<BooleanType>::append_slice is called by PrimitiveBuilder::append_slice with a constructed vector of true values.
> Even in release builds the associated allocations and iterations are not optimised out, resulting in a third of the time to parse a parquet file containing single integers being spent in PrimitiveBuilder::append_slice.
> This PR adds an append_n method to the BufferBuilderTrait that allows this to be handled more efficiently. My rather unscientific testing shows it to halve the amount of time spent in this method yielding an ~20% speedup for my particular workload.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)