You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@arrow.apache.org by "Mayur Srivastava (Jira)" <ji...@apache.org> on 2020/04/21 17:30:00 UTC

[jira] [Updated] (ARROW-8543) [C++] IO: single pass coalescing algorithm

     [ https://issues.apache.org/jira/browse/ARROW-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mayur Srivastava updated ARROW-8543:
------------------------------------
    Description: 
The current coalescing algorithm is a O(N^2) two pass algorithm (where N is number of ranges) (first implemented in https://issues.apache.org/jira/browse/ARROW-7995). In the first pass, the Coalesce() function finds the begin and end of a candidate range that can be coalesced. In the second, pass the CoalesceUntilLargeEnough() function goes over the ranges from begin to end and adds coalesced range to the result (out).

The proposal is to convert the algorithm to an O(N) single pass algorithm that computes coalesced ranges while making the first pass over the ranges in the list. This algorithm is also shorter in lines of code and hence (hopefully) more maintainable in long term.

 

  was:
The current coalescing algorithm is a O(n^2) two pass algorithm (where n is number of ranges) (first implemented in https://issues.apache.org/jira/browse/ARROW-7995). In the first pass, the Coalesce() function finds the begin and end of a candidate range that can be coalesced. In the second, pass the CoalesceUntilLargeEnough() function goes over the ranges from begin to end and adds coalesced range to the result (out).

The proposal is to convert the algorithm to an O(n) single pass algorithm that computes coalesced ranges while making the first pass over the ranges in the list. This algorithm is also shorter in lines of code and hence (hopefully) more maintainable in long term.

 


> [C++] IO: single pass coalescing algorithm
> ------------------------------------------
>
>                 Key: ARROW-8543
>                 URL: https://issues.apache.org/jira/browse/ARROW-8543
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Mayur Srivastava
>            Priority: Major
>
> The current coalescing algorithm is a O(N^2) two pass algorithm (where N is number of ranges) (first implemented in https://issues.apache.org/jira/browse/ARROW-7995). In the first pass, the Coalesce() function finds the begin and end of a candidate range that can be coalesced. In the second, pass the CoalesceUntilLargeEnough() function goes over the ranges from begin to end and adds coalesced range to the result (out).
> The proposal is to convert the algorithm to an O(N) single pass algorithm that computes coalesced ranges while making the first pass over the ranges in the list. This algorithm is also shorter in lines of code and hence (hopefully) more maintainable in long term.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)