Posted to issues@drill.apache.org by "Sorabh Hamirwasia (JIRA)" <ji...@apache.org> on 2018/08/14 18:32:00 UTC

[jira] [Commented] (DRILL-6687) Improve RemovingRecordBatch to do transfer when all records needs to be copied

    [ https://issues.apache.org/jira/browse/DRILL-6687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580235#comment-16580235 ] 

Sorabh Hamirwasia commented on DRILL-6687:
------------------------------------------

The idea behind this improvement is that SelectionVector2 should also hold the actualRecordCount of the associated RecordBatch container. Every operator that creates a SelectionVector2 should be able to set this value. RemovingRecordBatch can then ask the SelectionVector2 whether a full transfer is possible and, depending on the incoming batch, do either a row-by-row copy or a full transfer.
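A minimal sketch of that check, for illustration only. The class Sv2TransferSketch, its nested Sv2 type, and the remove() helper are hypothetical stand-ins, not Drill's actual SelectionVector2 or RemovingRecordBatch code, and a plain int[] plays the role of the ValueVectors. It also assumes the SV2 holds unique indexes, so that matching counts mean every row is selected.

```java
import java.util.Arrays;

// Hypothetical, simplified model of the proposal; not Drill's real classes.
final class Sv2TransferSketch {

    // Stand-in for SelectionVector2: selected row indexes plus the
    // actual record count of the underlying batch (the new field proposed here).
    static final class Sv2 {
        private final int[] indexes;
        private final int batchActualRecordCount;

        Sv2(int[] indexes, int batchActualRecordCount) {
            this.indexes = indexes.clone();
            this.batchActualRecordCount = batchActualRecordCount;
        }

        int getCount() {
            return indexes.length;
        }

        int getIndex(int i) {
            return indexes[i];
        }

        // Assumption: indexes are unique, so equal counts imply all rows are selected.
        boolean canDoFullTransfer() {
            return indexes.length == batchActualRecordCount;
        }
    }

    // Stand-in for the remover: hands over the incoming buffer when every row
    // is selected, otherwise copies the selected rows one by one.
    static int[] remove(int[] incoming, Sv2 sv2) {
        if (sv2.canDoFullTransfer()) {
            return incoming; // "transfer": same buffer, no per-row copy
        }
        int[] out = new int[sv2.getCount()];
        for (int i = 0; i < out.length; i++) {
            out[i] = incoming[sv2.getIndex(i)];
        }
        return out;
    }

    public static void main(String[] args) {
        int[] batch = {10, 20, 30, 40};
        Sv2 allRows = new Sv2(new int[] {0, 1, 2, 3}, batch.length);
        Sv2 someRows = new Sv2(new int[] {1, 3}, batch.length);
        System.out.println("all rows transferred: " + (remove(batch, allRows) == batch));
        System.out.println("partial copy: " + Arrays.toString(remove(batch, someRows)));
    }
}
```

In this sketch the upstream operator supplies the batch's actual record count when it builds the SV2, so the remover does not need to inspect the incoming container itself to choose between the two paths.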

> Improve RemovingRecordBatch to do transfer when all records needs to be copied
> ------------------------------------------------------------------------------
>
>                 Key: DRILL-6687
>                 URL: https://issues.apache.org/jira/browse/DRILL-6687
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Execution - Relational Operators
>            Reporter: Sorabh Hamirwasia
>            Assignee: Sorabh Hamirwasia
>            Priority: Major
>             Fix For: 1.15.0
>
>
> SelectionVector2 contains the list of indexes of the rows that RemovingRecordBatch can copy from the underlying RecordBatch. An SV2 is created by operators such as Filter and Limit to expose the selected rows of the underlying buffer. RemovingRecordBatch then copies the rows at the indexes in the SelectionVector2 to the output container of type NONE.
> When RemovingRecordBatch has to copy all the rows of an incoming batch, it can be improved to do a full transfer of the ValueVectors from the input to the output container instead of a row-by-row copy. For example, if the Filter condition in FilterRecordBatch selects every row of an incoming batch, it will prepare an SV2 containing every row index. The RemovingRecordBatch downstream of the Filter can then potentially do just a transfer instead of a row-by-row copy.


