Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/04/18 00:11:00 UTC

[jira] [Commented] (DRILL-6335) Refactor row set abstractions to prepare for unions

    [ https://issues.apache.org/jira/browse/DRILL-6335?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16441690#comment-16441690 ] 

ASF GitHub Bot commented on DRILL-6335:
---------------------------------------

GitHub user paul-rogers opened a pull request:

    https://github.com/apache/drill/pull/1218

    DRILL-6335: Refactor row set abstractions to prepare for unions

    Refactors the column accessors to prepare for adding unions, lists and repeated lists.
    
    This is a subset of a PR submitted a week ago, split out in the hope that it will be easier to review. The original PR will be broken down into four or more smaller PRs, including this one, a refactoring of the result set loader (also to prepare for unions), and the union work itself.
    
    The row set mechanism is fully described [here](https://github.com/paul-rogers/drill/wiki/Batch-Handling-Upgrades).
    
    Rather than write a long description here, please take a look at the code and the Wiki post. To ease review, however, the following summarizes the changes:
    
    * Moved metadata from the tuple reader/writer to the column reader/writer so that it is available for all columns. Added a `tupleMetadata()` method to tuples so that they continue to provide the tuple schema.
    * Added a `ProjectionType` bit of metadata in preparation for the projection system to be used by the scan operator. (Projection has three states, captured by the new enum.)
    * Updated some tests to use a slightly simpler version of the code that compares two result sets.
    * Added unit tests for "indirect" readers: a reader for an SV2.
    * Refactored the offset vector writer to allow a dummy offset vector writer as part of the projection mechanism.
    * Changed the column accessor code gen template to use constants instead of hard-coded numbers for field positions.
    * Additional documentation.
    * Restructured the column accessor base classes to better organize the functions in preparation for lists and unions (which are far more complex than maps and scalars).
    * Pulled a couple of formerly-nested classes into top-level classes.
    * Reorganized the code that builds accessors; moved it into the accessor itself.
    * Added a more complete system to allow writing of generic Java objects in tests.
    * Temporary patches to the row set loader classes to handle the above changes. (The patches will be replaced in the result set loader refactoring PR to come later.)
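    
    As a rough illustration of the "dummy offset vector writer" item above, the idea can be pictured as a null-object pattern: client code writes offsets the same way whether or not the column is projected. This is only a sketch under stated assumptions. Every name in it (`OffsetWriterSketch`, `ArrayOffsetWriter`, `DummyOffsetWriter`, `OffsetWriters`) is hypothetical and is not taken from the PR, and an `int[]` stands in for Drill's offset vector.
    
    ```java
    import java.util.Arrays;
    
    /** Hypothetical stand-in for an offset vector writer; not Drill's actual interface. */
    interface OffsetWriterSketch {
      /** Records the end offset of the value just written. */
      void setNextOffset(int endOffset);
    
      /** Returns the offset at which the next value would start. */
      int nextOffset();
    }
    
    /** Writes offsets into a growable int array that stands in for an offset vector. */
    final class ArrayOffsetWriter implements OffsetWriterSketch {
      private int[] offsets = new int[4]; // offsets[0] is always 0
      private int count = 0;              // number of values written so far
    
      @Override
      public void setNextOffset(int endOffset) {
        // Grow the backing array when the next slot would fall outside it.
        if (count + 1 == offsets.length) {
          offsets = Arrays.copyOf(offsets, offsets.length * 2);
        }
        offsets[++count] = endOffset;
      }
    
      @Override
      public int nextOffset() {
        return offsets[count];
      }
    
      /** Returns a copy of the offsets written so far, including the leading 0. */
      int[] snapshot() {
        return Arrays.copyOf(offsets, count + 1);
      }
    }
    
    /** Null-object writer for an unprojected column: accepts writes, stores nothing. */
    final class DummyOffsetWriter implements OffsetWriterSketch {
      @Override
      public void setNextOffset(int endOffset) {
        // Intentionally a no-op: there is no vector behind this writer.
      }
    
      @Override
      public int nextOffset() {
        return 0;
      }
    }
    
    final class OffsetWriters {
      private OffsetWriters() { }
    
      /** Picks a real or dummy writer; the boolean flag is an assumption of this sketch. */
      static OffsetWriterSketch forColumn(boolean projected) {
        return projected ? new ArrayOffsetWriter() : new DummyOffsetWriter();
      }
    
      /** Client code: records one offset per value without checking projection. */
      static int writeLengths(OffsetWriterSketch writer, String... values) {
        for (String value : values) {
          writer.setNextOffset(writer.nextOffset() + value.length());
        }
        return writer.nextOffset();
      }
    }
    ```
    
    The point of the pattern is that `writeLengths` contains no projection check; the choice between the two writers is made once, when the writer is created.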
    
    The code already has extensive unit tests for this functionality. Those tests were rerun to demonstrate that the refactoring preserves existing functionality. A later PR will exercise the new structure in tests for unions, lists, etc.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/paul-rogers/drill DRILL-6335

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1218.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1218
    
----
commit c59a3cbe0aa910c8d31589fe1619846e6d417915
Author: Paul Rogers <pr...@...>
Date:   2018-04-17T04:44:10Z

    DRILL-6335: Column accessor refactoring

commit 0716a3894f0e7e48747b04f0f934cf94a04ebd3f
Author: Paul Rogers <pr...@...>
Date:   2018-04-17T17:41:07Z

    Merge fixes

----


> Refactor row set abstractions to prepare for unions
> ---------------------------------------------------
>
>                 Key: DRILL-6335
>                 URL: https://issues.apache.org/jira/browse/DRILL-6335
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Paul Rogers
>            Assignee: Paul Rogers
>            Priority: Major
>             Fix For: 1.14.0
>
>
> The row set abstractions will eventually support unions and lists. The changes to support these types are extensive. This PR introduces refactoring that puts the pieces in the correct location to allow for inserting those additional types.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)