Posted to issues@flink.apache.org by "Kurt Young (Jira)" <ji...@apache.org> on 2019/12/02 06:39:00 UTC

[jira] [Closed] (FLINK-14135) Introduce vectorized orc InputFormat for blink runtime

     [ https://issues.apache.org/jira/browse/FLINK-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kurt Young closed FLINK-14135.
------------------------------
    Fix Version/s: 1.10.0
       Resolution: Fixed

Merged to master:

3c1ecbf4efb6e9eda60841db54c3ad74da80973d

ab6893201dbabab1c44a052b23821b78f18ed9b3

1f9cd8ac6fda1eee0c3f8899b8101ede182f2c93

> Introduce vectorized orc InputFormat for blink runtime
> -------------------------------------------------------
>
>                 Key: FLINK-14135
>                 URL: https://issues.apache.org/jira/browse/FLINK-14135
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Connectors / ORC
>            Reporter: Jingsong Lee
>            Assignee: Jingsong Lee
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 1.10.0
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> VectorizedOrcInputFormat is introduced to read ORC data in batches.
> When returning each row, instead of materializing every field up front, it uses the BaseRow abstraction to return a columnar, row-like view over the current batch.
> This should greatly benefit downstream filtering scenarios, because fields of rows that are filtered out never need to be accessed.
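
The row-like view described above can be illustrated with a minimal sketch. This is not Flink's or ORC's actual code: the names ColumnarRowSketch, LongColumn, ColumnBatch, ColumnarRowView and sumWhere are hypothetical stand-ins, and the read counter exists only to make the skipped field accesses visible. The idea it shows is that a row is just a (batch, row index) pair, so a field is only touched when a getter is called.

```java
public class ColumnarRowSketch {

    /** Hypothetical stand-in for one column of a batch, stored as a primitive array. */
    static final class LongColumn {
        final long[] values;
        int reads = 0; // counts field accesses, for illustration only

        LongColumn(long[] values) { this.values = values; }

        long get(int rowId) {
            reads++;
            return values[rowId];
        }
    }

    /** Hypothetical batch of rows, held column by column. */
    static final class ColumnBatch {
        final LongColumn[] columns;
        final int numRows;

        ColumnBatch(int numRows, LongColumn... columns) {
            this.numRows = numRows;
            this.columns = columns;
        }
    }

    /** Row-like view: stores only a batch reference and a row index, no copied fields. */
    static final class ColumnarRowView {
        private final ColumnBatch batch;
        private int rowId;

        ColumnarRowView(ColumnBatch batch) { this.batch = batch; }

        void setRowId(int rowId) { this.rowId = rowId; }

        // The field is read from the column only when it is asked for.
        long getLong(int col) { return batch.columns[col].get(rowId); }
    }

    /** Filters on column 0, then sums column 1 of the rows that pass. */
    static long sumWhere(ColumnBatch batch, long threshold) {
        ColumnarRowView row = new ColumnarRowView(batch); // one view, reused for every row
        long sum = 0;
        for (int i = 0; i < batch.numRows; i++) {
            row.setRowId(i);
            if (row.getLong(0) > threshold) {
                sum += row.getLong(1); // only reached for rows that pass the filter
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        LongColumn key = new LongColumn(new long[] {1, 5, 10, 20});
        LongColumn amount = new LongColumn(new long[] {100, 200, 300, 400});
        LongColumn unused = new LongColumn(new long[] {7, 8, 9, 10});
        ColumnBatch batch = new ColumnBatch(4, key, amount, unused);

        long sum = sumWhere(batch, 5);
        System.out.println("sum=" + sum
                + " keyReads=" + key.reads
                + " amountReads=" + amount.reads
                + " unusedReads=" + unused.reads);
    }
}
```

In this sketch the filter column is read once per row, the summed column only for the rows that pass, and the third column not at all. A reader that copied every field into a row object up front would have touched all three columns for all four rows.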



--
This message was sent by Atlassian Jira
(v8.3.4#803005)