Posted to issues@flink.apache.org by "Kurt Young (Jira)" <ji...@apache.org> on 2019/12/02 06:39:00 UTC
[jira] [Closed] (FLINK-14135) Introduce vectorized orc InputFormat for blink runtime
[ https://issues.apache.org/jira/browse/FLINK-14135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kurt Young closed FLINK-14135.
------------------------------
Fix Version/s: 1.10.0
Resolution: Fixed
merged to master:
3c1ecbf4efb6e9eda60841db54c3ad74da80973d
ab6893201dbabab1c44a052b23821b78f18ed9b3
1f9cd8ac6fda1eee0c3f8899b8101ede182f2c93
> Introduce vectorized orc InputFormat for blink runtime
> -------------------------------------------------------
>
> Key: FLINK-14135
> URL: https://issues.apache.org/jira/browse/FLINK-14135
> Project: Flink
> Issue Type: Sub-task
> Components: Connectors / ORC
> Reporter: Jingsong Lee
> Assignee: Jingsong Lee
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.10.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> VectorizedOrcInputFormat is introduced to read ORC data in batches.
> When returning each row, instead of materializing every field, it uses the BaseRow abstraction to return a columnar, row-like view over the batch.
> This should greatly benefit downstream filtering scenarios, because fields of rows that are filtered out never need to be accessed.
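The idea of a columnar row-like view described above can be sketched as follows. This is a minimal illustration only, not the code merged for this issue: ColumnBatch, ColumnarRowView, ColumnarRowDemo and the amountReads counter are hypothetical names, and no Flink or ORC classes are used. The view holds just a batch reference and a row index, so a field is read from its column array only when a getter is called.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical column batch: one array per column, as a vectorized reader might fill it. */
final class ColumnBatch {
    final long[] ids;
    final String[] names;
    final double[] amounts;
    int amountReads = 0; // counts reads of the amounts column, for illustration only

    ColumnBatch(long[] ids, String[] names, double[] amounts) {
        this.ids = ids;
        this.names = names;
        this.amounts = amounts;
    }

    int size() {
        return ids.length;
    }
}

/** Hypothetical row-like view: stores only a batch reference and a row index. */
final class ColumnarRowView {
    private final ColumnBatch batch;
    private int rowId;

    ColumnarRowView(ColumnBatch batch) {
        this.batch = batch;
    }

    /** Re-points the view at another row; no field data is copied. */
    ColumnarRowView pointTo(int rowId) {
        this.rowId = rowId;
        return this;
    }

    long getId() {
        return batch.ids[rowId];
    }

    String getName() {
        return batch.names[rowId];
    }

    double getAmount() {
        batch.amountReads++;
        return batch.amounts[rowId];
    }
}

final class ColumnarRowDemo {
    /** Filters on id, then reads name only for the rows that pass the filter. */
    static List<String> namesWithIdAbove(ColumnBatch batch, long threshold) {
        List<String> result = new ArrayList<>();
        ColumnarRowView row = new ColumnarRowView(batch); // one reusable view per batch
        for (int i = 0; i < batch.size(); i++) {
            row.pointTo(i);
            if (row.getId() > threshold) {
                result.add(row.getName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ColumnBatch batch = new ColumnBatch(
                new long[] {1, 5, 9},
                new String[] {"a", "b", "c"},
                new double[] {1.0, 2.0, 3.0});
        System.out.println(namesWithIdAbove(batch, 4));
        System.out.println("amount reads: " + batch.amountReads);
    }
}
```

In this sketch the filter touches only the id column, names are read only for surviving rows, and the amounts column is never accessed at all, which is the saving the description refers to.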
--
This message was sent by Atlassian Jira
(v8.3.4#803005)