Posted to issues@flink.apache.org by "sujun (Jira)" <ji...@apache.org> on 2021/01/29 03:52:00 UTC

[jira] [Commented] (FLINK-21195) LimitableBulkFormat is invalid when format is orc

    [ https://issues.apache.org/jira/browse/FLINK-21195?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17274137#comment-17274137 ] 

sujun commented on FLINK-21195:
-------------------------------

If this is confirmed to be a bug, you can assign the issue to me :D

> LimitableBulkFormat is invalid when format is orc
> -------------------------------------------------
>
>                 Key: FLINK-21195
>                 URL: https://issues.apache.org/jira/browse/FLINK-21195
>             Project: Flink
>          Issue Type: Bug
>          Components: Connectors / FileSystem
>    Affects Versions: 1.12.1
>            Reporter: sujun
>            Priority: Major
>         Attachments: orc_reader_debug.jpg
>
>
> The ORC reader reads a stripe of data in advance inside the createReader() method (see the constructor of RecordReaderImpl for details), whereas the Parquet reader only starts reading block data when readBatch() is called. As a result, if every ORC file contains only one stripe, LimitableBulkFormat has no effect.
>  
> !orc_reader_debug.jpg!
>  
>  
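To make the difference described above concrete, here is a minimal, self-contained sketch. It is a toy model only and does not use any Flink, ORC, or Parquet classes: ToyReader, rowsLoadedWithLimit, and all parameters are hypothetical names introduced for illustration. It assumes that a limit wrapper simply stops calling readBatch() once enough rows have been emitted, and it counts how many rows are fetched from "disk" when the stripe is loaded at reader creation (ORC-like) versus on the first readBatch() call (Parquet-like).

```java
import java.util.concurrent.atomic.AtomicLong;

public class LimitSketch {

    /** Toy reader for one file that contains a single stripe (or block) of rows. */
    static final class ToyReader {
        private final int stripeRows;
        private final int batchSize;
        private final AtomicLong rowsLoaded; // shared counter of rows fetched from "disk"
        private boolean loaded = false;
        private int remaining = 0;

        ToyReader(int stripeRows, int batchSize, boolean eager, AtomicLong rowsLoaded) {
            this.stripeRows = stripeRows;
            this.batchSize = batchSize;
            this.rowsLoaded = rowsLoaded;
            if (eager) {
                load(); // ORC-like (assumed): the stripe is fetched while the reader is created
            }
        }

        private void load() {
            if (!loaded) {
                loaded = true;
                remaining = stripeRows;
                rowsLoaded.addAndGet(stripeRows);
            }
        }

        /** Returns the size of the next batch, or 0 when the file is exhausted. */
        int readBatch() {
            load(); // Parquet-like (assumed): the first call triggers the fetch
            int n = Math.min(batchSize, remaining);
            remaining -= n;
            return n;
        }
    }

    /** Reads files one by one and stops calling readBatch() once the limit is reached. */
    static long rowsLoadedWithLimit(
            boolean eager, int files, int stripeRows, int batchSize, long limit) {
        AtomicLong rowsLoaded = new AtomicLong();
        long emitted = 0;
        for (int i = 0; i < files; i++) {
            // A reader is still created for every file, even after the limit is hit.
            ToyReader reader = new ToyReader(stripeRows, batchSize, eager, rowsLoaded);
            while (emitted < limit) {
                int n = reader.readBatch();
                if (n == 0) {
                    break;
                }
                emitted += n;
            }
        }
        return rowsLoaded.get();
    }

    public static void main(String[] args) {
        // 5 files, 1000 rows per stripe, batches of 100 rows, LIMIT 10
        System.out.println("eager: " + rowsLoadedWithLimit(true, 5, 1000, 100, 10));  // eager: 5000
        System.out.println("lazy:  " + rowsLoadedWithLimit(false, 5, 1000, 100, 10)); // lazy:  1000
    }
}
```

In this toy model the eager variant loads every stripe of every file regardless of the limit, while the lazy variant loads only the first one. Whether the real readers behave exactly this way should be checked against the ORC and Flink sources.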



--
This message was sent by Atlassian Jira
(v8.3.4#803005)