Posted to issues@flink.apache.org by "Till Rohrmann (JIRA)" <ji...@apache.org> on 2018/11/22 16:51:00 UTC

[jira] [Updated] (FLINK-10989) OrcRowInputFormat uses two different file systems

     [ https://issues.apache.org/jira/browse/FLINK-10989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Rohrmann updated FLINK-10989:
----------------------------------
    Description: 
The {{OrcRowInputFormat}} seems to use two different {{FileSystem}} implementations: Flink's {{FileSystem}} for listing the files and generating the {{InputSplits}}, and Hadoop's {{FileSystem}} for actually reading the input splits. This can be problematic if one configures only Flink's S3 {{FileSystem}} but does not provide an S3 implementation for Hadoop's {{FileSystem}}.

I think this behaviour is not intuitive and can lead to hard-to-debug problems for users.

  was:The {{OrcRowInputFormat}} seems to use two different {{FileSystem}}. The Flink {{FileSystem}} for listing the files and generating the {{InputSplits}} and then Hadoop's {{FileSystem}} to actually read the input splits. This can be problematic if one only configures Flink's S3 {{FileSystem}} but does not provide a S3 implementation for Hadoop's {{FileSystem}}.


> OrcRowInputFormat uses two different file systems
> -------------------------------------------------
>
>                 Key: FLINK-10989
>                 URL: https://issues.apache.org/jira/browse/FLINK-10989
>             Project: Flink
>          Issue Type: Bug
>          Components: Batch Connectors and Input/Output Formats
>    Affects Versions: 1.7.0
>            Reporter: Till Rohrmann
>            Priority: Major
>
> The {{OrcRowInputFormat}} seems to use two different {{FileSystem}} implementations: Flink's {{FileSystem}} for listing the files and generating the {{InputSplits}}, and Hadoop's {{FileSystem}} for actually reading the input splits. This can be problematic if one configures only Flink's S3 {{FileSystem}} but does not provide an S3 implementation for Hadoop's {{FileSystem}}.
> I think this behaviour is not intuitive and can lead to hard-to-debug problems for users.
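
To make the mismatch described above concrete, here is a toy model in plain Java. It does not call Flink or Hadoop at all: the class name, the two scheme sets, and the example path are all hypothetical and only stand in for "schemes Flink's FileSystem can resolve" and "schemes Hadoop's FileSystem can resolve". It sketches, under those assumptions, how split generation could succeed while the later read fails.

```java
import java.net.URI;
import java.util.Set;

// Toy model only: these sets stand in for the schemes each FileSystem stack
// can resolve. They are hypothetical and do not call Flink or Hadoop.
class TwoFileSystemsSketch {
    // Assumption: Flink's S3 file system is configured.
    static final Set<String> FLINK_SCHEMES = Set.of("file", "s3");
    // Assumption: no S3 implementation is available to Hadoop.
    static final Set<String> HADOOP_SCHEMES = Set.of("file", "hdfs");

    // Extracts the URI scheme, e.g. "s3" or "file".
    static String schemeOf(String path) {
        return URI.create(path).getScheme();
    }

    // Step 1: listing files and creating input splits (modelled Flink side).
    static boolean canCreateSplits(String path) {
        return FLINK_SCHEMES.contains(schemeOf(path));
    }

    // Step 2: reading the splits (modelled Hadoop side).
    static boolean canReadSplits(String path) {
        return HADOOP_SCHEMES.contains(schemeOf(path));
    }

    public static void main(String[] args) {
        String path = "s3://bucket/data.orc"; // hypothetical path
        System.out.println("splits created: " + canCreateSplits(path));
        System.out.println("splits readable: " + canReadSplits(path));
    }
}
```

In this model the two checks disagree for an S3 path, which mirrors the situation in the report: the failure would only surface once the splits are actually read, not when they are generated.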


