Posted to issues@flink.apache.org by "Tian, Li (JIRA)" <ji...@apache.org> on 2016/04/04 17:45:25 UTC

[jira] [Comment Edited] (FLINK-3655) Allow comma-separated or multiple directories to be specified for FileInputFormat

    [ https://issues.apache.org/jira/browse/FLINK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15224354#comment-15224354 ] 

Tian, Li edited comment on FLINK-3655 at 4/4/16 3:44 PM:
---------------------------------------------------------

I think we may need to use "List<Path> filePaths" instead of "Path filePath" in FileInputFormat.
If we go this way, we should also:
1. modify the current implementations to support multiple input paths
2. add methods like setFilePaths and getFilePaths to FileInputFormat, and support a comma-separated path string in ExecutionEnvironment
3. for backward compatibility, let FileInputFormat.setFilePath set filePaths to a one-element list
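
A rough sketch of what the three points above could look like, in case it helps the discussion. This is not Flink code: MultiPathFormatSketch is a hypothetical stand-in for FileInputFormat, paths are plain Strings rather than Flink's Path type so the example is self-contained, and parseCommaSeparated is an illustrative helper name, not an existing method.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for FileInputFormat (assumption: not Flink's real class).
class MultiPathFormatSketch {

    // Replaces the single "Path filePath" field with a list of paths.
    private List<String> filePaths = new ArrayList<>();

    // Point 3: the existing single-path setter stores a one-element list,
    // so callers of the old API keep working.
    public void setFilePath(String filePath) {
        if (filePath == null || filePath.trim().isEmpty()) {
            throw new IllegalArgumentException("File path must not be null or empty.");
        }
        setFilePaths(Collections.singletonList(filePath));
    }

    // Point 2: new setter that accepts several input paths.
    public void setFilePaths(List<String> paths) {
        if (paths == null || paths.isEmpty()) {
            throw new IllegalArgumentException("At least one file path is required.");
        }
        this.filePaths = new ArrayList<>(paths); // defensive copy
    }

    // Point 2: new getter; returns a read-only view.
    public List<String> getFilePaths() {
        return Collections.unmodifiableList(filePaths);
    }

    // Point 2: split a comma-separated path string, as an environment-level
    // readFile(String) could do. Hypothetical helper; blank entries are skipped.
    public static List<String> parseCommaSeparated(String paths) {
        List<String> result = new ArrayList<>();
        for (String part : paths.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        MultiPathFormatSketch format = new MultiPathFormatSketch();
        format.setFilePaths(parseCommaSeparated(
                "/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*"));
        System.out.println(format.getFilePaths());
    }
}
```

Two caveats with this sketch: splitting on "," breaks for paths that themselves contain a comma, and the wildcards are only carried through as text, not expanded. Point 1 (making the existing implementations iterate over all paths when creating input splits) is not covered here.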


was (Author: tianli):
I think we may need to use List<Path> instead of a single Path in FileInputFormat.
In this way, we should also
1. modify current implementations to support multiple input paths
2. add functions like setFilePaths, getFilePaths to FileInputFormat, and support comma-separated Path string in ExecutionEnvironment
3. for backward compatibility, let FileInputFormat.setFilePath set the inputPaths to a one-element list

> Allow comma-separated or multiple directories to be specified for FileInputFormat
> ---------------------------------------------------------------------------------
>
>                 Key: FLINK-3655
>                 URL: https://issues.apache.org/jira/browse/FLINK-3655
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.0
>            Reporter: Gna Phetsarath
>            Priority: Minor
>              Labels: starter
>
> Allow comma-separated or multiple directories to be specified for FileInputFormat so that a DataSource will process the directories sequentially.
>    env.readFile("/data/2016/01/01/*/*,/data/2016/01/02/*/*,/data/2016/01/03/*/*")
> in Scala
>    env.readFile(paths: Seq[String])
> or 
>   env.readFile(path: String, otherPaths: String*)
> Wildcard support would be a bonus.


