Posted to issues@flink.apache.org by "Artiom Darie (JIRA)" <ji...@apache.org> on 2017/04/28 17:14:04 UTC

[jira] [Updated] (FLINK-6417) Wildcard support for read text file

     [ https://issues.apache.org/jira/browse/FLINK-6417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Artiom Darie updated FLINK-6417:
--------------------------------
    Description: 
Add wildcard support while reading from s3://, hdfs://, file://, etc.

h6. Examples:
# {code} s3://bucket-name/*.gz {code}
# {code} hdfs://path/*file-name*.csv {code}
# {code} file://tmp/**/*.* {code}

h6. Proposal
# Use the existing method: {code}environment.readFile(...){code}
# List all the files in the directories that match the wildcard pattern
# Read the files using the existing {code}ContinuousFileReaderOperator{code}
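
The listing step of the proposal could look roughly like the sketch below. It is only an illustration under stated assumptions: it covers the local file system through java.nio, the class GlobExpander and its method expand are hypothetical names (not Flink API), and s3:// or hdfs:// paths would need the equivalent listing calls of their own file system client.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of Flink: expands a glob against a local directory tree.
class GlobExpander {

    /**
     * Returns the regular files under root whose path relative to root
     * matches the given glob, in sorted order.
     */
    static List<Path> expand(Path root, String glob) throws IOException {
        // Normalize first so that relativize() behaves the same for "." and absolute roots.
        Path base = root.toAbsolutePath().normalize();
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        try (Stream<Path> paths = Files.walk(base)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> matcher.matches(base.relativize(p)))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumed defaults for illustration only.
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        String glob = args.length > 1 ? args[1] : "*.gz";
        for (Path p : expand(root, glob)) {
            System.out.println(p);
        }
    }
}
```

In Java glob syntax, * stays within one directory level and ** crosses directory boundaries, so for a tree holding a.gz, b.txt and sub/c.gz the pattern *.gz selects only a.gz, while **.gz also selects sub/c.gz.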

h6. Concerns (Open for discussion)
# Create a separate DataSource for each file, then combine them into a single DataSource
# Read all the files through a single DataSource
# List the files on the driver, then load them on each task manager
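
To make the third option concrete, here is a minimal sketch of how a file list produced on the driver might be divided among parallel readers. It is an assumption for discussion, not a description of how Flink assigns splits: FileSplitAssigner and assign are hypothetical names, and the round-robin policy ignores file sizes and data locality.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: divides a driver-side file listing among readers.
class FileSplitAssigner {

    /** Distributes the files round-robin over the given number of parallel readers. */
    static List<List<String>> assign(List<String> files, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be at least 1");
        }
        List<List<String>> perReader = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            perReader.add(new ArrayList<>());
        }
        // File i goes to reader i mod parallelism.
        for (int i = 0; i < files.size(); i++) {
            perReader.get(i % parallelism).add(files.get(i));
        }
        return perReader;
    }
}
```

With five files and a parallelism of 2, the first reader receives files 1, 3 and 5 and the second receives files 2 and 4. A real implementation would probably balance by file size rather than by count.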




  was:
Add wildcard support while reading from s3://, hdfs://, file://, etc.

h6. Examples:
# {code} s3://bucket-name/*.gz {code}
# {code} hdfs://path/*file-name*.csv {code}
# {code} file://tmp/**/*.* {code}

h6. Proposal
# Use the existing method: {code}environment.readFile(...){code}
# List all the files in the directories that match the wildcard pattern
# Read the files using the existing {code}ContinuousFileReaderOperator{code}

h6. Concerns (Open for discussion)
# Create a separate DataSource for each file, then combine them into a single DataSource
# Read all the files through a single DataSource





> Wildcard support for read text file
> -----------------------------------
>
>                 Key: FLINK-6417
>                 URL: https://issues.apache.org/jira/browse/FLINK-6417
>             Project: Flink
>          Issue Type: New Feature
>          Components: Core
>            Reporter: Artiom Darie
>            Priority: Minor
>
> Add wildcard support while reading from s3://, hdfs://, file://, etc.
> h6. Examples:
> # {code} s3://bucket-name/*.gz {code}
> # {code} hdfs://path/*file-name*.csv {code}
> # {code} file://tmp/**/*.* {code}
> h6. Proposal
> # Use the existing method: {code}environment.readFile(...){code}
> # List all the files in the directories that match the wildcard pattern
> # Read the files using the existing {code}ContinuousFileReaderOperator{code}
> h6. Concerns (Open for discussion)
> # Create a separate DataSource for each file, then combine them into a single DataSource
> # Read all the files through a single DataSource
> # List the files on the driver, then load them on each task manager



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)