Posted to issues@flink.apache.org by "Shashank Agarwal (JIRA)" <ji...@apache.org> on 2017/06/23 13:36:06 UTC
[jira] [Created] (FLINK-6993) Not reading recursive files in Batch
by using readTextFile when file name contains _ in starting.
Shashank Agarwal created FLINK-6993:
---------------------------------------
Summary: Not reading recursive files in Batch by using readTextFile when file name contains _ in starting.
Key: FLINK-6993
URL: https://issues.apache.org/jira/browse/FLINK-6993
Project: Flink
Issue Type: Bug
Components: Batch Connectors and Input/Output Formats
Affects Versions: 1.3.0
Reporter: Shashank Agarwal
Priority: Critical
Fix For: 1.3.2
When I try to read files from a folder using readTextFile in batch mode with recursive.file.enumeration enabled, files whose names start with _ are not read. When I remove the leading _ from the file names, it works fine.
Reading also works when I pass the direct path of a single file; it fails only with a directory path. To reproduce the issue:
{code}
import org.apache.flink.api.scala.{DataSet, ExecutionEnvironment}
import org.apache.flink.configuration.Configuration

object CSVMerge {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // create a configuration object
    val parameters = new Configuration

    // set the recursive enumeration parameter
    parameters.setBoolean("recursive.file.enumeration", true)

    val stream = env.readTextFile("file:///Users/data")
      .withParameters(parameters)

    stream.print()
  }
}
{code}
When you put 2-3 text files with names like 1.txt, 2.txt, etc. in the data folder, it works fine. But when the files are named _1.txt, _2.txt, they are not read.
Flink's BucketingSink in streaming puts _ before the file names by default, so files written by that sink from a DataStream cannot be read this way.
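For anyone trying to reproduce this, the file layout described above could be generated with a small helper like the following. This is only a sketch: the object name ReproFixture, the temporary directory prefix, and the "nested" subdirectory are assumptions made for illustration and are not part of the original report. It uses only the JDK, so it does not depend on Flink.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper (not from the report): builds a directory tree with
// both plain and underscore-prefixed text files for the reproduction.
object ReproFixture {

  // Creates a temporary root directory with one nested subdirectory and
  // writes 1.txt, 2.txt, _1.txt and _2.txt into each. Returns the root.
  def create(): Path = {
    val root = Files.createTempDirectory("flink-6993-data")
    val nested = Files.createDirectories(root.resolve("nested"))
    val names = Seq("1.txt", "2.txt", "_1.txt", "_2.txt")

    for (dir <- Seq(root, nested); name <- names) {
      val content = s"line from $name\n".getBytes(StandardCharsets.UTF_8)
      Files.write(dir.resolve(name), content)
    }
    root
  }

  def main(args: Array[String]): Unit = {
    // Print the root as a file URI; it can replace "file:///Users/data"
    // in the readTextFile call of the program above.
    println(create().toUri)
  }
}
```

With recursive.file.enumeration set to true, the lines from 1.txt and 2.txt in both directories should be printed; per the behaviour reported here, the lines from _1.txt and _2.txt would be missing.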