Posted to user@flink.apache.org by chrisr123 <ch...@gmail.com> on 2018/07/14 17:15:25 UTC

understanding purpose of TextInputFormat

I'm building a streaming app that continuously monitors a directory for new
files, and I'm confused about why I have to specify a TextInputFormat (see the
source code below). It seems redundant, but it is a required parameter. It
makes perfect sense to specify the directory I want to monitor, but what
purpose does the TextInputFormat serve, and what should I set it to? Example:
a simple word count app that reads lines of text.


    TextInputFormat format = new TextInputFormat(
            new org.apache.flink.core.fs.Path("file:///tmp/dir/"));

    DataStream<String> inputStream = env.readFile(
            format,
            "file:///tmp/dir/",
            FileProcessingMode.PROCESS_CONTINUOUSLY,
            100);  // interval between directory scans, in milliseconds



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: understanding purpose of TextInputFormat

Posted by Jörn Franke <jo...@gmail.com>.
TextInputFormat defines the format of the data, i.e. how the contents of each file are turned into records. The format could also be something other than text, e.g. ORC, Parquet, etc.
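
To illustrate why the format is a separate parameter from the directory, here is a minimal plain-Java sketch of the idea. It does not use Flink's API: RecordFormat, LineFormat, CommaSeparatedFormat and FormatSketch are hypothetical names invented for this example, and the "file" is just an in-memory string. The point is only that the same raw content yields different records depending on which format decodes it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the role an input format plays; NOT Flink's API.
interface RecordFormat<T> {
    List<T> parse(String rawContent);
}

// Text-style format: every non-empty line becomes one String record.
class LineFormat implements RecordFormat<String> {
    @Override
    public List<String> parse(String rawContent) {
        List<String> records = new ArrayList<>();
        for (String line : rawContent.split("\n")) {
            if (!line.isEmpty()) {
                records.add(line);
            }
        }
        return records;
    }
}

// A different format over the same content: every non-empty line
// becomes a String[] of comma-separated fields.
class CommaSeparatedFormat implements RecordFormat<String[]> {
    @Override
    public List<String[]> parse(String rawContent) {
        List<String[]> records = new ArrayList<>();
        for (String line : rawContent.split("\n")) {
            if (!line.isEmpty()) {
                records.add(line.split(","));
            }
        }
        return records;
    }
}

class FormatSketch {
    // The caller supplies the data; the format decides how it is decoded.
    static <T> List<T> read(String rawContent, RecordFormat<T> format) {
        return format.parse(rawContent);
    }

    public static void main(String[] args) {
        String content = "to be,or not\nto be\n";
        // Same content, two formats, two different record types.
        System.out.println(read(content, new LineFormat()));
        System.out.println(read(content, new CommaSeparatedFormat()).size());
    }
}
```

In the word count case from the question, line-oriented text is what you want, so a text format that produces one String per line is the natural choice; a columnar file would need a format that understands that layout instead.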
