You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Sugandha Amatya <su...@gmail.com> on 2017/10/22 07:37:05 UTC

Monitoring folder in flink

I have folder where new files arrive at schedule. Why is my flink readfile
not reading new files. I have used but *PROCESS_ONCE* and
*PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same
file but the execution does not terminate whereas for PROCESS_ONCE it
terminates in IDE.

    String path = "C:\\test";

StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();

    TextInputFormat format = new TextInputFormat(new
org.apache.flink.core.fs.Path(path));

    DataStream<String> inputStream = env.readFile(format, path,
FileProcessingMode.PROCESS_ONCE, 100);

Here at stackoverflow
<https://stackoverflow.com/questions/46871106/how-to-read-new-files-arriving-in-folder-using-flink>
.

Re: Monitoring folder in flink

Posted by Sugandha Amatya <su...@gmail.com>.
Hi

I found that flink polls directory based on modified date. In windows when
I copy files the modified date remained same. So, PROCESS_CONTINUOUSLY
resolved the issue.

On Tue, Oct 24, 2017 at 6:09 PM, Fabian Hueske <fh...@gmail.com> wrote:

> Hi,
>
> with PROCESS_CONTINUOUSLY the application monitors the directory and
> processes new arriving files or files that have been modified. In this case
> the application never terminates because it is waiting for new files to
> appear.
> With PROCESS_ONCE, the content of a directory is processed as it was when
> the application was started. Once all files are processed the application
> terminates.
>
> What kind of behavior are you looking for?
>
> Best, Fabian
>
> 2017-10-22 9:37 GMT+02:00 Sugandha Amatya <su...@gmail.com>:
>
>> I have folder where new files arrive at schedule. Why is my flink
>> readfile not reading new files. I have used but *PROCESS_ONCE* and
>> *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same
>> file but the execution does not terminate whereas for PROCESS_ONCE it
>> terminates in IDE.
>>
>>     String path = "C:\\test";
>>
>> StreamExecutionEnvironment env = StreamExecutionEnvironment.get
>> ExecutionEnvironment();
>>
>>     TextInputFormat format = new TextInputFormat(new
>> org.apache.flink.core.fs.Path(path));
>>
>>     DataStream<String> inputStream = env.readFile(format, path,
>> FileProcessingMode.PROCESS_ONCE, 100);
>>
>> Here at stackoverflow
>> <https://stackoverflow.com/questions/46871106/how-to-read-new-files-arriving-in-folder-using-flink>
>> .
>>
>
>

Re: Monitoring folder in flink

Posted by Fabian Hueske <fh...@gmail.com>.
Hi,

with PROCESS_CONTINUOUSLY the application monitors the directory and
processes new arriving files or files that have been modified. In this case
the application never terminates because it is waiting for new files to
appear.
With PROCESS_ONCE, the content of a directory is processed as it was when
the application was started. Once all files are processed the application
terminates.

What kind of behavior are you looking for?

Best, Fabian

2017-10-22 9:37 GMT+02:00 Sugandha Amatya <su...@gmail.com>:

> I have folder where new files arrive at schedule. Why is my flink readfile
> not reading new files. I have used but *PROCESS_ONCE* and
> *PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same
> file but the execution does not terminate whereas for PROCESS_ONCE it
> terminates in IDE.
>
>     String path = "C:\\test";
>
> StreamExecutionEnvironment env = StreamExecutionEnvironment.
> getExecutionEnvironment();
>
>     TextInputFormat format = new TextInputFormat(new
> org.apache.flink.core.fs.Path(path));
>
>     DataStream<String> inputStream = env.readFile(format, path,
> FileProcessingMode.PROCESS_ONCE, 100);
>
> Here at stackoverflow
> <https://stackoverflow.com/questions/46871106/how-to-read-new-files-arriving-in-folder-using-flink>
> .
>

Re: Monitoring folder in flink

Posted by kitex <su...@gmail.com>.
The code I have pasted is all that I have.



--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Monitoring folder in flink

Posted by "☼ R Nair (रविशंकर नायर)" <ra...@gmail.com>.
Can you please share the full code?

Thanks, RAV

On Oct 22, 2017 3:37 AM, "Sugandha Amatya" <su...@gmail.com>
wrote:

I have folder where new files arrive at schedule. Why is my flink readfile
not reading new files. I have used but *PROCESS_ONCE* and
*PROCESS_CONTINUOUSLY*. When I use *PROCESS_CONTINUOUSLY* it reads the same
file but the execution does not terminate whereas for PROCESS_ONCE it
terminates in IDE.

    String path = "C:\\test";

StreamExecutionEnvironment env = StreamExecutionEnvironment.
getExecutionEnvironment();

    TextInputFormat format = new TextInputFormat(new
org.apache.flink.core.fs.Path(path));

    DataStream<String> inputStream = env.readFile(format, path,
FileProcessingMode.PROCESS_ONCE, 100);

Here at stackoverflow
<https://stackoverflow.com/questions/46871106/how-to-read-new-files-arriving-in-folder-using-flink>
.