Posted to user@flink.apache.org by Congxian Qiu <qc...@gmail.com> on 2020/05/06 10:36:06 UTC

Re: checkpointing opening too many files

Hi

Yes, for your use case, if you do not have a large state size, you can try
using the FsStateBackend.
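
For example, something along these lines (just a minimal sketch; the checkpoint
path below is only a placeholder you would replace with your own directory):

    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class FsStateBackendExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // Checkpoint state as plain files on HDFS; the path is only a placeholder.
            env.setStateBackend(new FsStateBackend("hdfs://namenode:8020/flink/checkpoints"));
            // ... build the job here, then call env.execute("...");
        }
    }
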
Best,
Congxian


ysnakie <ys...@hotmail.com> wrote on Mon, Apr 27, 2020 at 3:42 PM:

> Hi
> If I use the FsStateBackend instead of the RocksDB state backend, will the number of
> open files decrease significantly? I don't have a large state size.
>
> thanks
> On 4/25/2020 13:48, Congxian Qiu <qc...@gmail.com> wrote:
>
> Hi
> If there really are that many files that need to be uploaded to HDFS, then there is
> currently no solution to limit the number of open files. There is an issue [1] that
> aims to fix this problem, along with a PR for it; you could try the attached PR to
> see whether it solves your problem.
>
> [1] https://issues.apache.org/jira/browse/FLINK-11937
> Best,
> Congxian
>
>
> ysnakie <ys...@hotmail.com> wrote on Fri, Apr 24, 2020 at 11:30 PM:
>
>> Hi everyone
>> We have a Flink job that writes files to different HDFS directories. It
>> opens many files due to its high parallelism. I also found that with the RocksDB
>> state backend, even more files are open during checkpointing. We use YARN to
>> schedule the Flink job, but YARN always schedules the TaskManagers onto the same
>> machine and I cannot control it! As a result the DataNode comes under very heavy
>> pressure and keeps throwing a "bad link" error. We have already increased the
>> xcievers limit of HDFS (dfs.datanode.max.transfer.threads) to 16384.
>>
>> Any idea how to solve this problem? Either reduce the number of open files or
>> control the YARN scheduling so the TaskManagers are placed on different machines!
>>
>> Thank you very much!
>> regards
>>
>> Shengnan
>>
>>

Re: checkpointing opening too many files

Posted by David Anderson <da...@alpinegizmo.com>.
With the FsStateBackend you could also try increasing the value
of state.backend.fs.memory-threshold [1]. Only state chunks that are
larger than this value are stored in separate files; smaller chunks go into
the checkpoint metadata file. The default is 1 KB; increasing it
should reduce filesystem stress for jobs with small state.

[1]
https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#state-backend-fs-memory-threshold
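
If you configure the backend per job rather than in flink-conf.yaml, the same
threshold can, if I recall correctly, also be passed to the FsStateBackend
constructor. A minimal sketch; the path and the 100 KB value are only
placeholders to tune for your job:

    import java.net.URI;
    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class MemoryThresholdExample {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
            // State chunks up to 100 KB are inlined into the checkpoint metadata file,
            // so each small chunk no longer becomes its own file on HDFS.
            int fileStateSizeThreshold = 100 * 1024;
            env.setStateBackend(new FsStateBackend(
                    URI.create("hdfs://namenode:8020/flink/checkpoints"),
                    fileStateSizeThreshold));
            // ... build the job here, then call env.execute("...");
        }
    }
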

Best,
David
