Posted to user@flink.apache.org by Jayant Ameta <wi...@gmail.com> on 2017/12/06 08:43:21 UTC

Performance of docker-flink

Hi,
I wanted to explore docker-flink (using Ceph for the state backend) before
opting for a standalone cluster.

Have there been any comparative studies on the performance of docker-flink?
Would the state be consistent and performant if the Docker containers go
down and respawn frequently?

Re: Performance of docker-flink

Posted by Jayant Ameta <wi...@gmail.com>.
Thank you Gary.
I know that theoretically there shouldn't be any performance issue.
I was curious to know whether other users have tried out docker-flink and
whether they have faced or reported any performance hit. I want real-time
processing for some of the events, and was looking for existing users'
experience with docker-flink.


Jayant Ameta

On Thu, Dec 7, 2017 at 4:37 PM, Gary Yao <ga...@data-artisans.com> wrote:

> Hi Jayant,
>
> Running Flink in a Docker container should not in itself have an impact on
> performance. Docker does not employ hardware virtualization. To put it
> simply, Docker containers are processes on the host operating system that
> are isolated from each other using kernel features. See [1] for a more
> in-depth discussion.
>
> Whether the state of your Flink application remains consistent when
> containers get restarted depends on many factors, such as whether you have
> checkpointing and JobManager HA enabled [2][3]. Also, the checkpoint files
> still need to be available for job recovery after container restarts.
>
> If you want to use the Docker images published under
> https://hub.docker.com/_/flink/, you probably want to override the provided
> flink-conf.yaml, by pointing the FLINK_CONF_DIR environment variable at
> your own configuration directory, to enable a fault-tolerant setup.
>
> Best,
> Gary
>
> [1] https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container
> [2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
> [3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html
>

Re: Performance of docker-flink

Posted by Gary Yao <ga...@data-artisans.com>.
Hi Jayant,

Running Flink in a Docker container should not in itself have an impact on
performance. Docker does not employ hardware virtualization. To put it
simply, Docker containers are processes on the host operating system that
are isolated from each other using kernel features. See [1] for a more
in-depth discussion.

Whether the state of your Flink application remains consistent when
containers get restarted depends on many factors, such as whether you have
checkpointing and JobManager HA enabled [2][3]. Also, the checkpoint files
still need to be available for job recovery after container restarts.
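
A minimal sketch of what such a configuration might contain, written as a
small Python helper that renders a flink-conf.yaml fragment. The key names
are assumed to match the Flink 1.3 documentation referenced in [2][3] and
should be checked against the version in use. The function name, the storage
URI (a hypothetical CephFS mount visible to every host), and the ZooKeeper
quorum are placeholders, not values from this thread.

```python
def render_flink_conf(storage_uri, zk_quorum):
    """Return flink-conf.yaml text for a checkpointed, HA-enabled setup.

    Key names are assumed from the Flink 1.3 docs; verify for your version.
    """
    settings = {
        # Keep checkpoint data on storage that outlives any one container.
        "state.backend": "filesystem",
        "state.backend.fs.checkpointdir": storage_uri + "/checkpoints",
        # JobManager high availability coordinated through ZooKeeper.
        "high-availability": "zookeeper",
        "high-availability.zookeeper.quorum": zk_quorum,
        "high-availability.zookeeper.storageDir": storage_uri + "/ha",
    }
    return "".join(f"{key}: {value}\n" for key, value in settings.items())


if __name__ == "__main__":
    # Hypothetical values: a shared CephFS mount and a two-node quorum.
    print(render_flink_conf("file:///mnt/cephfs/flink", "zk-1:2181,zk-2:2181"),
          end="")
```

Both directories point at storage outside the container, so a respawned
container can still find them. Checkpointing itself is switched on in the
job, via enableCheckpointing on the StreamExecutionEnvironment.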

If you want to use the Docker images published under
https://hub.docker.com/_/flink/, you probably want to override the provided
flink-conf.yaml, by pointing the FLINK_CONF_DIR environment variable at
your own configuration directory, to enable a fault-tolerant setup.
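
As a rough illustration, the Python sketch below assembles a docker run
command that mounts a host directory into the container and sets
FLINK_CONF_DIR to it. The host path, the in-container mount point, and the
function name are hypothetical, and the "jobmanager" argument assumes the
image's entrypoint accepts it, so treat this as a starting point rather than
a verified invocation.

```python
import shlex


def docker_run_args(host_conf_dir, container_conf_dir="/flink-conf",
                    image="flink", command="jobmanager"):
    """Build a `docker run` argument list that overrides Flink's config dir."""
    return [
        "docker", "run", "--rm",
        # Mount the host directory holding the custom flink-conf.yaml read-only.
        "-v", f"{host_conf_dir}:{container_conf_dir}:ro",
        # Tell Flink to read its configuration from the mounted directory.
        "-e", f"FLINK_CONF_DIR={container_conf_dir}",
        image, command,
    ]


if __name__ == "__main__":
    # Print a shell-safe command line for inspection; nothing is executed.
    print(shlex.join(docker_run_args("/srv/flink/conf")))
```

The mounted directory would probably also need the logging configuration
files that normally sit next to flink-conf.yaml.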

Best,
Gary

[1] https://stackoverflow.com/questions/21889053/what-is-the-runtime-performance-cost-of-a-docker-container
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/checkpoints.html
[3] https://ci.apache.org/projects/flink/flink-docs-release-1.3/setup/jobmanager_high_availability.html
