Posted to user@flink.apache.org by Manas Kale <ma...@gmail.com> on 2021/02/16 11:59:32 UTC

Flink docker in session cluster mode - is a local distribution needed?

Hi,

I have a project consisting of 6 jobs, of which 4 are written in Java
and 2 in PyFlink. I want to dockerize these so that all 6 can be run in
a single Flink session cluster.

I have been able to successfully set up the JobManager and TaskManager
containers as described in [1], after creating a custom Docker image that includes Python.
For the last step, the guide asks us to submit the job using a local
distribution of Flink:

$ ./bin/flink run ./examples/streaming/TopSpeedWindowing.jar
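For context, the custom Python-enabled image mentioned above can be built along these lines (the base tag and the apache-flink version are assumptions and should match the cluster version):

```dockerfile
# Build a Flink image that can also run PyFlink jobs.
# The base tag and apache-flink version below are assumptions;
# keep them in sync with each other.
FROM flink:1.12.1
RUN apt-get update -y && \
    apt-get install -y python3 python3-pip && \
    ln -s /usr/bin/python3 /usr/bin/python && \
    rm -rf /var/lib/apt/lists/*
RUN pip3 install apache-flink==1.12.1
```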

I am probably missing something here because I have the following questions:
- Why do I need to use a local distribution to submit a job?
- Why can't I use the Flink distribution that already exists within the images?
- How do I submit a job using the Docker image's distribution?


[1]
https://ci.apache.org/projects/flink/flink-docs-stable/deployment/resource-providers/standalone/docker.html#starting-a-session-cluster-on-docker
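
For reference, the session-cluster setup from [1] can also be expressed as a docker-compose file along these lines; the image name `my-flink-python` is a placeholder for the custom Python-enabled image mentioned above, and the slot count is an assumption:

```yaml
version: "2.2"
services:
  jobmanager:
    image: my-flink-python:latest   # placeholder for the custom image
    ports:
      - "8081:8081"                 # Flink web UI
    command: jobmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager

  taskmanager:
    image: my-flink-python:latest   # placeholder for the custom image
    depends_on:
      - jobmanager
    command: taskmanager
    environment:
      - |
        FLINK_PROPERTIES=
        jobmanager.rpc.address: jobmanager
        taskmanager.numberOfTaskSlots: 2
```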

Re: Flink docker in session cluster mode - is a local distribution needed?

Posted by Yang Wang <da...@gmail.com>.
I am not aware of a simple way to use the Flink runtime jars within the
docker images other than "docker run/exec".
So if we want to provide some easy commands to submit Flink jobs, I think
they would also have to be wrappers around "docker run/exec".
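
Such a wrapper might look like the following sketch; the container name "jobmanager" and the /opt/flink install path are assumptions based on the official Flink images:

```shell
# Hypothetical wrapper around "docker exec": submit a job jar using the
# Flink distribution that already exists inside the JobManager
# container, so the host needs no local Flink install. Container name
# "jobmanager" and the /opt/flink path are assumptions.
submit_to_session_cluster() {
  jar=$1
  container=${2:-jobmanager}
  # Build the command string first so it can be printed in dry-run mode.
  cmd="docker exec $container /opt/flink/bin/flink run $jar"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}
```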

Best,
Yang


Re: Flink docker in session cluster mode - is a local distribution needed?

Posted by Till Rohrmann <tr...@apache.org>.
Yes, agreed. This could be streamlined better. If you want to help with
this, feel free to open a JIRA issue for it.

Cheers,
Till


Re: Flink docker in session cluster mode - is a local distribution needed?

Posted by Manas Kale <ma...@gmail.com>.
Hi Till,
Oh I see... I managed to do what you said using a bunch of docker exec
commands. However, this solution feels quite hacky; it could be improved
by providing a simple command that submits jobs using the Flink runtime
inside the docker images. That would achieve full containerization: the
host system would not need the Flink runtime at all, since everything
lives inside the Docker images.

Thanks a lot!


Re: Flink docker in session cluster mode - is a local distribution needed?

Posted by Till Rohrmann <tr...@apache.org>.
Hi Manas,

I think the documentation assumes that you first start a session cluster
and then submit jobs from outside the Docker images. If your jobs are
included in the Docker image, then you could log into the master process
and start the jobs from within the Docker image.
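
Concretely, that suggestion could look like the sketch below, run from the host. The container name "jobmanager", the file paths, and the jar/script names are assumptions based on the official Flink images; DRY_RUN defaults to 1 here so the sketch only prints the commands.

```shell
# Helper that prints each command in dry-run mode (the default here)
# instead of executing it, so the sketch is safe to run anywhere.
run_cmd() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "$*"; else "$@"; fi
}

# Copy the job artifact into the running JobManager container:
run_cmd docker cp target/my-job.jar jobmanager:/opt/flink/usrlib/my-job.jar
# Submit it using the image's own Flink distribution:
run_cmd docker exec jobmanager /opt/flink/bin/flink run /opt/flink/usrlib/my-job.jar
# A PyFlink job can be submitted the same way with "flink run -py":
run_cmd docker exec jobmanager /opt/flink/bin/flink run -py /opt/flink/usrlib/my_job.py
```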

Cheers,
Till
