Posted to dev@flink.apache.org by Roc Marshal <fl...@126.com> on 2020/05/02 03:18:01 UTC
[DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Hi all.
I would like to propose a new submission mode: build the job directly in the Environment and then submit it in YARN mode. As with RemoteStreamEnvironment, once you specify the parameters of the YARN cluster (host, port), or the YARN configuration directory and HADOOP_USER_NAME, the topology built by the Env can be used to submit the job.
Ideally, this submission method would minimize the transfer of the resources YARN needs to start the flink-jobmanager and taskmanagerrunner, so that Flink can deploy the job on the YARN cluster as quickly as possible.
A simple demo is shown in the picture. The parameter named 'env' contains all of the job's operators, such as sources, maps, etc.
Thank you for your attention.
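Editorial sketch: to illustrate the inputs the proposal asks for, the Java snippet below gathers them into plain key/value maps. It is not an existing Flink API. The class YarnSubmitSettings and its methods are hypothetical, and the example path and user name are made up. Only the configuration key execution.target with the value yarn-per-job and the environment variable names YARN_CONF_DIR and HADOOP_USER_NAME are taken from Flink and Hadoop.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for what a YARN-mode environment would need. Not a Flink class. */
final class YarnSubmitSettings {
    private final String yarnConfDir;
    private final String hadoopUserName;

    YarnSubmitSettings(String yarnConfDir, String hadoopUserName) {
        if (yarnConfDir == null || yarnConfDir.isEmpty()) {
            throw new IllegalArgumentException("yarnConfDir must be set");
        }
        if (hadoopUserName == null || hadoopUserName.isEmpty()) {
            throw new IllegalArgumentException("hadoopUserName must be set");
        }
        this.yarnConfDir = yarnConfDir;
        this.hadoopUserName = hadoopUserName;
    }

    /** Flink configuration entries that select a per-job YARN deployment. */
    Map<String, String> flinkConfig() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("execution.target", "yarn-per-job");
        return conf;
    }

    /** Environment variables the submitting client process would need. */
    Map<String, String> clientEnvironment() {
        Map<String, String> env = new LinkedHashMap<>();
        env.put("YARN_CONF_DIR", yarnConfDir);
        env.put("HADOOP_USER_NAME", hadoopUserName);
        return env;
    }

    public static void main(String[] args) {
        // Example values only; adjust to the actual cluster.
        YarnSubmitSettings settings = new YarnSubmitSettings("/etc/hadoop/conf", "flink");
        System.out.println(settings.flinkConfig());
        System.out.println(settings.clientEnvironment());
    }
}
```

The sketch only names the inputs; how an Environment would turn them into an actual YARN deployment is exactly what the proposal leaves open.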
Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Yang Wang <da...@gmail.com>.
Hi Roc Marshal,
I have a question about making the Yarn deployment ASAP. In my opinion,
using the "ExecutionEnvironment" instead of "flink run -m yarn-cluster"
to deploy a Flink cluster on Yarn does not help to reduce the time cost,
since we still need to ship the user jars and Flink libs to the HDFS
staging directory and register them as Yarn local resources.
If we want to achieve this, we need to use pre-uploaded libs to avoid
the unnecessary uploading and downloading. We already have a ticket for
this [1].
[1] https://issues.apache.org/jira/browse/FLINK-13938
Best,
Yang
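Editorial sketch: to make the cost argument above concrete, here is a toy model of the shipping step in Java. It is not Flink's deployment code. ShippingPlan, toUpload, and the file names are invented for illustration. It only shows that files already available as pre-uploaded libs drop out of the upload list, while the user jar still has to be shipped regardless of how the job is submitted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of deciding which local files must be uploaded to the staging directory. */
final class ShippingPlan {

    /** Returns the local files that are not already covered by pre-uploaded libs. */
    static List<String> toUpload(List<String> localFiles, Set<String> preUploaded) {
        List<String> result = new ArrayList<>();
        for (String file : localFiles) {
            if (!preUploaded.contains(file)) {
                result.add(file);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> local = Arrays.asList("flink-dist.jar", "log4j.jar", "user-job.jar");
        Set<String> provided = new HashSet<String>(Arrays.asList("flink-dist.jar", "log4j.jar"));

        // Without pre-uploaded libs, every file is shipped on each submission.
        System.out.println(toUpload(local, new HashSet<String>()));
        // With pre-uploaded libs, only the user jar remains.
        System.out.println(toUpload(local, provided));
    }
}
```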
Roc Marshal <fl...@126.com> wrote on Tue, May 5, 2020 at 7:26 PM:
> Sorry, I confused JIRA with email.
> The Attachment link:
> https://gitee.com/RocMarshal/resources4link/blob/master/README.md
> The JIRA ID: FLINK-17472
> The JIRA link:
> https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-17472
>
> Best,
> Roc
>
> Roc Marshal
> Email: flinker@126.com
>
Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Roc Marshal <fl...@126.com>.
Sorry, I confused JIRA with email.
The Attachment link: https://gitee.com/RocMarshal/resources4link/blob/master/README.md
The JIRA ID: FLINK-17472
The JIRA link: https://issues.apache.org/jira/plugins/servlet/mobile#issue/FLINK-17472
Best,
Roc
Roc Marshal
Email: flinker@126.com
On 05/05/2020 18:43, Aljoscha Krettek wrote:
Could you post the Jira issue here? I don't see it mentioned in this
thread so far.
Best,
Aljoscha
Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Aljoscha Krettek <al...@apache.org>.
Could you post the Jira issue here? I don't see it mentioned in this
thread so far.
Best,
Aljoscha
On 05.05.20 12:32, Roc Marshal wrote:
> Hi Aljoscha,
> I have updated the JIRA according to your suggestion. Thank you very much.
> Best,
> Roc
Re: Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Roc Marshal <fl...@126.com>.
Hi Aljoscha,
I have updated the JIRA according to your suggestion. Thank you very much.
Best,
Roc
At 2020-05-05 16:04:01, "Aljoscha Krettek" <al...@apache.org> wrote:
>Hi,
>
>image attachments don't work on this ML. You will have to upload the
>image somewhere and post a link.
>
>Best,
>Aljoscha
Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Aljoscha Krettek <al...@apache.org>.
Hi,
image attachments don't work on this ML. You will have to upload the
image somewhere and post a link.
Best,
Aljoscha
On 02.05.20 09:16, Jeff Zhang wrote:
> Hi Roc,
>
> You can try Flink on Zeppelin, where you can submit a Flink job to YARN
> directly without starting a Flink cluster yourself. Here are a few
> tutorials.
>
> 1) Get started: https://link.medium.com/oppqD6dIg5 <https://t.co/PTouUYYTrv?amp=1>
> 2) Batch: https://link.medium.com/3qumbwRIg5 <https://t.co/Yo9QAY0Joj?amp=1>
> 3) Streaming: https://link.medium.com/RBHa2lTIg5 <https://t.co/sUapN40tvI?amp=1>
> 4) Advanced usage: https://link.medium.com/CAekyoXIg5 <https://t.co/MXolULmafZ?amp=1>
Re: [DISCUSS] Introduce The Batch/Stream ExecutionEnvironment in Yarn mode
Posted by Jeff Zhang <zj...@gmail.com>.
Hi Roc,
You can try Flink on Zeppelin, where you can submit a Flink job to YARN
directly without starting a Flink cluster yourself. Here are a few
tutorials.
1) Get started: https://link.medium.com/oppqD6dIg5 <https://t.co/PTouUYYTrv?amp=1>
2) Batch: https://link.medium.com/3qumbwRIg5 <https://t.co/Yo9QAY0Joj?amp=1>
3) Streaming: https://link.medium.com/RBHa2lTIg5 <https://t.co/sUapN40tvI?amp=1>
4) Advanced usage: https://link.medium.com/CAekyoXIg5 <https://t.co/MXolULmafZ?amp=1>
Roc Marshal <fl...@126.com> wrote on Sat, May 2, 2020 at 11:18 AM:
> Hi all.
> I would like to propose a new submission mode: build the job directly in
> the Environment and then submit it in YARN mode. As with
> RemoteStreamEnvironment, once you specify the parameters of the YARN
> cluster (host, port), or the YARN configuration directory and
> HADOOP_USER_NAME, the topology built by the Env can be used to submit the job.
> Ideally, this submission method would minimize the transfer of the
> resources YARN needs to start the flink-jobmanager and taskmanagerrunner,
> so that Flink can deploy the job on the YARN cluster as quickly as
> possible.
> A simple demo is shown in the picture. The parameter named 'env'
> contains all of the job's operators, such as sources, maps, etc.
>
> Thank you for your attention.
--
Best Regards
Jeff Zhang