Posted to user@beam.apache.org by thinkdoom <th...@qq.com> on 2019/12/12 09:32:05 UTC

What's each part's responsibility for the Python SDK with Flink?

1. As I understand it, there are 4 parts, and the data flow / call order is as described below. Is that right?

user python code  ->  beam-runners-flink-1.8-job-server -> SDK harness -> flink cluster.
2. What is each part's responsibility among the 4 parts?

Re: What's each part's responsibility for the Python SDK with Flink?

Posted by Kyle Weaver <kc...@google.com>.
The order is: user python code  ->  job server -> *flink cluster -> SDK
harness*

1. User python code defines the Beam pipeline.
2. The job server executes the Beam pipeline on the Flink cluster. To do
so, it must translate Beam operations into Flink native operations.
3. The Flink cluster executes the transforms specified by Beam, the same as
it would execute any "normal" Flink pipeline.
4. When needed, a Beam-defined Flink transform (running on a Flink task
manager) will invoke the SDK harness to run the python user code.
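Concretely, the wiring of these four parts might look like the following command sketch. Note this is an illustrative assumption, not something from the thread: the Docker image tag, the `my_pipeline.py` file name, the Flink master address, and the exact flag spellings (which have varied across Beam versions) are all hypothetical and should be checked against your Beam release.

```shell
# 1. Start the Flink-1.8 job server, pointing it at an existing Flink
#    cluster (address is an assumption; omit the flag to run an
#    embedded cluster in some Beam versions).
docker run --net=host apache/beam_flink1.8_job_server:latest \
    --flink-master=localhost:8081

# 2. Submit the user Python pipeline to the job server. The job server
#    translates it to Flink operations; the Flink task managers then
#    invoke the SDK harness (here a Docker container) to run the
#    Python user code where needed.
python my_pipeline.py \
    --runner=PortableRunner \
    --job_endpoint=localhost:8099 \
    --environment_type=DOCKER
```

The `--environment_type` flag controls how the SDK harness in step 4 is launched; `DOCKER` starts it as a container on each task manager, while other values (e.g. a process-based or loopback environment) exist for setups where Docker is unavailable.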

Might be a little overly simplistic, but that's the gist. For more info,
there are several public talks on this available online. I think this one
is the latest: https://youtu.be/hxHGLrshnCY?t=1769

On Thu, Dec 12, 2019 at 1:32 AM thinkdoom <th...@qq.com> wrote:

> 1. As I understand it, there are 4 parts, and the data flow / call order
> is as described below. Is that right?
> user python code  ->  beam-runners-flink-1.8-job-server -> SDK harness ->
> flink cluster.
> 2. What is each part's responsibility among the 4 parts?
>
>