You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Chang Liu <fl...@gmail.com> on 2018/11/21 13:59:38 UTC

Assign IDs to Operators

Dear All,

As stated here (https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html <https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html>), it is highly recommended to assign IDs to Operators, especially for the stateful ones.

My question is: what is the gradually of a so-called Operator.

To be more specific, in the following example, we have the Operators like, addSource and map. I am wondering is shuffle and print also a some kind of Operator?

DataStream<String> stream = env.
  // Stateful source (e.g. Kafka) with ID
  .addSource(new StatefulSource())
  .uid("source-id") // ID for the source operator
  .shuffle()
  // Stateful mapper with ID
  .map(new StatefulMapper())
  .uid("mapper-id") // ID for the mapper
  // Stateless printing sink
  .print(); // Auto-generated ID

Or, in the following example, how many Operator we have (that we can assign IDs to)? 3? KeyBy, window and aggregate?


input
    .keyBy(<key selector>)
    .window(<window assigner>)
    .aggregate(new AverageAggregate)

Then, how many Operators (and which are they) do we have in the following example?

stream.join(otherStream)
    .where(<KeySelector>)
    .equalTo(<KeySelector>)
    .window(<WindowAssigner>)
    .apply(<JoinFunction>)

Many Thanks.

Best regards/祝好,

Chang Liu 刘畅



Re: Assign IDs to Operators

Posted by Jamie Grier <jg...@lyft.com>.
Hi Chang,

The partitioning steps, like keyBy() are not operators.  In general you can
let Flink's fluent-style API tell you the answer.  If you can call .uid()
in the API and it compiles then the thing just before that is an operator ;)

-Jamie


On Wed, Nov 21, 2018 at 5:59 AM Chang Liu <fl...@gmail.com> wrote:

> Dear All,
>
> As stated here (
> https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/savepoints.html),
> it is highly recommended to assign IDs to Operators, especially for the
> stateful ones.
>
> My question is: what is the gradually of a so-called Operator.
>
> To be more specific, in the following example, we have the Operators like,
> addSource and map. I am wondering is shuffle and print also a some kind of
> Operator?
>
> DataStream<String> stream = env.
>   // Stateful source (e.g. Kafka) with ID
>   .addSource(new StatefulSource())
>   .uid("source-id") // ID for the source operator
>   .shuffle()
>   // Stateful mapper with ID
>   .map(new StatefulMapper())
>   .uid("mapper-id") // ID for the mapper
>   // Stateless printing sink
>   .print(); // Auto-generated ID
>
>
> Or, in the following example, how many Operator we have (that we can
> assign IDs to)? 3? KeyBy, window and aggregate?
>
>
> input
>     .keyBy(<key selector>)
>     .window(<window assigner>)
>     .aggregate(new AverageAggregate)
>
>
> Then, how many Operators (and which are they) do we have in the following
> example?
>
> stream.join(otherStream)
>     .where(<KeySelector>)
>     .equalTo(<KeySelector>)
>     .window(<WindowAssigner>)
>     .apply(<JoinFunction>)
>
>
> Many Thanks.
>
> Best regards/祝好,
>
> Chang Liu 刘畅
>
>
>