Posted to user@flink.apache.org by Gregory Fee <gf...@lyft.com> on 2018/03/08 23:47:44 UTC

UUIDs generated by Flink SQL

Hello, from what I understand in the documentation it appears there is no
way to assign UUIDs to operators added to the DAG by Flink SQL. Is my
understanding correct?

I'd very much like to be able to assign UUIDs to those operators. I want to
run a program that uses some Flink SQL, create a savepoint, and then run
another program with a slightly different structure that picks up from that
savepoint. The documentation suggests assigning UUIDs to make something like
that work, but that doesn't seem possible if I'm using Flink SQL. Any advice?

On a related note, I'm wondering what happens if I have a stateful program
using Flink SQL and I want to update my Flink binaries. If the query plan
ends up changing because of that upgrade, does that mean loading the
savepoint is going to fail?

Thanks!

-- 
*Gregory Fee*
Engineer
425.830.4734 <+14258304734>

Re: UUIDs generated by Flink SQL

Posted by Fabian Hueske <fh...@gmail.com>.

Hi Gregory,

Your understanding is correct. It is not possible to assign UUIDs to the
operators generated by the SQL/Table planner.
To be honest, I am not sure whether the use case that you are describing
should be within the scope of the "officially" supported use cases of the API.
It would require in-depth knowledge of the SQL operators' internals, which
is something that we don't want to expose as public API because we want to
have the freedom to improve the execution code.
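
To illustrate why explicit IDs matter here, below is a toy model in plain Python. It is not Flink code and not Flink's actual ID-generation algorithm: `auto_ids`, `restorable`, and the operator names are made up for this sketch. It only assumes what the savepoint documentation states, namely that IDs which are not set explicitly (via `uid(...)` in the DataStream API) are derived from the structure of the program and therefore change when the structure changes.

```python
import hashlib


def auto_ids(operators):
    """Toy stand-in for structure-based IDs: each ID depends on the
    operator's name, its position, and the ID of its upstream operator."""
    ids = {}
    prev = ""
    for position, name in enumerate(operators):
        digest = hashlib.sha1(f"{prev}|{position}|{name}".encode()).hexdigest()[:8]
        ids[name] = digest
        prev = digest
    return ids


def restorable(savepoint_ids, new_ids):
    """Names of operators in the new job whose ID also exists in the savepoint."""
    saved = set(savepoint_ids.values())
    return sorted(name for name, op_id in new_ids.items() if op_id in saved)


# Hypothetical plans: the new one has an extra operator in the middle.
old_plan = ["source", "filter", "aggregate", "sink"]
new_plan = ["source", "filter", "project", "aggregate", "sink"]

# Auto-generated IDs: everything downstream of the change gets a new ID.
auto_matched = restorable(auto_ids(old_plan), auto_ids(new_plan))

# Explicit IDs: each operator keeps the ID assigned by the user.
explicit_old = {name: f"uid-{name}" for name in old_plan}
explicit_new = {name: f"uid-{name}" for name in new_plan}
explicit_matched = restorable(explicit_old, explicit_new)

print("auto:", auto_matched)
print("explicit:", explicit_matched)
```

In this model the aggregate operator, which is the one most likely to hold state, can no longer be matched after the change unless its ID was set explicitly. With SQL, the planner creates the operators, so there is no place for the user to set such an ID.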

Having said that, we have thought about adding the possibility of adjusting
the parallelism of operators.
Similar to assigning UUIDs, this would require an intermediate step between
planning and submission because you usually don't know in advance which plan
will be generated.
This could be done by generating a representation of a plan that can be
modified before translating it into a DataStream program.
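
A minimal sketch of what such an intermediate step might look like, written as plain Python rather than against any Flink API. Nothing here exists in Flink: `PlanNode`, `plan_query`, `customize`, and `translate` are hypothetical names, and the fixed three-node plan is an assumption standing in for whatever the planner would actually produce.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PlanNode:
    """One operator in the (hypothetical) editable plan representation."""
    name: str
    parallelism: int = 1
    uid: Optional[str] = None


def plan_query(sql: str) -> List[PlanNode]:
    # Stand-in for the planner. A real planner chooses the nodes itself,
    # which is why the user cannot know them before planning.
    return [PlanNode("scan"), PlanNode("group_aggregate"), PlanNode("sink")]


def customize(plan: List[PlanNode], overrides: Dict[str, dict]) -> List[PlanNode]:
    # Apply per-node overrides (matched by node name) and return a new plan.
    return [replace(node, **overrides.get(node.name, {})) for node in plan]


def translate(plan: List[PlanNode]) -> List[str]:
    # Stand-in for translating the plan into an executable program.
    return [f"{n.name}[uid={n.uid}, parallelism={n.parallelism}]" for n in plan]


# Step 1: plan. Step 2: inspect and modify. Step 3: translate.
plan = plan_query("SELECT user_id, COUNT(*) FROM clicks GROUP BY user_id")
tuned = customize(plan, {"group_aggregate": {"parallelism": 4, "uid": "agg-v1"}})
job = translate(tuned)

for line in job:
    print(line)
```

The point of the sketch is the ordering: the overrides can only be written after the plan has been generated and inspected, and they stay valid only as long as the planner keeps producing nodes they can be matched against.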

Right now, we don't aim to guarantee backwards compatibility for queries.
Starting a query from a savepoint works if you change neither the query nor
the flink-table version, but it might fail as soon as you change either one.
If you start the same query with a different flink-table version, different
optimization rules or changes in the operators might result in different
states.
If you start a different query, the data types of the state of operators
will most likely have changed.
Coming up with an upgrade strategy for SQL queries is still a major TODO,
and there are several ideas for how this can be achieved.

Best, Fabian

