Posted to dev@flink.apache.org by Paul Lam <pa...@gmail.com> on 2022/05/04 08:42:57 UTC

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Hi Shengkai,

Thanks a lot for your input!

> I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?

I think it's a valid approach. I'm adding it to the FLIP.
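
For illustration only, the output might gain a Web UI column like the sketch below. The column names and layout here are assumptions for discussion, not the final FLIP design:

```
Flink SQL> SHOW QUERIES;
+----------------------------------+-------------+---------+-----------------------+
|             query_id             | query_name  | status  |        web_ui         |
+----------------------------------+-------------+---------+-----------------------+
| 0f6413c33757fbe0277897dd94485f04 | insert-sink | RUNNING | http://localhost:8081 |
+----------------------------------+-------------+---------+-----------------------+
```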

> After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.

In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
while the former shows the active running queries and the latter shows the
background tasks like schema changes. FYI.

WRT the questions:

> 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?

IMHO, the differences between `execution.target` values are mostly about
cluster startup, which has little relation to the proposed statements.
These statements rely on the current ClusterClient/JobClient API,
which is deployment mode agnostic. Canceling a job in an application
cluster is the same as in a session cluster.

BTW, application mode is still under development ATM [3].

> 2. Considering the SQL Client/Gateway is not limited to submitting the job
> to the specified cluster, is it able to list jobs in the other clusters?

I think multi-cluster support in SQL Client/Gateway should be aligned with
CLI, at least in the early phase. We may use SET to set a cluster ID for a
session, which gives us access to that cluster. However, every SHOW
statement would involve only one cluster.
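
To make the idea concrete, a session scoped to a single cluster could look like the sketch below. The option name `execution.cluster-id` is purely illustrative and not part of the FLIP:

```
-- Hypothetical: bind the session to one cluster
-- ('execution.cluster-id' is an illustrative option name, not a real one).
SET 'execution.cluster-id' = 'session-cluster-1';

-- SHOW statements would then list jobs only in the cluster selected above.
SHOW QUERIES;
```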

Best,
Paul Lam

[1] https://www.cockroachlabs.com/docs/stable/show-statements.html
[2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
[3] https://issues.apache.org/jira/browse/FLINK-26541

Shengkai Fang <fs...@gmail.com> wrote on Fri, Apr 29, 2022 at 15:36:

> Hi.
>
> Thanks for Paul's update.
>
> > It's better we can also get the infos about the cluster where the job is
> > running through the DESCRIBE statement.
>
> I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?
>
>
> > QUERY or other keywords.
>
> I list the statement to manage the lifecycle of the query/dml in other
> systems:
>
> MySQL [1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
> to kill the query.
>
> ```
> mysql> SHOW PROCESSLIST;
>
> mysql> KILL 27;
> ```
>
>
> Postgres uses the following statements to kill queries [3]:
>
> ```
> SELECT pg_cancel_backend(<pid of the process>)
>
> SELECT pg_terminate_backend(<pid of the process>)
> ```
>
> KSQL uses the following commands to control the query lifecycle [4][5]:
>
> ```
> SHOW QUERIES;
>
> TERMINATE <query id>;
>
> ```
>
> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
> [3]
>
> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
> [4]
>
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
> [5]
>
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
>
> After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.
>
> We also have two questions here.
>
> 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?
>
> 2. Considering the SQL Client/Gateway is not limited to submitting the job
> to the specified cluster, is it able to list jobs in the other clusters?
>
>
> Best,
> Shengkai
>
> Paul Lam <pa...@gmail.com> wrote on Thu, Apr 28, 2022 at 17:17:
>
> > Hi Martijn,
> >
> > Thanks a lot for your reply! I agree that the scope may be a bit
> confusing,
> > please let me clarify.
> >
> > The FLIP aims to add new SQL statements that are supported only in
> > sql-client, similar to
> > jar statements [1]. Jar statements can be parsed into jar operations,
> which
> > are used only in
> > CliClient in sql-client module and cannot be executed by TableEnvironment
> > (not available in
> > Table API program that contains SQL that you mentioned).
> >
> > WRT the unchanged CLI client, I mean CliClient instead of the sql-client
> > module, which
> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
> > extends
> > the gateway part, and barely touches CliClient and REST server (REST
> > endpoint in FLIP-91).
> >
> > WRT the syntax, I don't have much experience with SQL standards, and I'd
> > like to hear
> > more opinions from the community. I prefer Hive-style syntax because I
> > think many users
> > are familiar with Hive, and there're on-going efforts to improve
> Flink-Hive
> > integration [2][3].
> > But my preference is not strong, I'm okay with other options too. Do you
> > think JOB/Task is
> > a good choice, or do you have other preferred keywords?
> >
> > [1]
> >
> >
> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/
> > [2]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
> > [3]
> >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint
> >
> > Best,
> > Paul Lam
> >
> > Martijn Visser <ma...@apache.org> wrote on Tue, Apr 26, 2022 at 20:14:
> >
> > > Hi Paul,
> > >
> > > Thanks for creating the FLIP and opening the discussion. I did get a
> bit
> > > confused about the title, being "query lifecycle statements in SQL
> > client".
> > > This sounds like you want to adopt the SQL client, but you want to
> expand
> > > the SQL syntax with lifecycle statements, which could be used from the
> > SQL
> > > client, but of course also in a Table API program that contains SQL.
> > Given
> > > that you're highlighting the CLI client as unchanged, this adds to more
> > > confusion.
> > >
> > > I am interested if there's anything listed in the SQL 2016 standard on
> > > these types of lifecycle statements. I did a quick scan for "SHOW
> > QUERIES"
> > > but couldn't find it. It would be great if we could stay as close as
> > > possible to such syntax. Overall I'm not in favour of using QUERIES as
> a
> > > keyword. I think Flink applications are not queries, but short- or
> > > long-running applications. Why should we follow Hive's setup and indeed not
> > > others such as Snowflake, but also Postgres or MySQL?
> > >
> > > Best regards,
> > >
> > > Martijn Visser
> > > https://twitter.com/MartijnVisser82
> > > https://github.com/MartijnVisser
> > >
> > >
> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <pa...@gmail.com> wrote:
> > >
> > > > Hi Shengkai,
> > > >
> > > > Thanks a lot for your opinions!
> > > >
> > > > > 1. I think the keyword QUERY may confuse users because the
> statement
> > > also
> > > > > works for the DML statement.
> > > >
> > > > I slightly lean to QUERY, because:
> > > >
> > > > Hive calls DMLs queries. We could be better aligned with Hive using
> > > QUERY,
> > > > especially given that we plan to introduce Hive endpoint.
> > > > QUERY is a more SQL-like concept and friendly to SQL users.
> > > >
> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB, but
> > not
> > > > very good with TASK, as it conflicts with the task concept in Flink
> > > runtime.
> > > >
> > > > We could wait for more feedback from the community.
> > > >
> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> > > terminate
> > > > > their jobs.
> > > >
> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
> > > > might be an alternative.
> > > >
> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
> > > >
> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax on
> > the
> > > > FLIP.
> > > >
> > > > 4. SHOW TASKS can just list the job id and use the DESCRIBE to get
> > more
> > > > > detailed job infos.
> > > >
> > > > That is a more SQL-like approach I think. But considering the
> > > > ClusterClient APIs, we can fetch the names and the status along in
> one
> > > > request,
> > > > thus it may be more user friendly to return them all in the SHOW
> > > > statement?
> > > >
> > > > > It's better we can also get the infos about the cluster where the
> job
> > > is
> > > > > running on through the DESCRIBE statement.
> > > >
> > > > I think cluster info could be part of session properties instead.
> WDYT?
> > > >
> > > > Best,
> > > > Paul Lam
> > > >
> > > > > On Apr 22, 2022 at 11:14, Shengkai Fang <fs...@gmail.com> wrote:
> > > > >
> > > > > Hi Paul
> > > > >
> > > > > Sorry for the late response. I propose my thoughts here.
> > > > >
> > > > > 1. I think the keyword QUERY may confuse users because the
> statement
> > > also
> > > > > works for the DML statement. I find the Snowflakes[1] supports
> > > > >
> > > > > - CREATE TASK
> > > > > - DROP TASK
> > > > > - ALTER TASK
> > > > > - SHOW TASKS
> > > > > - DESCRIBE TASK
> > > > >
> > > > > I think we can follow snowflake to use `TASK` as the keyword or use
> > the
> > > > > keyword `JOB`?
> > > > >
> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> > > terminate
> > > > > their jobs.
> > > > >
> > > > > ```
> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcibly stop the job with drain
> > > > >
> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
> > > > > ```
> > > > >
> > > > > Oracle [2] uses PURGE to clean up the table so that users cannot recover it.
> > > > > I think it also works for us to terminate the job permanently.
> > > > >
> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like. Users
> > can
> > > > use
> > > > > the
> > > > >
> > > > > ```
> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
> > > > >  SET 'state.savepoints.format' = 'native';
> > > > >  CREATE SAVEPOINT <job id>;
> > > > >
> > > > >  DROP SAVEPOINT <path_to_savepoint>;
> > > > > ```
> > > > >
> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIBE to get
> > more
> > > > > detailed job infos.
> > > > >
> > > > > ```
> > > > >
> > > > > SHOW TASKS;
> > > > >
> > > > >
> > > > > +----------------------------------+
> > > > > |            job_id                |
> > > > > +----------------------------------+
> > > > > | 0f6413c33757fbe0277897dd94485f04 |
> > > > > +----------------------------------+
> > > > >
> > > > > DESCRIBE TASK <job id>;
> > > > >
> > > > > +------------------------
> > > > > |  job name   | status  |
> > > > > +------------------------
> > > > > | insert-sink | running |
> > > > > +------------------------
> > > > >
> > > > > ```
> > > > > It's better we can also get the infos about the cluster where the
> job
> > > is
> > > > > running on through the DESCRIBE statement.
> > > > >
> > > > >
> > > > > [1]
> > > > > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
> > > > > [2]
> > > > > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
> > > > >
> > > > > Paul Lam <paullin3280@gmail.com> wrote on Thu, Apr 21, 2022 at 10:36:
> > > > >
> > > > >> ping @Timo @Jark @Shengkai
> > > > >>
> > > > >> Best,
> > > > >> Paul Lam
> > > > >>
> > > > >>> On Apr 18, 2022 at 17:12, Paul Lam <pa...@gmail.com> wrote:
> > > > >>>
> > > > >>> Hi team,
> > > > >>>
> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds
> query
> > > > >> lifecycle
> > > > >>> statements to SQL client.
> > > > >>>
> > > > >>> Currently, SQL client supports submitting queries (queries in a
> > broad
> > > > >> sense,
> > > > >>> including DQLs and DMLs) but no further lifecycle statements,
> like
> > > > >> canceling
> > > > >>> a query or triggering a savepoint. That makes SQL users have to
> > rely
> > > on
> > > > >>> CLI or REST API to manage their queries.
> > > > >>>
> > > > >>> Thus, I propose to introduce the following statements to fill the
> > > gap.
> > > > >>> SHOW QUERIES
> > > > >>> STOP QUERY <query_id>
> > > > >>> CANCEL QUERY <query_id>
> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
> > > > >>> These statements would align SQL client with CLI, providing the
> full
> > > > >> lifecycle
> > > > >>> management for queries/jobs.
> > > > >>>
> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
> > > > >>> (For reference, the previous discussion thread see [2].)
> > > > >>>
> > > > >>> [1]
> > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
> > > > >>>
> > > > >>> [2]
> > > > >>> https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
> > > > >>>
> > > > >>> Best,
> > > > >>> Paul Lam
> > > >
> > > >
> > >
> >
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi team,

I think we’ve reached consensus on the FLIP, thus I’m starting a vote thread.
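
For the record, the statements the thread converged on look roughly like the sketch below; the FLIP page holds the authoritative syntax:

```
SHOW JOBS;

STOP JOB '<job_id>' [WITH SAVEPOINT] [WITH DRAIN];

CREATE SAVEPOINT FOR JOB '<job_id>';

SHOW SAVEPOINTS FOR JOB '<job_id>';

DROP SAVEPOINT '<savepoint_path>';
```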

Thank you all for your advice in the discussion!

Best,
Paul Lam

> On Jun 10, 2022 at 02:26, Jing Ge <ji...@ververica.com> wrote:
> 
> Hi Paul,
> 
> Fired a ticket: https://issues.apache.org/jira/browse/FLINK-27977 for savepoints housekeeping.
> 
> Best regards,
> Jing
> 
> On Thu, Jun 9, 2022 at 10:37 AM Martijn Visser <martijnvisser@apache.org> wrote:
> Hi Paul,
> 
> That's a fair point, but I still think we should not offer that capability via the CLI either. But that's a different discussion :)
> 
> Thanks,
> 
> Martijn
> 
> On Thu, Jun 9, 2022 at 10:08, Paul Lam <paullin3280@gmail.com> wrote:
> Hi Martijn,
> 
> I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM mode, since the statement is triggered by users instead of Flink runtime.
> 
> We’re simply providing a tool for users to clean up the savepoints, just like `bin/flink savepoint -d :savepointPath` in Flink CLI [1].
> 
> [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
> 
> Best,
> Paul Lam
> 
>> On Jun 9, 2022 at 15:41, Martijn Visser <martijnvisser@apache.org> wrote:
>> 
>> Hi all,
>> 
>> I would not include a DROP SAVEPOINT syntax. With the recently introduced CLAIM/NO CLAIM mode, I would argue that we've just clarified snapshot ownership and if you have a savepoint established "with NO_CLAIM it creates its own copy and leaves the existing one up to the user." [1] We shouldn't then again make it fuzzy by making it possible that Flink can remove snapshots. 
>> 
>> Best regards,
>> 
>> Martijn
>> 
>> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
>> On Thu, Jun 9, 2022 at 09:27, Paul Lam <paullin3280@gmail.com> wrote:
>> Hi team,
>> 
>> It's great to see our opinions are finally converging!
>> 
>>> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>> 
>> 
>> LGTM. Adding it to the FLIP.
>> 
>> To Jark,
>> 
>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>”
>> 
>> Good point. The default savepoint dir should be enough for most cases.
>> 
>> To Jing,
>> 
>>> DROP SAVEPOINT ALL
>> 
>> I think it’s valid to have such a statement, but I have two concerns:
>> `ALL` is already an SQL keyword, thus it may cause ambiguity.
>> Flink CLI and REST API don’t provide the corresponding functionalities, and we’d better keep them aligned.
>> How about making this statement a follow-up task, which would also touch the REST API and Flink CLI?
>> 
>> Best,
>> Paul Lam
>> 
>>> On Jun 9, 2022 at 11:53, godfrey he <godfreyhe@gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> Regarding `PIPELINE`, it comes from flink-core module, see
>>> `PipelineOptions` class for more details.
>>> `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with `JOBS`.
>>> 
>>> +1 to discuss JOBTREE in other FLIP.
>>> 
>>> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>>> 
>>> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
>>> 
>>> Best,
>>> Godfrey
>>> 
>>>> Jing Ge <jing@ververica.com> wrote on Thu, Jun 9, 2022 at 01:48:
>>>> 
>>>> Hi Paul, Hi Jark,
>>>> 
>>>> Re JOBTREE, agree that it is out of the scope of this FLIP
>>>> 
>>>> Re `RELEASE SAVEPOINT ALL`: if the community prefers DROP, then 'DROP SAVEPOINT ALL' for housekeeping. WDYT?
>>>> 
>>>> Best regards,
>>>> Jing
>>>> 
>>>> 
>>>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imjark@gmail.com> wrote:
>>>>> 
>>>>> Hi Jing,
>>>>> 
>>>>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the scope
>>>>> of this FLIP and can be discussed in another FLIP.
>>>>> 
>>>>> Job lineage is a big topic that may involve many problems:
>>>>> 1) how to collect and report job entities, attributes, and lineages?
>>>>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>>>>> 3) how does Flink SQL CLI/Gateway know the lineage information and show jobtree?
>>>>> 4) ...
>>>>> 
>>>>> Best,
>>>>> Jark
>>>>> 
>>>>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imjark@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> I'm fine with using JOBS. The only concern is that this may conflict with displaying more detailed
>>>>>> information for queries (e.g. query content, plan) in the future, e.g. SHOW QUERIES EXTENDED in ksqldb [1].
>>>>>> This is not a big problem as we can introduce SHOW QUERIES in the future if necessary.
>>>>>> 
>>>>>>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>>>>>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
>>>>>> It might be trivial and error-prone to set configuration before executing a statement,
>>>>>> and the configuration will affect all statements after that.
>>>>>> 
>>>>>>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>>>>>> and always use configuration "state.savepoints.dir" as the default savepoint dir.
>>>>>> The concern with using "<savepoint_path>" is here should be savepoint dir,
>>>>>> and savepoint_path is the returned value.
>>>>>> 
>>>>>> I'm fine with other changes.
>>>>>> 
>>>>>> Thanks,
>>>>>> Jark
>>>>>> 
>>>>>> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>> 
>>>>>>> Hi Jing,
>>>>>>> 
>>>>>>> Thank you for your inputs!
>>>>>>> 
>>>>>>> TBH, I haven’t considered the ETL scenario that you mentioned. I think these jobs are managed just like other jobs in terms of job lifecycle (please correct me if I’m wrong).
>>>>>>> 
>>>>>>> WRT the SQL statements about job lineage, I think it might be a little bit out of the scope of the FLIP, since it’s mainly about lifecycle. By the way, do we have these functionalities in the Flink CLI or REST API already?
>>>>>>> 
>>>>>>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the outdated FLIP docs; the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m updating the FLIP according to the latest discussions.
>>>>>>> 
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>> 
>>>>>>> On Jun 8, 2022 at 07:31, Jing Ge <jing@ververica.com> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
>>>>>>> 
>>>>>>> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG for many DAGs?
>>>>>>> 
>>>>>>> 1)
>>>>>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, SQLs that used to build the DAG are responsible to *produce* data as the result(cube, materialized view, etc.) for the future consumption by queries. The INSERT INTO SELECT FROM example in FLIP and CTAS are typical SQL in this case. I would prefer to call them Jobs instead of Queries.
>>>>>>> 
>>>>>>> 2)
>>>>>>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
>>>>>>> 
>>>>>>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>>>>>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>>>>>>> SHOW JOBTREES // shows all DAGs
>>>>>>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>>>>>> 
>>>>>>> 3)
>>>>>>> Could we also support savepoint housekeeping syntax? We ran into the issue that a lot of savepoints had been created by customers (via their apps), and it takes extra (hacking) effort to clean them up.
>>>>>>> 
>>>>>>> RELEASE SAVEPOINT ALL
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> Jing
>>>>>>> 
>>>>>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvisser@apache.org> wrote:
>>>>>>>> 
>>>>>>>> Hi Paul,
>>>>>>>> 
>>>>>>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
>>>>>>>> imply that this will actually show the query, but we're returning IDs of
>>>>>>>> the running application. At first I was also not very much in favour of
>>>>>>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>>>>>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>>>>>>> 
>>>>>>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>>>>>> 
>>>>>>>> Best regards,
>>>>>>>> 
>>>>>>>> Martijn
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>>>>>> 
>>>>>>>> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi Godfrey,
>>>>>>>>> 
>>>>>>>>> Sorry for the late reply, I was on vacation.
>>>>>>>>> 
>>>>>>>>> It looks like we have a variety of preferences on the syntax, how about we
>>>>>>>>> choose the most acceptable one?
>>>>>>>>> 
>>>>>>>>> WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
>>>>>>>>> would be:
>>>>>>>>> 
>>>>>>>>> - SHOW JOBS
>>>>>>>>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>>>>>>>> `table.job.stop-with-drain`)
>>>>>>>>> 
>>>>>>>>> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>>>>>>>> JOB`:
>>>>>>>>> 
>>>>>>>>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>>>>>> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>>>>>>>> manager remembers)
>>>>>>>>> - DROP SAVEPOINT <savepoint_path>
>>>>>>>>> 
>>>>>>>>> cc @Jark @ShengKai @Martijn @Timo .
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Paul Lam
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> godfrey he <godfreyhe@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>>>>>>>>> 
>>>>>>>>>> Hi Paul,
>>>>>>>>>> 
>>>>>>>>>> Thanks for the update.
>>>>>>>>>> 
>>>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>>>> (DataStream or SQL) or
>>>>>>>>>> clients (SQL client or CLI).
>>>>>>>>>> 
>>>>>>>>>> Is a DataStream job a QUERY? I think not.
>>>>>>>>>> For a QUERY, the most important concept is the statement. But the
>>>>>>>>>> result does not contain this info.
>>>>>>>>>> If we need to contain all jobs in the cluster, I think the name should
>>>>>>>>>> be JOB or PIPELINE.
>>>>>>>>>> I lean to SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>>>>>>>>> 
>>>>>>>>>>> SHOW SAVEPOINTS
>>>>>>>>>> To list the savepoint for a specific job, we need to specify a
>>>>>>>>>> specific pipeline,
>>>>>>>>>> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Godfrey
>>>>>>>>>> 
>>>>>>>>>> Paul Lam <paullin3280@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Jark,
>>>>>>>>>>> 
>>>>>>>>>>> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>>>>>>>>>> part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>>>>>>>>>> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>>>>>>>>>> 
>>>>>>>>>>> Another question is, what should be the syntax for ungracefully
>>>>>>>>>>> canceling a query? As ShengKai pointed out in a offline discussion,
>>>>>>>>>>> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>>>>>>>>>> Flink CLI has both stop and cancel, mostly due to historical problems.
>>>>>>>>>>> 
>>>>>>>>>>> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>>>>>>>>>> that savepoints are owned by users and beyond the lifecycle of a Flink
>>>>>>>>>>> cluster. For example, a user might take a savepoint at a custom path
>>>>>>>>>>> that’s different than the default savepoint path, I think jobmanager
>>>>>>>>>> would
>>>>>>>>>>> not remember that, not to mention the jobmanager may be a fresh new
>>>>>>>>>>> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
>>>>>>>>>>> probably a best-effort one.
>>>>>>>>>>> 
>>>>>>>>>>> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>>>>>>>>>>> Savepoints are alias for nested transactions in DB area[1], and there’s
>>>>>>>>>>> correspondingly global transactions. If we consider Flink jobs as
>>>>>>>>>>> global transactions and Flink checkpoints as nested transactions,
>>>>>>>>>>> then the savepoint semantics are close, thus I think savepoint syntax
>>>>>>>>>>> in the SQL standard could be considered. But again, I don’t have a very
>>>>>>>>>>> strong preference.
>>>>>>>>>>> 
>>>>>>>>>>> Ping @Timo to get more inputs.
>>>>>>>>>>> 
>>>>>>>>>>> [1] https://en.wikipedia.org/wiki/Nested_transaction
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Paul Lam
>>>>>>>>>>> 
>>>>>>>>>>>> On May 18, 2022 at 17:48, Jark Wu <imjark@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Paul,
>>>>>>>>>>>> 
>>>>>>>>>>>> 1) SHOW QUERIES
>>>>>>>>>>>> +1 to add finished time, but it would be better to call it "end_time"
>>>>>>>>>> to
>>>>>>>>>>>> keep aligned with names in Web UI.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2) DROP QUERY
>>>>>>>>>>>> I think we shouldn't throw exceptions for batch jobs, otherwise, how
>>>>>>>>>> to
>>>>>>>>>>>> stop batch queries?
>>>>>>>>>>>> At present, I don't think "DROP" is a suitable keyword for this
>>>>>>>>>> statement.
>>>>>>>>>>>> From the perspective of users, "DROP" sounds like the query should be
>>>>>>>>>>>> removed from the
>>>>>>>>>>>> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>>>>>>>>>> more
>>>>>>>>>>>> suitable and
>>>>>>>>>>>> compliant with commands of Flink CLI.
>>>>>>>>>>>> 
>>>>>>>>>>>> 3) SHOW SAVEPOINTS
>>>>>>>>>>>> I think this statement is needed, otherwise, savepoints are lost
>>>>>>>>>> after the
>>>>>>>>>>>> SAVEPOINT
>>>>>>>>>>>> command is executed. Savepoints can be retrieved from REST API
>>>>>>>>>>>> "/jobs/:jobid/checkpoints"
>>>>>>>>>>>> with filtering "checkpoint_type"="savepoint". It's also worth
>>>>>>>>>> considering
>>>>>>>>>>>> providing "SHOW CHECKPOINTS"
>>>>>>>>>>>> to list all checkpoints.
>>>>>>>>>>>> 
>>>>>>>>>>>> 4) SAVEPOINT & RELEASE SAVEPOINT
>>>>>>>>>>>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>>>>>>>> statements
>>>>>>>>>>>> now.
>>>>>>>>>>>> In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>>>>>>>>>> both
>>>>>>>>>>>> the same savepoint id.
>>>>>>>>>>>> However, in our syntax, the first one is query id, and the second one
>>>>>>>>>> is
>>>>>>>>>>>> savepoint path, which is confusing and
>>>>>>>>>>>> not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>>>>>>>>>> they
>>>>>>>>>>>> should be in the same syntax set.
>>>>>>>>>>>> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
>>>>>>>>>>>> <sp_path>.
>>>>>>>>>>>> That means we don't follow the majority of vendors in SAVEPOINT
>>>>>>>>>> commands. I
>>>>>>>>>>>> would say the purpose is different in Flink.
>>>>>>>>>>>> What other's opinion on this?
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jark
>>>>>>>>>>>> 
>>>>>>>>>>>> [1]:
>>>>>>>>>>>> 
>>>>>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Godfrey,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks a lot for your inputs!
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>>>> (DataStream
>>>>>>>>>>>>> or SQL) or
>>>>>>>>>>>>> clients (SQL client or CLI). Under the hood, it’s based on
>>>>>>>>>>>>> ClusterClient#listJobs, the
>>>>>>>>>>>>> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>>>>>>>>>> in SQL
>>>>>>>>>>>>> client, because
>>>>>>>>>>>>> these jobs can be managed via SQL client too.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> WRT finished time, I think you’re right. Adding it to the FLIP. But
>>>>>>>>>> I’m a
>>>>>>>>>>>>> bit afraid that the
>>>>>>>>>>>>> rows would be too long.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> WRT ‘DROP QUERY’,
>>>>>>>>>>>>>> What's the behavior for batch jobs and the non-running jobs?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> In general, the behavior would be aligned with Flink CLI. Triggering
>>>>>>>>>> a
>>>>>>>>>>>>> savepoint for
>>>>>>>>>>>>> a non-running job would cause errors, and the error message would be
>>>>>>>>>>>>> printed to
>>>>>>>>>>>>> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
>>>>>>>>>>>>> streaming
>>>>>>>>>>>>> execution mode would be the same with streaming jobs. However, for
>>>>>>>>>> batch
>>>>>>>>>>>>> jobs in
>>>>>>>>>>>>> batch execution mode, I think there would be an error, because batch
>>>>>>>>>>>>> execution
>>>>>>>>>>>>> doesn’t support checkpoints currently (please correct me if I’m
>>>>>>>>>> wrong).
>>>>>>>>>>>>> 
>>>>>>>>>>>>> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>>>>>>>>> clusterClient/
>>>>>>>>>>>>> jobClient doesn’t have such a functionality at the moment, neither do
>>>>>>>>>>>>> Flink CLI.
>>>>>>>>>>>>> Maybe we could make it a follow-up FLIP, which includes the
>>>>>>>>>> modifications
>>>>>>>>>>>>> to
>>>>>>>>>>>>> clusterClient/jobClient and Flink CLI. WDYT?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Paul Lam
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On May 17, 2022 at 20:34, godfrey he <godfreyhe@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Godfrey
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>> 
>>>>>>> 
>> 
> 


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jing Ge <ji...@ververica.com>.
Hi Paul,

Filed a ticket: https://issues.apache.org/jira/browse/FLINK-27977 for
savepoint housekeeping.

Best regards,
Jing

On Thu, Jun 9, 2022 at 10:37 AM Martijn Visser <ma...@apache.org>
wrote:

> Hi Paul,
>
> That's a fair point, but I still think we should not offer that capability
> via the CLI either. But that's a different discussion :)
>
> Thanks,
>
> Martijn
>
> Op do 9 jun. 2022 om 10:08 schreef Paul Lam <pa...@gmail.com>:
>
>> Hi Martijn,
>>
>> I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM
>> mode, since the statement is triggered by users instead of Flink runtime.
>>
>> We’re simply providing a tool for users to clean up the savepoints, just
>> like `bin/flink savepoint -d :savepointPath` in Flink CLI [1].
>>
>> [1]
>> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
>>
>> Best,
>> Paul Lam
>>
>> 2022年6月9日 15:41,Martijn Visser <ma...@apache.org> 写道:
>>
>> Hi all,
>>
>> I would not include a DROP SAVEPOINT syntax. With the recently introduced
>> CLAIM/NO CLAIM mode, I would argue that we've just clarified snapshot
>> ownership and if you have a savepoint established "with NO_CLAIM it creates
>> its own copy and leaves the existing one up to the user." [1] We shouldn't
>> then again make it fuzzy by making it possible that Flink can remove
>> snapshots.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
>>
>> Op do 9 jun. 2022 om 09:27 schreef Paul Lam <pa...@gmail.com>:
>>
>>> Hi team,
>>>
>>> It's great to see our opinions are finally converging!
>>>
>>> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>>>
>>>
>>> LGTM. Adding it to the FLIP.
>>>
>>> To Jark,
>>>
>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>”
>>>
>>>
>>> Good point. The default savepoint dir should be enough for most cases.
>>>
>>> To Jing,
>>>
>>> DROP SAVEPOINT ALL
>>>
>>>
>>> I think it’s valid to have such a statement, but I have two concerns:
>>>
>>>    - `ALL` is already an SQL keyword, thus it may cause ambiguity.
>>>    - Flink CLI and REST API don’t provide the corresponding
>>>    functionalities, and we’d better keep them aligned.
>>>
>>> How about making this statement a follow-up task that would touch
>>> REST API and Flink CLI?
>>>
>>> Best,
>>> Paul Lam
>>>
>>> 2022年6月9日 11:53,godfrey he <go...@gmail.com> 写道:
>>>
>>> Hi all,
>>>
>>> Regarding `PIPELINE`, it comes from flink-core module, see
>>> `PipelineOptions` class for more details.
>>> `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with
>>> `JOBS`.
>>>
>>> +1 to discuss JOBTREE in other FLIP.
>>>
>>> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>>>
>>> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT
>>> <savepoint_path>`
>>>
>>> Best,
>>> Godfrey
>>>
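Taken together, the syntax the thread is converging on can be sketched as follows (a hypothetical session; the job ID and savepoint path are illustrative, and the statements reflect the FLIP proposal rather than released Flink syntax):

```sql
-- List all jobs known to the cluster (SQL and non-SQL alike).
SHOW JOBS;

-- Stop a job gracefully, taking a savepoint and draining in-flight records.
-- Both modifiers are optional.
STOP JOB '228d70913eab60dda85c5e7f78b5782c' WITH SAVEPOINT WITH DRAIN;

-- Trigger a savepoint under the configured state.savepoints.dir;
-- the savepoint path is returned as the result.
CREATE SAVEPOINT FOR JOB '228d70913eab60dda85c5e7f78b5782c';

-- Dispose of a savepoint by its path.
DROP SAVEPOINT '/tmp/savepoints/savepoint-228d70-599988ababab';
```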
>>> Jing Ge <ji...@ververica.com> 于2022年6月9日周四 01:48写道:
>>>
>>>
>>> Hi Paul, Hi Jark,
>>>
>>> Re JOBTREE, agree that it is out of the scope of this FLIP
>>>
>>> Re `RELEASE SAVEPOINT ALL', if the community prefers 'DROP' then 'DROP
>>> SAVEPOINT ALL' housekeeping. WDYT?
>>>
>>> Best regards,
>>> Jing
>>>
>>>
>>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <im...@gmail.com> wrote:
>>>
>>>
>>> Hi Jing,
>>>
>>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of
>>> the scope
>>> of this FLIP and can be discussed in another FLIP.
>>>
>>> Job lineage is a big topic that may involve many problems:
>>> 1) how to collect and report job entities, attributes, and lineages?
>>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>>> 3) how does Flink SQL CLI/Gateway know the lineage information and show
>>> jobtree?
>>> 4) ...
>>>
>>> Best,
>>> Jark
>>>
>>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <im...@gmail.com> wrote:
>>>
>>>
>>> Hi Paul,
>>>
>>> I'm fine with using JOBS. The only concern is that this may conflict
>>> with displaying more detailed
>>> information for query (e.g. query content, plan) in the future, e.g.
>>> SHOW QUERIES EXTENDED in ksqldb[1].
>>> This is not a big problem as we can introduce SHOW QUERIES in the future
>>> if necessary.
>>>
>>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>> `table.job.stop-with-drain`)
>>>
>>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
>>> It might be trivial and error-prone to set configuration before
>>> executing a statement,
>>> and the configuration will affect all statements after that.
>>>
>>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>
>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>>> and always use configuration "state.savepoints.dir" as the default
>>> savepoint dir.
>>> The concern with using "<savepoint_path>" is here should be savepoint
>>> dir,
>>> and savepoint_path is the returned value.
>>>
>>> I'm fine with other changes.
>>>
>>> Thanks,
>>> Jark
>>>
>>> [1]:
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>>
>>>
>>>
>>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:
>>>
>>>
>>> Hi Jing,
>>>
>>> Thank you for your inputs!
>>>
>>> TBH, I haven’t considered the ETL scenario that you mentioned. I think
>>> they’re managed just like other jobs in terms of job lifecycle (please
>>> correct me if I’m wrong).
>>>
>>> WRT to the SQL statements about SQL lineages, I think it might be a
>>> little bit out of the scope of the FLIP, since it’s mainly about
>>> lifecycles. By the way, do we have these functionalities in Flink CLI or
>>> REST API already?
>>>
>>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the
>>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m
>>> updating the FLIP according to the latest discussions.
>>>
>>> Best,
>>> Paul Lam
>>>
>>> 2022年6月8日 07:31,Jing Ge <ji...@ververica.com> 写道:
>>>
>>> Hi Paul,
>>>
>>> Sorry that I am a little bit too late to join this thread. Thanks for
>>> driving this and starting this informative discussion. The FLIP looks
>>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>>
>>> Have you considered the ETL scenario with Flink SQL, where multiple SQLs
>>> build a DAG for many DAGs?
>>>
>>> 1)
>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how
>>> to support ETL jobs. Briefly speaking, the SQLs used to build the DAG are
>>> responsible for *producing* data as the result (cube, materialized view, etc.)
>>> for the future consumption by queries. The INSERT INTO SELECT FROM example
>>> in FLIP and CTAS are typical SQL in this case. I would prefer to call them
>>> Jobs instead of Queries.
>>>
>>> 2)
>>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to
>>> support syntax like:
>>>
>>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the
>>> given job_id
>>> SHOW JOBTREES // shows all DAGs
>>> SHOW ANCIENTS <job_id> // shows all parents of the given job_id
>>>
>>> 3)
>>> Could we also support Savepoint housekeeping syntax? We ran into this
>>> issue that a lot of savepoints have been created by customers (via their
>>> apps). It will take extra (hacking) effort to clean it.
>>>
>>> RELEASE SAVEPOINT ALL
>>>
>>> Best regards,
>>> Jing
>>>
>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
>>> wrote:
>>>
>>>
>>> Hi Paul,
>>>
>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>>> could
>>> imply that this will actually show the query, but we're returning IDs of
>>> the running application. At first I was also not very much in favour of
>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>>
>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1]
>>>
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>
>>> Op za 4 jun. 2022 om 10:38 schreef Paul Lam <pa...@gmail.com>:
>>>
>>> Hi Godfrey,
>>>
>>> Sorry for the late reply, I was on vacation.
>>>
>>> It looks like we have a variety of preferences on the syntax, how about
>>> we
>>> choose the most acceptable one?
>>>
>>> WRT keyword for SQL jobs, we use JOBS, thus the statements related to
>>> jobs
>>> would be:
>>>
>>> - SHOW JOBS
>>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>> `table.job.stop-with-drain`)
>>>
>>> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>> JOB`:
>>>
>>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>> manager remembers)
>>> - DROP SAVEPOINT <savepoint_path>
>>>
>>> cc @Jark @ShengKai @Martijn @Timo .
>>>
>>> Best,
>>> Paul Lam
>>>
>>>
>>> godfrey he <go...@gmail.com> 于2022年5月23日周一 21:34写道:
>>>
>>> Hi Paul,
>>>
>>> Thanks for the update.
>>>
>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>
>>> (DataStream or SQL) or
>>> clients (SQL client or CLI).
>>>
>>> Is DataStream job a QUERY? I think not.
>>> For a QUERY, the most important concept is the statement. But the
>>> result does not contain this info.
>>> If we need to contain all jobs in the cluster, I think the name should
>>> be JOB or PIPELINE.
>>> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>>
>>> SHOW SAVEPOINTS
>>>
>>> To list the savepoint for a specific job, we need to specify a
>>> specific pipeline,
>>> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>>
>>> Best,
>>> Godfrey
>>>
>>> Paul Lam <pa...@gmail.com> 于2022年5月20日周五 11:25写道:
>>>
>>>
>>> Hi Jark,
>>>
>>> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>> part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>>
>>> Another question is, what should be the syntax for ungracefully
>>> canceling a query? As ShengKai pointed out in an offline discussion,
>>> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>> Flink CLI has both stop and cancel, mostly due to historical problems.
>>>
>>> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>> that savepoints are owned by users and beyond the lifecycle of a Flink
>>> cluster. For example, a user might take a savepoint at a custom path
>>> that’s different than the default savepoint path, I think jobmanager
>>>
>>> would
>>>
>>> not remember that, not to mention the jobmanager may be a fresh
>>> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
>>> probably a best-effort one.
>>>
>>> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>>> Savepoints are aliases for nested transactions in the DB area [1], and there are
>>> corresponding global transactions. If we consider Flink jobs as
>>> global transactions and Flink checkpoints as nested transactions,
>>> then the savepoint semantics are close, thus I think savepoint syntax
>>> in the SQL standard could be considered. But again, I don’t have a very
>>> strong preference.
>>>
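For context, the nested-transaction analogy refers to the standard-SQL savepoint, which scopes a rollback point inside a single transaction (generic syntax; exact support varies by vendor, and table `t` is hypothetical):

```sql
BEGIN;
INSERT INTO t VALUES (1);
SAVEPOINT sp1;              -- mark a nested-transaction point
INSERT INTO t VALUES (2);
ROLLBACK TO SAVEPOINT sp1;  -- undo work after sp1, keep the transaction open
RELEASE SAVEPOINT sp1;      -- discard the mark
COMMIT;
```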
>>> Ping @Timo to get more inputs.
>>>
>>> [1] https://en.wikipedia.org/wiki/Nested_transaction
>>>
>>>
>>> Best,
>>> Paul Lam
>>>
>>> 2022年5月18日 17:48,Jark Wu <im...@gmail.com> 写道:
>>>
>>> Hi Paul,
>>>
>>> 1) SHOW QUERIES
>>> +1 to add finished time, but it would be better to call it "end_time"
>>>
>>> to
>>>
>>> keep aligned with names in Web UI.
>>>
>>> 2) DROP QUERY
>>> I think we shouldn't throw exceptions for batch jobs, otherwise, how
>>>
>>> to
>>>
>>> stop batch queries?
>>> At present, I don't think "DROP" is a suitable keyword for this
>>>
>>> statement.
>>>
>>> From the perspective of users, "DROP" sounds like the query should be
>>> removed from the
>>> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>>>
>>> more
>>>
>>> suitable and
>>> compliant with commands of Flink CLI.
>>>
>>> 3) SHOW SAVEPOINTS
>>> I think this statement is needed, otherwise, savepoints are lost
>>>
>>> after the
>>>
>>> SAVEPOINT
>>> command is executed. Savepoints can be retrieved from REST API
>>> "/jobs/:jobid/checkpoints"
>>> with filtering "checkpoint_type"="savepoint". It's also worth
>>>
>>> considering
>>>
>>> providing "SHOW CHECKPOINTS"
>>> to list all checkpoints.
>>>
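As a sketch of that retrieval path: the checkpoint history returned by `/jobs/:jobid/checkpoints` can be filtered client-side on its `checkpoint_type` field. The payload below is a trimmed, hypothetical example of the REST response, and the exact field names and type values should be checked against the REST docs cited above:

```python
def list_savepoints(checkpoints_response: dict) -> list:
    """Extract savepoint paths from a /jobs/:jobid/checkpoints response."""
    return [
        cp["external_path"]
        for cp in checkpoints_response.get("history", [])
        if cp.get("checkpoint_type") == "SAVEPOINT" and cp.get("external_path")
    ]

# Trimmed, hypothetical response payload.
response = {
    "history": [
        {"id": 12, "checkpoint_type": "CHECKPOINT", "external_path": None},
        {"id": 13, "checkpoint_type": "SAVEPOINT",
         "external_path": "hdfs:///savepoints/savepoint-228d70-599988ababab"},
    ]
}
print(list_savepoints(response))
```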
>>> 4) SAVEPOINT & RELEASE SAVEPOINT
>>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>
>>> statements
>>>
>>> now.
>>> In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>>>
>>> both
>>>
>>> the same savepoint id.
>>> However, in our syntax, the first one is query id, and the second one
>>>
>>> is
>>>
>>> savepoint path, which is confusing and
>>> not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>>>
>>> they
>>>
>>> should be in the same syntax set.
>>> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
>>> <sp_path>.
>>> That means we don't follow the majority of vendors in SAVEPOINT
>>>
>>> commands. I
>>>
>>> would say the purpose is different in Flink.
>>> What other's opinion on this?
>>>
>>> Best,
>>> Jark
>>>
>>> [1]:
>>>
>>>
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>
>>>
>>>
>>> On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
>>>
>>> Hi Godfrey,
>>>
>>> Thanks a lot for your inputs!
>>>
>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>
>>> (DataStream
>>>
>>> or SQL) or
>>> clients (SQL client or CLI). Under the hood, it’s based on
>>> ClusterClient#listJobs, the
>>> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>>>
>>> in SQL
>>>
>>> client, because
>>> these jobs can be managed via SQL client too.
>>>
>>> WRT finished time, I think you’re right. Adding it to the FLIP. But
>>>
>>> I’m a
>>>
>>> bit afraid that the
>>> rows would be too long.
>>>
>>> WRT ‘DROP QUERY’,
>>>
>>> What's the behavior for batch jobs and the non-running jobs?
>>>
>>>
>>>
>>> In general, the behavior would be aligned with Flink CLI. Triggering
>>>
>>> a
>>>
>>> savepoint for
>>> a non-running job would cause errors, and the error message would be
>>> printed to
>>> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
>>> streaming
>>> execution mode would be the same with streaming jobs. However, for
>>>
>>> batch
>>>
>>> jobs in
>>> batch execution mode, I think there would be an error, because batch
>>> execution
>>> doesn’t support checkpoints currently (please correct me if I’m
>>>
>>> wrong).
>>>
>>>
>>> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>>
>>> clusterClient/
>>>
>>> jobClient doesn’t have such a functionality at the moment, neither do
>>> Flink CLI.
>>> Maybe we could make it a follow-up FLIP, which includes the
>>>
>>> modifications
>>>
>>> to
>>> clusterClient/jobClient and Flink CLI. WDYT?
>>>
>>> Best,
>>> Paul Lam
>>>
>>> 2022年5月17日 20:34,godfrey he <go...@gmail.com> 写道:
>>>
>>> Godfrey
>>>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Martijn Visser <ma...@apache.org>.
Hi Paul,

That's a fair point, but I still think we should not offer that capability
via the CLI either. But that's a different discussion :)

Thanks,

Martijn

Op do 9 jun. 2022 om 10:08 schreef Paul Lam <pa...@gmail.com>:

> Hi Martijn,
>
> I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM
> mode, since the statement is triggered by users instead of Flink runtime.
>
> We’re simply providing a tool for users to clean up the savepoints, just
> like `bin/flink savepoint -d :savepointPath` in Flink CLI [1].
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
>
> Best,
> Paul Lam
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Martijn,

I think the `DROP SAVEPOINT` statement would not conflict with NO_CLAIM mode, since the statement is triggered by users instead of Flink runtime.

We’re simply providing a tool for users to clean up the savepoints, just like `bin/flink savepoint -d :savepointPath` in Flink CLI [1].

[1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/ops/state/savepoints/#disposing-savepoints
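In other words, the proposed statement is just a SQL surface over the existing disposal path (hypothetical savepoint path; the SQL form is the FLIP proposal, not released syntax):

```sql
-- Proposed SQL client statement ...
DROP SAVEPOINT 'hdfs:///savepoints/savepoint-228d70-599988ababab';
-- ... equivalent to the existing CLI command:
--   bin/flink savepoint -d hdfs:///savepoints/savepoint-228d70-599988ababab
```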

Best,
Paul Lam

> 2022年6月9日 15:41,Martijn Visser <ma...@apache.org> 写道:
> 
> Hi all,
> 
> I would not include a DROP SAVEPOINT syntax. With the recently introduced CLAIM/NO CLAIM mode, I would argue that we've just clarified snapshot ownership and if you have a savepoint established "with NO_CLAIM it creates its own copy and leaves the existing one up to the user." [1] We shouldn't then again make it fuzzy by making it possible that Flink can remove snapshots. 
> 
> Best regards,
> 
> Martijn
> 
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership
> Op do 9 jun. 2022 om 09:27 schreef Paul Lam <paullin3280@gmail.com <ma...@gmail.com>>:
> Hi team,
> 
> It's great to see our opinions are finally converging!
> 
>> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
> 
> 
> LGTM. Adding it to the FLIP.
> 
> To Jark,
> 
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>”
> 
> Good point. The default savepoint dir should be enough for most cases.
> 
> To Jing,
> 
>> DROP SAVEPOINT ALL
> 
> I think it’s valid to have such a statement, but I have two concerns:
> `ALL` is already an SQL keyword, thus it may cause ambiguity.
> Flink CLI and REST API doesn’t provided the corresponding functionalities, and we’d better keep them aligned.
> How about making this statement as follow-up tasks which should touch REST API and Flink CLI?
> 
> Best,
> Paul Lam
> 
>> 2022年6月9日 11:53,godfrey he <godfreyhe@gmail.com <ma...@gmail.com>> 写道:
>> 
>> Hi all,
>> 
>> Regarding `PIPELINE`, it comes from flink-core module, see
>> `PipelineOptions` class for more details.
>> `JOBS` is a more generic concept than `PIPELINES`. I'm also fine with `JOBS`.
>> 
>> +1 to discuss JOBTREE in other FLIP.
>> 
>> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>> 
>> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
>> 
>> Best,
>> Godfrey
>> 
>>> On Thu, Jun 9, 2022 at 01:48, Jing Ge <jing@ververica.com> wrote:
>>> 
>>> Hi Paul, Hi Jark,
>>> 
>>> Re JOBTREE, agree that it is out of the scope of this FLIP
>>> 
>>> Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP SAVEPOINT ALL` for housekeeping. WDYT?
>>> 
>>> Best regards,
>>> Jing
>>> 
>>> 
>>>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <imjark@gmail.com> wrote:
>>>> 
>>>> Hi Jing,
>>>> 
>>>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the scope
>>>> of this FLIP and can be discussed in another FLIP.
>>>> 
>>>> Job lineage is a big topic that may involve many problems:
>>>> 1) how to collect and report job entities, attributes, and lineages?
>>>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>>>> 3) how does Flink SQL CLI/Gateway know the lineage information and show jobtree?
>>>> 4) ...
>>>> 
>>>> Best,
>>>> Jark
>>>> 
>>>>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <imjark@gmail.com> wrote:
>>>>> 
>>>>> Hi Paul,
>>>>> 
>>>>> I'm fine with using JOBS. The only concern is that this may conflict with displaying more detailed
>>>>> information for query (e.g. query content, plan) in the future, e.g. SHOW QUERIES EXTENDED in ksqldb[1].
>>>>> This is not a big problem as we can introduce SHOW QUERIES in the future if necessary.
>>>>> 
>>>>>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>>>>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
>>>>> It might be trivial and error-prone to set configuration before executing a statement,
>>>>> and the configuration will affect all statements after that.
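
A sketch of the contrast described here (option keys as proposed earlier in this thread; the job ID is a placeholder):

```sql
-- Configuration-based form: the SET is sticky and silently applies to
-- every later STOP statement in the session.
SET 'table.job.stop-with-savepoint' = 'true';
STOP JOBS '228d70913eab60dda85c5e7f78b5782c';

-- Clause-based form: the behavior is scoped to this single statement.
STOP JOB '228d70913eab60dda85c5e7f78b5782c' WITH SAVEPOINT WITH DRAIN;
```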
>>>>> 
>>>>>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>>>>> and always use configuration "state.savepoints.dir" as the default savepoint dir.
>>>>> The concern with using "<savepoint_path>" is here should be savepoint dir,
>>>>> and savepoint_path is the returned value.
>>>>> 
>>>>> I'm fine with other changes.
>>>>> 
>>>>> Thanks,
>>>>> Jark
>>>>> 
>>>>> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>>>> 
>>>>> 
>>>>> 
>>>>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>> 
>>>>>> Hi Jing,
>>>>>> 
>>>>>> Thank you for your inputs!
>>>>>> 
>>>>>> TBH, I haven’t considered the ETL scenario that you mentioned. I think they’re managed just like other jobs in terms of job lifecycles (please correct me if I’m wrong).
>>>>>> 
>>>>>> WRT to the SQL statements about SQL lineages, I think it might be a little bit out of the scope of the FLIP, since it’s mainly about lifecycles. By the way, do we have these functionalities in Flink CLI or REST API already?
>>>>>> 
>>>>>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the outdated FLIP docs; the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m updating the FLIP according to the latest discussions.
>>>>>> 
>>>>>> Best,
>>>>>> Paul Lam
>>>>>> 
>>>>>> On June 8, 2022, at 07:31, Jing Ge <jing@ververica.com> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
>>>>>> 
>>>>>> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG for many DAGs?
>>>>>> 
>>>>>> 1)
>>>>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, the SQLs used to build the DAG are responsible for *producing* data as the result (cube, materialized view, etc.) for future consumption by queries. The INSERT INTO SELECT FROM example in the FLIP and CTAS are typical SQL in this case. I would prefer to call them Jobs instead of Queries.
>>>>>> 
>>>>>> 2)
>>>>>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
>>>>>> 
>>>>>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>>>>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>>>>>> SHOW JOBTREES // shows all DAGs
>>>>>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>>>>> 
>>>>>> 3)
>>>>>> Could we also support Savepoint housekeeping syntax? We ran into this issue that a lot of savepoints have been created by customers (via their apps). It will take extra (hacking) effort to clean it.
>>>>>> 
>>>>>> RELEASE SAVEPOINT ALL
>>>>>> 
>>>>>> Best regards,
>>>>>> Jing
>>>>>> 
>>>>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvisser@apache.org> wrote:
>>>>>>> 
>>>>>>> Hi Paul,
>>>>>>> 
>>>>>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
>>>>>>> imply that this will actually show the query, but we're returning IDs of
>>>>>>> the running application. At first I was also not very much in favour of
>>>>>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>>>>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>>>>>> 
>>>>>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>>>>> 
>>>>>>> Best regards,
>>>>>>> 
>>>>>>> Martijn
>>>>>>> 
>>>>>>> [1]
>>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>>>>> 
>>>>>>> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>> 
>>>>>>>> Hi Godfrey,
>>>>>>>> 
>>>>>>>> Sorry for the late reply, I was on vacation.
>>>>>>>> 
>>>>>>>> It looks like we have a variety of preferences on the syntax, how about we
>>>>>>>> choose the most acceptable one?
>>>>>>>> 
>>>>>>>> WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
>>>>>>>> would be:
>>>>>>>> 
>>>>>>>> - SHOW JOBS
>>>>>>>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>>>>>>> `table.job.stop-with-drain`)
>>>>>>>> 
>>>>>>>> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>>>>>>> JOB`:
>>>>>>>> 
>>>>>>>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>>>>> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>>>>>>> manager remembers)
>>>>>>>> - DROP SAVEPOINT <savepoint_path>
>>>>>>>> 
>>>>>>>> cc @Jark @ShengKai @Martijn @Timo .
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Paul Lam
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Mon, May 23, 2022 at 21:34, godfrey he <godfreyhe@gmail.com> wrote:
>>>>>>>> 
>>>>>>>>> Hi Paul,
>>>>>>>>> 
>>>>>>>>> Thanks for the update.
>>>>>>>>> 
>>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>>> (DataStream or SQL) or
>>>>>>>>> clients (SQL client or CLI).
>>>>>>>>> 
>>>>>>>>> Is a DataStream job a QUERY? I think not.
>>>>>>>>> For a QUERY, the most important concept is the statement. But the
>>>>>>>>> result does not contain this info.
>>>>>>>>> If we need to contain all jobs in the cluster, I think the name should
>>>>>>>>> be JOB or PIPELINE.
>>>>>>>>> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>>>>>>>> 
>>>>>>>>>> SHOW SAVEPOINTS
>>>>>>>>> To list the savepoint for a specific job, we need to specify a
>>>>>>>>> specific pipeline,
>>>>>>>>> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Godfrey
>>>>>>>>> 
>>>>>>>>> On Fri, May 20, 2022 at 11:25, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Jark,
>>>>>>>>>> 
>>>>>>>>>> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>>>>>>>>> part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>>>>>>>>> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>>>>>>>>> 
>>>>>>>>>> Another question is, what should be the syntax for ungracefully
>>>>>>>>>> canceling a query? As ShengKai pointed out in an offline discussion,
>>>>>>>>>> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>>>>>>>>> Flink CLI has both stop and cancel, mostly due to historical problems.
>>>>>>>>>> 
>>>>>>>>>> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>>>>>>>>> that savepoints are owned by users and beyond the lifecycle of a Flink
>>>>>>>>>> cluster. For example, a user might take a savepoint at a custom path
>>>>>>>>>> that’s different than the default savepoint path, I think jobmanager
>>>>>>>>> would
>>>>>>>>>> not remember that, not to mention the jobmanager may be a fresh new
>>>>>>>>>> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
>>>>>>>>>> probably a best-effort one.
>>>>>>>>>> 
>>>>>>>>>> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>>>>>>>>>> Savepoints are aliases for nested transactions in the DB area [1], and there are
>>>>>>>>>> correspondingly global transactions. If we consider Flink jobs as
>>>>>>>>>> global transactions and Flink checkpoints as nested transactions,
>>>>>>>>>> then the savepoint semantics are close, thus I think savepoint syntax
>>>>>>>>>> in SQL-standard could be considered. But again, I don’t have a very
>>>>>>>>>> strong preference.
>>>>>>>>>> 
>>>>>>>>>> Ping @Timo to get more inputs.
>>>>>>>>>> 
>>>>>>>>>> [1] https://en.wikipedia.org/wiki/Nested_transaction
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Paul Lam
>>>>>>>>>> 
>>>>>>>>>>> On May 18, 2022, at 17:48, Jark Wu <imjark@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Paul,
>>>>>>>>>>> 
>>>>>>>>>>> 1) SHOW QUERIES
>>>>>>>>>>> +1 to add finished time, but it would be better to call it "end_time"
>>>>>>>>> to
>>>>>>>>>>> keep aligned with names in Web UI.
>>>>>>>>>>> 
>>>>>>>>>>> 2) DROP QUERY
>>>>>>>>>>> I think we shouldn't throw exceptions for batch jobs, otherwise, how
>>>>>>>>> to
>>>>>>>>>>> stop batch queries?
>>>>>>>>>>> At present, I don't think "DROP" is a suitable keyword for this
>>>>>>>>> statement.
>>>>>>>>>>> From the perspective of users, "DROP" sounds like the query should be
>>>>>>>>>>> removed from the
>>>>>>>>>>> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>>>>>>>>> more
>>>>>>>>>>> suitable and
>>>>>>>>>>> compliant with commands of Flink CLI.
>>>>>>>>>>> 
>>>>>>>>>>> 3) SHOW SAVEPOINTS
>>>>>>>>>>> I think this statement is needed, otherwise, savepoints are lost
>>>>>>>>> after the
>>>>>>>>>>> SAVEPOINT
>>>>>>>>>>> command is executed. Savepoints can be retrieved from REST API
>>>>>>>>>>> "/jobs/:jobid/checkpoints"
>>>>>>>>>>> with filtering "checkpoint_type"="savepoint". It's also worth
>>>>>>>>> considering
>>>>>>>>>>> providing "SHOW CHECKPOINTS"
>>>>>>>>>>> to list all checkpoints.
>>>>>>>>>>> 
>>>>>>>>>>> 4) SAVEPOINT & RELEASE SAVEPOINT
>>>>>>>>>>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>>>>>>> statements
>>>>>>>>>>> now.
>>>>>>>>>>> In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>>>>>>>>> both
>>>>>>>>>>> the same savepoint id.
>>>>>>>>>>> However, in our syntax, the first one is query id, and the second one
>>>>>>>>> is
>>>>>>>>>>> savepoint path, which is confusing and
>>>>>>>>>>> not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>>>>>>>>> they
>>>>>>>>>>> should be in the same syntax set.
>>>>>>>>>>> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
>>>>>>>>>>> <sp_path>.
>>>>>>>>>>> That means we don't follow the majority of vendors in SAVEPOINT
>>>>>>>>> commands. I
>>>>>>>>>>> would say the purpose is different in Flink.
>>>>>>>>>>> What other's opinion on this?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Jark
>>>>>>>>>>> 
>>>>>>>>>>> [1]:
>>>>>>>>>>> 
>>>>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3280@gmail.com> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Godfrey,
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks a lot for your inputs!
>>>>>>>>>>>> 
>>>>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>>> (DataStream
>>>>>>>>>>>> or SQL) or
>>>>>>>>>>>> clients (SQL client or CLI). Under the hood, it’s based on
>>>>>>>>>>>> ClusterClient#listJobs, the
>>>>>>>>>>>> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>>>>>>>>> in SQL
>>>>>>>>>>>> client, because
>>>>>>>>>>>> these jobs can be managed via SQL client too.
>>>>>>>>>>>> 
>>>>>>>>>>>> WRT finished time, I think you’re right. Adding it to the FLIP. But
>>>>>>>>> I’m a
>>>>>>>>>>>> bit afraid that the
>>>>>>>>>>>> rows would be too long.
>>>>>>>>>>>> 
>>>>>>>>>>>> WRT ‘DROP QUERY’,
>>>>>>>>>>>>> What's the behavior for batch jobs and the non-running jobs?
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> In general, the behavior would be aligned with Flink CLI. Triggering
>>>>>>>>> a
>>>>>>>>>>>> savepoint for
>>>>>>>>>>>> a non-running job would cause errors, and the error message would be
>>>>>>>>>>>> printed to
>>>>>>>>>>>> the SQL client. Triggering a savepoint for batch (bounded) jobs in
>>>>>>>>>>>> streaming
>>>>>>>>>>>> execution mode would be the same as for streaming jobs. However, for
>>>>>>>>> batch
>>>>>>>>>>>> jobs in
>>>>>>>>>>>> batch execution mode, I think there would be an error, because batch
>>>>>>>>>>>> execution
>>>>>>>>>>>> doesn’t support checkpoints currently (please correct me if I’m
>>>>>>>>> wrong).
>>>>>>>>>>>> 
>>>>>>>>>>>> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>>>>>>>> clusterClient/
>>>>>>>>>>>> jobClient doesn’t have such a functionality at the moment, neither do
>>>>>>>>>>>> Flink CLI.
>>>>>>>>>>>> Maybe we could make it a follow-up FLIP, which includes the
>>>>>>>>> modifications
>>>>>>>>>>>> to
>>>>>>>>>>>> clusterClient/jobClient and Flink CLI. WDYT?
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Paul Lam
>>>>>>>>>>>> 
>>>>>>>>>>>>> On May 17, 2022, at 20:34, godfrey he <godfreyhe@gmail.com> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Godfrey
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
> 


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Martijn Visser <ma...@apache.org>.
Hi all,

I would not include a DROP SAVEPOINT syntax. With the recently introduced
CLAIM/NO CLAIM mode, I would argue that we've just clarified snapshot
ownership and if you have a savepoint established "with NO_CLAIM it creates
its own copy and leaves the existing one up to the user." [1] We shouldn't
then again make it fuzzy by making it possible that Flink can remove
snapshots.

Best regards,

Martijn

[1]
https://cwiki.apache.org/confluence/display/FLINK/FLIP-193%3A+Snapshots+ownership

Op do 9 jun. 2022 om 09:27 schreef Paul Lam <pa...@gmail.com>:

> Hi team,
>
> It's great to see our opinions are finally converging!
>
> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>
>
> LGTM. Adding it to the FLIP.
>
> To Jark,
>
> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>”
>
>
> Good point. The default savepoint dir should be enough for most cases.
>
> To Jing,
>
> DROP SAVEPOINT ALL
>
>
> I think it’s valid to have such a statement, but I have two concerns:
>
>    - `ALL` is already an SQL keyword, thus it may cause ambiguity.
>    - Flink CLI and REST API doesn’t provided the corresponding
>    functionalities, and we’d better keep them aligned.
>
> How about making this statement as follow-up tasks which should touch REST
> API and Flink CLI?
>
> Best,
> Paul Lam
>
> 2022年6月9日 11:53,godfrey he <go...@gmail.com> 写道:
>
> Hi all,
>
> Regarding `PIPELINE`, it comes from flink-core module, see
> `PipelineOptions` class for more details.
> `JOBS` is a more generic concept than `PIPELINES`. I'm also be fine with
> `JOBS`.
>
> +1 to discuss JOBTREE in other FLIP.
>
> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
>
> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT
> <savepoint_path>`
>
> Best,
> Godfrey
>
> Jing Ge <ji...@ververica.com> 于2022年6月9日周四 01:48写道:
>
>
> Hi Paul, Hi Jark,
>
> Re JOBTREE, agree that it is out of the scope of this FLIP
>
> Re `RELEASE SAVEPOINT ALL', if the community prefers 'DROP' then 'DROP
> SAVEPOINT ALL' housekeeping. WDYT?
>
> Best regards,
> Jing
>
>
> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <im...@gmail.com> wrote:
>
>
> Hi Jing,
>
> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the
> scope
> of this FLIP and can be discussed in another FLIP.
>
> Job lineage is a big topic that may involve many problems:
> 1) how to collect and report job entities, attributes, and lineages?
> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
> 3) how does Flink SQL CLI/Gateway know the lineage information and show
> jobtree?
> 4) ...
>
> Best,
> Jark
>
> On Wed, 8 Jun 2022 at 20:44, Jark Wu <im...@gmail.com> wrote:
>
>
> Hi Paul,
>
> I'm fine with using JOBS. The only concern is that this may conflict with
> displaying more detailed
> information for query (e.g. query content, plan) in the future, e.g. SHOW
> QUERIES EXTENDED in ksqldb[1].
> This is not a big problem as we can introduce SHOW QUERIES in the future
> if necessary.
>
> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> `table.job.stop-with-drain`)
>
> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
> It might be trivial and error-prone to set configuration before executing
> a statement,
> and the configuration will affect all statements after that.
>
> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>
> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
> and always use configuration "state.savepoints.dir" as the default
> savepoint dir.
> The concern with using "<savepoint_path>" is here should be savepoint dir,
> and savepoint_path is the returned value.
>
> I'm fine with other changes.
>
> Thanks,
> Jark
>
> [1]:
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>
>
>
> On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:
>
>
> Hi Jing,
>
> Thank you for your inputs!
>
> TBH, I haven’t considered the ETL scenario that you mentioned. I think
> they’re managed just like other jobs interns of job lifecycles (please
> correct me if I’m wrong).
>
> WRT to the SQL statements about SQL lineages, I think it might be a little
> bit out of the scope of the FLIP, since it’s mainly about lifecycles. By
> the way, do we have these functionalities in Flink CLI or REST API already?
>
> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the
> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m
> updating the FLIP arcading to the latest discussions.
>
> Best,
> Paul Lam
>
> 2022年6月8日 07:31,Jing Ge <ji...@ververica.com> 写道:
>
> Hi Paul,
>
> Sorry that I am a little bit too late to join this thread. Thanks for
> driving this and starting this informative discussion. The FLIP looks
> really interesting. It will help us a lot to manage Flink SQL jobs.
>
> Have you considered the ETL scenario with Flink SQL, where multiple SQLs
> build a DAG for many DAGs?
>
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to
> support ETL jobs. Briefly speaking, SQLs that used to build the DAG are
> responsible to *produce* data as the result(cube, materialized view, etc.)
> for the future consumption by queries. The INSERT INTO SELECT FROM example
> in FLIP and CTAS are typical SQL in this case. I would prefer to call them
> Jobs instead of Queries.
>
> 2)
> Speaking of ETL DAG, we might want to see the lineage. Is it possible to
> support syntax like:
>
> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given
> job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCIENTS <job_id> // shows all parents of the given job_id
>
> 3)
> Could we also support Savepoint housekeeping syntax? We ran into this
> issue that a lot of savepoints have been created by customers (via their
> apps). It will take extra (hacking) effort to clean it.
>
> RELEASE SAVEPOINT ALL
>
> Best regards,
> Jing
>
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
> wrote:
>
>
> Hi Paul,
>
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
> imply that this will actually show the query, but we're returning IDs of
> the running application. At first I was also not very much in favour of
> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>
> Best regards,
>
> Martijn
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>
> Op za 4 jun. 2022 om 10:38 schreef Paul Lam <pa...@gmail.com>:
>
> Hi Godfrey,
>
> Sorry for the late reply, I was on vacation.
>
> It looks like we have a variety of preferences on the syntax, how about we
> choose the most acceptable one?
>
> WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
> would be:
>
> - SHOW JOBS
> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> `table.job.stop-with-drain`)
>
> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
> JOB`:
>
> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
> manager remembers)
> - DROP SAVEPOINT <savepoint_path>
>
> cc @Jark @ShengKai @Martijn @Timo .
>
> Best,
> Paul Lam
>
>
> godfrey he <go...@gmail.com> 于2022年5月23日周一 21:34写道:
>
> Hi Paul,
>
> Thanks for the update.
>
> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>
> (DataStream or SQL) or
> clients (SQL client or CLI).
>
> Is DataStream job a QUERY? I think not.
> For a QUERY, the most important concept is the statement. But the
> result does not contain this info.
> If we need to contain all jobs in the cluster, I think the name should
> be JOB or PIPELINE.
> I learn to SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>
> SHOW SAVEPOINTS
>
> To list the savepoint for a specific job, we need to specify a
> specific pipeline,
> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>
> Best,
> Godfrey
>
> Paul Lam <pa...@gmail.com> 于2022年5月20日周五 11:25写道:
>
>
> Hi Jark,
>
> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> part of the reason why I proposed “STOP/CANCEL QUERY” at the
> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>
> Another question is, what should be the syntax for ungracefully
> canceling a query? As ShengKai pointed out in a offline discussion,
> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> Flink CLI has both stop and cancel, mostly due to historical problems.
>
> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> that savepoints are owned by users and beyond the lifecycle of a Flink
> cluster. For example, a user might take a savepoint at a custom path
> that’s different than the default savepoint path, I think jobmanager
>
> would
>
> not remember that, not to mention the jobmanager may be a fresh new
> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> probably a best-effort one.
>
> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> Savepoints are alias for nested transactions in DB area[1], and there’s
> correspondingly global transactions. If we consider Flink jobs as
> global transactions and Flink checkpoints as nested transactions,
> then the savepoint semantics are close, thus I think savepoint syntax
> in SQL-standard could be considered. But again, I’m don’t have very
> strong preference.
>
> Ping @Timo to get more inputs.
>
> [1] https://en.wikipedia.org/wiki/Nested_transaction <
>
> https://en.wikipedia.org/wiki/Nested_transaction>
>
>
> Best,
> Paul Lam
>
> 2022年5月18日 17:48,Jark Wu <im...@gmail.com> 写道:
>
> Hi Paul,
>
> 1) SHOW QUERIES
> +1 to add finished time, but it would be better to call it "end_time"
>
> to
>
> keep aligned with names in Web UI.
>
> 2) DROP QUERY
> I think we shouldn't throw exceptions for batch jobs, otherwise, how
>
> to
>
> stop batch queries?
> At present, I don't think "DROP" is a suitable keyword for this
>
> statement.
>
> From the perspective of users, "DROP" sounds like the query should be
> removed from the
> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>
> more
>
> suitable and
> compliant with commands of Flink CLI.
>
> 3) SHOW SAVEPOINTS
> I think this statement is needed, otherwise, savepoints are lost
>
> after the
>
> SAVEPOINT
> command is executed. Savepoints can be retrieved from REST API
> "/jobs/:jobid/checkpoints"
> with filtering "checkpoint_type"="savepoint". It's also worth
>
> considering
>
> providing "SHOW CHECKPOINTS"
> to list all checkpoints.
>
> 4) SAVEPOINT & RELEASE SAVEPOINT
> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>
> statements
>
> now.
> In the vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>
> both
>
> the same savepoint id.
> However, in our syntax, the first one is query id, and the second one
>
> is
>
> savepoint path, which is confusing and
> not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>
> they
>
> should be in the same syntax set.
> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> <sp_path>.
> That means we don't follow the majority of vendors in SAVEPOINT
>
> commands. I
>
> would say the purpose is different in Flink.
> What other's opinion on this?
>
> Best,
> Jark
>
> [1]:
>
>
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>
>
>
> On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
>
> Hi Godfrey,
>
> Thanks a lot for your inputs!
>
> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>
> (DataStream
>
> or SQL) or
> clients (SQL client or CLI). Under the hook, it’s based on
> ClusterClient#listJobs, the
> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>
> in SQL
>
> client, because
> these jobs can be managed via SQL client too.
>
> WRT finished time, I think you’re right. Adding it to the FLIP. But
>
> I’m a
>
> bit afraid that the
> rows would be too long.
>
> WRT ‘DROP QUERY’,
>
> What's the behavior for batch jobs and the non-running jobs?
>
>
>
> In general, the behavior would be aligned with Flink CLI. Triggering
>
> a
>
> savepoint for
> a non-running job would cause errors, and the error message would be
> printed to
> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
> streaming
> execution mode would be the same with streaming jobs. However, for
>
> batch
>
> jobs in
> batch execution mode, I think there would be an error, because batch
> execution
> doesn’t support checkpoints currently (please correct me if I’m
>
> wrong).
>
>
> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>
> clusterClient/
>
> jobClient doesn’t have such a functionality at the moment, neither do
> Flink CLI.
> Maybe we could make it a follow-up FLIP, which includes the
>
> modifications
>
> to
> clusterClient/jobClient and Flink CLI. WDYT?
>
> Best,
> Paul Lam
>
> 2022年5月17日 20:34,godfrey he <go...@gmail.com> 写道:
>
> Godfrey
>
>
>
>
>
>
>
>
>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi team,

It's great to see our opinions are finally converging!

> `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `


LGTM. Adding it to the FLIP.

To Jark,

> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>”

Good point. The default savepoint dir should be enough for most cases.

To Jing,

> DROP SAVEPOINT ALL

I think it’s valid to have such a statement, but I have two concerns:
`ALL` is already an SQL keyword, thus it may cause ambiguity.
Flink CLI and REST API doesn’t provided the corresponding functionalities, and we’d better keep them aligned.
How about making this statement as follow-up tasks which should touch REST API and Flink CLI?

Best,
Paul Lam

> 2022年6月9日 11:53,godfrey he <go...@gmail.com> 写道:
> 
> Hi all,
> 
> Regarding `PIPELINE`, it comes from flink-core module, see
> `PipelineOptions` class for more details.
> `JOBS` is a more generic concept than `PIPELINES`. I'm also be fine with `JOBS`.
> 
> +1 to discuss JOBTREE in other FLIP.
> 
> +1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] `
> 
> +1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
> 
> Best,
> Godfrey
> 
> Jing Ge <ji...@ververica.com> 于2022年6月9日周四 01:48写道:
>> 
>> Hi Paul, Hi Jark,
>> 
>> Re JOBTREE, agree that it is out of the scope of this FLIP
>> 
>> Re `RELEASE SAVEPOINT ALL', if the community prefers 'DROP' then 'DROP SAVEPOINT ALL' housekeeping. WDYT?
>> 
>> Best regards,
>> Jing
>> 
>> 
>> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <im...@gmail.com> wrote:
>>> 
>>> Hi Jing,
>>> 
>>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the scope
>>> of this FLIP and can be discussed in another FLIP.
>>> 
>>> Job lineage is a big topic that may involve many problems:
>>> 1) how to collect and report job entities, attributes, and lineages?
>>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>>> 3) how does Flink SQL CLI/Gateway know the lineage information and show jobtree?
>>> 4) ...
>>> 
>>> Best,
>>> Jark
>>> 
>>> On Wed, 8 Jun 2022 at 20:44, Jark Wu <im...@gmail.com> wrote:
>>>> 
>>>> Hi Paul,
>>>> 
>>>> I'm fine with using JOBS. The only concern is that this may conflict with displaying more detailed
>>>> information for query (e.g. query content, plan) in the future, e.g. SHOW QUERIES EXTENDED in ksqldb[1].
>>>> This is not a big problem as we can introduce SHOW QUERIES in the future if necessary.
>>>> 
>>>>> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and `table.job.stop-with-drain`)
>>>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
>>>> It might be trivial and error-prone to set configuration before executing a statement,
>>>> and the configuration will affect all statements after that.
>>>> 
>>>>> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>>>> and always use configuration "state.savepoints.dir" as the default savepoint dir.
>>>> The concern with using "<savepoint_path>" is here should be savepoint dir,
>>>> and savepoint_path is the returned value.
>>>> 
>>>> I'm fine with other changes.
>>>> 
>>>> Thanks,
>>>> Jark
>>>> 
>>>> [1]: https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>>> 
>>>> 
>>>> 
>>>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:
>>>>> 
>>>>> Hi Jing,
>>>>> 
>>>>> Thank you for your inputs!
>>>>> 
>>>>> TBH, I haven’t considered the ETL scenario that you mentioned. I think they’re managed just like other jobs interns of job lifecycles (please correct me if I’m wrong).
>>>>> 
>>>>> WRT to the SQL statements about SQL lineages, I think it might be a little bit out of the scope of the FLIP, since it’s mainly about lifecycles. By the way, do we have these functionalities in Flink CLI or REST API already?
>>>>> 
>>>>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m updating the FLIP according to the latest discussions.
>>>>> 
>>>>> Best,
>>>>> Paul Lam
>>>>> 
>>>>> On Jun 8, 2022, at 07:31, Jing Ge <ji...@ververica.com> wrote:
>>>>> 
>>>>> Hi Paul,
>>>>> 
>>>>> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs.
>>>>> 
>>>>> Have you considered the ETL scenario with Flink SQL, where multiple SQLs build a DAG for many DAGs?
>>>>> 
>>>>> 1)
>>>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, SQLs that used to build the DAG are responsible to *produce* data as the result(cube, materialized view, etc.) for the future consumption by queries. The INSERT INTO SELECT FROM example in FLIP and CTAS are typical SQL in this case. I would prefer to call them Jobs instead of Queries.
>>>>> 
>>>>> 2)
>>>>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
>>>>> 
>>>>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>>>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
>>>>> SHOW JOBTREES // shows all DAGs
>>>>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>>>> 
>>>>> 3)
>>>>> Could we also support savepoint housekeeping syntax? We ran into the issue that a lot of savepoints had been created by customers (via their apps), and it took extra (hacking) effort to clean them up.
>>>>> 
>>>>> RELEASE SAVEPOINT ALL
>>>>> 
>>>>> Best regards,
>>>>> Jing
>>>>> 
>>>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org> wrote:
>>>>>> 
>>>>>> Hi Paul,
>>>>>> 
>>>>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
>>>>>> imply that this will actually show the query, but we're returning IDs of
>>>>>> the running application. At first I was also not very much in favour of
>>>>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>>>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>>>>> 
>>>>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>>>> 
>>>>>> Best regards,
>>>>>> 
>>>>>> Martijn
>>>>>> 
>>>>>> [1]
>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>>>> 
>>>>>> Op za 4 jun. 2022 om 10:38 schreef Paul Lam <pa...@gmail.com>:
>>>>>> 
>>>>>>> Hi Godfrey,
>>>>>>> 
>>>>>>> Sorry for the late reply, I was on vacation.
>>>>>>> 
>>>>>>> It looks like we have a variety of preferences on the syntax, how about we
>>>>>>> choose the most acceptable one?
>>>>>>> 
>>>>>>> WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
>>>>>>> would be:
>>>>>>> 
>>>>>>> - SHOW JOBS
>>>>>>> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>>>>>> `table.job.stop-with-drain`)
>>>>>>> 
>>>>>>> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>>>>>> JOB`:
>>>>>>> 
>>>>>>> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>>>>> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>>>>>> manager remembers)
>>>>>>> - DROP SAVEPOINT <savepoint_path>
>>>>>>> 
>>>>>>> cc @Jark @ShengKai @Martijn @Timo .
>>>>>>> 
>>>>>>> Best,
>>>>>>> Paul Lam
>>>>>>> 
>>>>>>> 
>>>>>>> godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>>>>>>> 
>>>>>>>> Hi Paul,
>>>>>>>> 
>>>>>>>> Thanks for the update.
>>>>>>>> 
>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>> (DataStream or SQL) or
>>>>>>>> clients (SQL client or CLI).
>>>>>>>> 
>>>>>>>> Is a DataStream job a QUERY? I think not.
>>>>>>>> For a QUERY, the most important concept is the statement. But the
>>>>>>>> result does not contain this info.
>>>>>>>> If we need to contain all jobs in the cluster, I think the name should
>>>>>>>> be JOB or PIPELINE.
>>>>>>>> I lean to SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>>>>>>> 
>>>>>>>>> SHOW SAVEPOINTS
>>>>>>>> To list the savepoint for a specific job, we need to specify a
>>>>>>>> specific pipeline,
>>>>>>>> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Godfrey
>>>>>>>> 
>>>>>>>> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>>>>>>>>> 
>>>>>>>>> Hi Jark,
>>>>>>>>> 
>>>>>>>>> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>>>>>>>> part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>>>>>>>> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>>>>>>>> 
>>>>>>>>> Another question is, what should be the syntax for ungracefully
>>>>>>>>> canceling a query? As ShengKai pointed out in an offline discussion,
>>>>>>>>> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>>>>>>>> Flink CLI has both stop and cancel, mostly due to historical problems.
>>>>>>>>> 
>>>>>>>>> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>>>>>>>> that savepoints are owned by users and beyond the lifecycle of a Flink
>>>>>>>>> cluster. For example, a user might take a savepoint at a custom path
>>>>>>>>> that’s different than the default savepoint path, I think jobmanager
>>>>>>>> would
>>>>>>>>> not remember that, not to mention the jobmanager may be a fresh new
>>>>>>>>> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
>>>>>>>>> probably a best-effort one.
>>>>>>>>> 
>>>>>>>>> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>>>>>>>>> Savepoints are aliases for nested transactions in the DB area [1], and there’s
>>>>>>>>> correspondingly global transactions. If we consider Flink jobs as
>>>>>>>>> global transactions and Flink checkpoints as nested transactions,
>>>>>>>>> then the savepoint semantics are close, thus I think savepoint syntax
>>>>>>>>> in SQL-standard could be considered. But again, I don’t have a very
>>>>>>>>> strong preference.
>>>>>>>>> 
>>>>>>>>> Ping @Timo to get more inputs.
>>>>>>>>> 
>>>>>>>>> [1] https://en.wikipedia.org/wiki/Nested_transaction <
>>>>>>>> https://en.wikipedia.org/wiki/Nested_transaction>
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Paul Lam
>>>>>>>>> 
>>>>>>>>>> On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>> Hi Paul,
>>>>>>>>>> 
>>>>>>>>>> 1) SHOW QUERIES
>>>>>>>>>> +1 to add finished time, but it would be better to call it "end_time"
>>>>>>>> to
>>>>>>>>>> keep aligned with names in Web UI.
>>>>>>>>>> 
>>>>>>>>>> 2) DROP QUERY
>>>>>>>>>> I think we shouldn't throw exceptions for batch jobs, otherwise, how
>>>>>>>> to
>>>>>>>>>> stop batch queries?
>>>>>>>>>> At present, I don't think "DROP" is a suitable keyword for this
>>>>>>>> statement.
>>>>>>>>>> From the perspective of users, "DROP" sounds like the query should be
>>>>>>>>>> removed from the
>>>>>>>>>> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>>>>>>>> more
>>>>>>>>>> suitable and
>>>>>>>>>> compliant with commands of Flink CLI.
>>>>>>>>>> 
>>>>>>>>>> 3) SHOW SAVEPOINTS
>>>>>>>>>> I think this statement is needed, otherwise, savepoints are lost
>>>>>>>> after the
>>>>>>>>>> SAVEPOINT
>>>>>>>>>> command is executed. Savepoints can be retrieved from REST API
>>>>>>>>>> "/jobs/:jobid/checkpoints"
>>>>>>>>>> with filtering "checkpoint_type"="savepoint". It's also worth
>>>>>>>> considering
>>>>>>>>>> providing "SHOW CHECKPOINTS"
>>>>>>>>>> to list all checkpoints.
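The retrieval described above can be sketched as follows. The field names follow the `/jobs/:jobid/checkpoints` REST response as described in this thread (actual casing of the values may differ by Flink version), and the sample payload is hand-written for illustration:

```python
# Sketch: list savepoints for a job from the Flink REST API
# /jobs/:jobid/checkpoints response. Field names and values follow the
# description in this thread; the payload below is a hand-written
# illustration, not real cluster output.

def list_savepoints(checkpoint_history):
    """Keep only completed checkpoints that were triggered as savepoints."""
    return [
        cp["external_path"]
        for cp in checkpoint_history.get("history", [])
        if cp.get("checkpoint_type") == "savepoint"
        and cp.get("status") == "COMPLETED"
    ]

sample = {
    "history": [
        {"id": 12, "checkpoint_type": "checkpoint", "status": "COMPLETED",
         "external_path": "hdfs:///checkpoints/chk-12"},
        {"id": 13, "checkpoint_type": "savepoint", "status": "COMPLETED",
         "external_path": "hdfs:///savepoints/savepoint-abc"},
    ],
}

print(list_savepoints(sample))  # -> ['hdfs:///savepoints/savepoint-abc']
```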
>>>>>>>>>> 
>>>>>>>>>> 4) SAVEPOINT & RELEASE SAVEPOINT
>>>>>>>>>> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>>>>>> statements
>>>>>>>>>> now.
>>>>>>>>>> In other vendors, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>>>>>>>> both
>>>>>>>>>> the same savepoint id.
>>>>>>>>>> However, in our syntax, the first one is query id, and the second one
>>>>>>>> is
>>>>>>>>>> savepoint path, which is confusing and
>>>>>>>>>> not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>>>>>>>> they
>>>>>>>>>> should be in the same syntax set.
>>>>>>>>>> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
>>>>>>>>>> <sp_path>.
>>>>>>>>>> That means we don't follow the majority of vendors in SAVEPOINT
>>>>>>>> commands. I
>>>>>>>>>> would say the purpose is different in Flink.
>>>>>>>>>> What other's opinion on this?
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Jark
>>>>>>>>>> 
>>>>>>>>>> [1]:
>>>>>>>>>> 
>>>>>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Hi Godfrey,
>>>>>>>>>>> 
>>>>>>>>>>> Thanks a lot for your inputs!
>>>>>>>>>>> 
>>>>>>>>>>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>>>>>> (DataStream
>>>>>>>>>>> or SQL) or
>>>>>>>>>>> clients (SQL client or CLI). Under the hood, it’s based on
>>>>>>>>>>> ClusterClient#listJobs, the
>>>>>>>>>>> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>>>>>>>> in SQL
>>>>>>>>>>> client, because
>>>>>>>>>>> these jobs can be managed via SQL client too.
>>>>>>>>>>> 
>>>>>>>>>>> WRT finished time, I think you’re right. Adding it to the FLIP. But
>>>>>>>> I’m a
>>>>>>>>>>> bit afraid that the
>>>>>>>>>>> rows would be too long.
>>>>>>>>>>> 
>>>>>>>>>>> WRT ‘DROP QUERY’,
>>>>>>>>>>>> What's the behavior for batch jobs and the non-running jobs?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> In general, the behavior would be aligned with Flink CLI. Triggering
>>>>>>>> a
>>>>>>>>>>> savepoint for
>>>>>>>>>>> a non-running job would cause errors, and the error message would be
>>>>>>>>>>> printed to
>>>>>>>>>>> the SQL client. Triggering a savepoint for batch (unbounded) jobs in
>>>>>>>>>>> streaming
>>>>>>>>>>> execution mode would be the same as for streaming jobs. However, for
>>>>>>>> batch
>>>>>>>>>>> jobs in
>>>>>>>>>>> batch execution mode, I think there would be an error, because batch
>>>>>>>>>>> execution
>>>>>>>>>>> doesn’t support checkpoints currently (please correct me if I’m
>>>>>>>> wrong).
>>>>>>>>>>> 
>>>>>>>>>>> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>>>>>>> clusterClient/
>>>>>>>>>>> jobClient doesn’t have such a functionality at the moment, neither do
>>>>>>>>>>> Flink CLI.
>>>>>>>>>>> Maybe we could make it a follow-up FLIP, which includes the
>>>>>>>> modifications
>>>>>>>>>>> to
>>>>>>>>>>> clusterClient/jobClient and Flink CLI. WDYT?
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Paul Lam
>>>>>>>>>>> 
>>>>>>>>>>>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>> Godfrey
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by godfrey he <go...@gmail.com>.
Hi all,

Regarding `PIPELINE`, it comes from the flink-core module; see the
`PipelineOptions` class for more details.
`JOBS` is a more generic concept than `PIPELINES`. I'm also fine with `JOBS`.

+1 to discuss JOBTREE in another FLIP.

+1 to `STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN]`

+1 to `CREATE SAVEPOINT FOR JOB <job_id>` and `DROP SAVEPOINT <savepoint_path>`
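Put together, the statement set converged on above would read like this in the SQL client (proposed syntax from this thread, not implemented yet; the job id and savepoint path are placeholders):

```sql
SHOW JOBS;
STOP JOB '<job_id>' WITH SAVEPOINT WITH DRAIN;
CREATE SAVEPOINT FOR JOB '<job_id>';
SHOW SAVEPOINTS FOR JOB '<job_id>';
DROP SAVEPOINT '<savepoint_path>';
```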

Best,
Godfrey

Jing Ge <ji...@ververica.com> wrote on Thu, Jun 9, 2022 at 01:48:
>
> Hi Paul, Hi Jark,
>
> Re JOBTREE, agree that it is out of the scope of this FLIP
>
> Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP SAVEPOINT ALL` could serve as the housekeeping statement. WDYT?
>
> Best regards,
> Jing
>
>
> On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <im...@gmail.com> wrote:
>>
>> Hi Jing,
>>
>> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the scope
>>  of this FLIP and can be discussed in another FLIP.
>>
>> Job lineage is a big topic that may involve many problems:
>> 1) how to collect and report job entities, attributes, and lineages?
>> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
>> 3) how does Flink SQL CLI/Gateway know the lineage information and show jobtree?
>> 4) ...
>>
>> Best,
>> Jark
>>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jing Ge <ji...@ververica.com>.
Hi Paul, Hi Jark,

Re JOBTREE: agreed that it is out of the scope of this FLIP.

Re `RELEASE SAVEPOINT ALL`: if the community prefers `DROP`, then `DROP
SAVEPOINT ALL` could serve as the housekeeping statement. WDYT?
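Until such a statement exists, bulk housekeeping has to go through the REST API's savepoint-disposal operation. A rough sketch of what the housekeeping statement would do under the hood (the helper only builds request payloads; the endpoint and field names are taken from the REST API's savepoint-disposal operation, and the paths are made up for illustration):

```python
# Sketch of bulk savepoint disposal: build one REST request payload per
# savepoint path, as a 'DROP SAVEPOINT ALL' statement might do internally.
# Endpoint and field name assumed from the Flink REST API's
# savepoint-disposal operation; nothing here talks to a real cluster.

DISPOSAL_ENDPOINT = "/savepoint-disposal"

def disposal_requests(savepoint_paths):
    """One (endpoint, POST body) pair per savepoint to dispose."""
    return [
        (DISPOSAL_ENDPOINT, {"savepoint-path": path})
        for path in savepoint_paths
    ]

requests_to_send = disposal_requests([
    "hdfs:///flink/savepoints/savepoint-aaaa",
    "hdfs:///flink/savepoints/savepoint-bbbb",
])
print(len(requests_to_send))  # -> 2
```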

Best regards,
Jing


On Wed, Jun 8, 2022 at 2:54 PM Jark Wu <im...@gmail.com> wrote:

> Hi Jing,
>
> Regarding JOBTREE (job lineage), I agree with Paul that this is out of the
> scope
>  of this FLIP and can be discussed in another FLIP.
>
> Job lineage is a big topic that may involve many problems:
> 1) how to collect and report job entities, attributes, and lineages?
> 2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
> 3) how does Flink SQL CLI/Gateway know the lineage information and show
> jobtree?
> 4) ...
>
> Best,
> Jark
>
> On Wed, 8 Jun 2022 at 20:44, Jark Wu <im...@gmail.com> wrote:
>
>> Hi Paul,
>>
>> I'm fine with using JOBS. The only concern is that this may conflict with
>> displaying more detailed
>> information for query (e.g. query content, plan) in the future, e.g. SHOW
>> QUERIES EXTENDED in ksqldb[1].
>> This is not a big problem as we can introduce SHOW QUERIES in the future
>> if necessary.
>>
>> > STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> `table.job.stop-with-drain`)
>> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
>> It might be trivial and error-prone to set configuration before executing
>> a statement,
>> and the configuration will affect all statements after that.
>>
>> > CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
>> and always use configuration "state.savepoints.dir" as the default
>> savepoint dir.
>> The concern with using "<savepoint_path>" is here should be savepoint
>> dir,
>> and savepoint_path is the returned value.
>>
>> I'm fine with other changes.
>>
>> Thanks,
>> Jark
>>
>> [1]:
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>
>>
>>
>>
>> On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:
>>
>>> Hi Jing,
>>>
>>> Thank you for your inputs!
>>>
>>> TBH, I haven’t considered the ETL scenario that you mentioned. I think
>>> they’re managed just like other jobs interns of job lifecycles (please
>>> correct me if I’m wrong).
>>>
>>> WRT to the SQL statements about SQL lineages, I think it might be a
>>> little bit out of the scope of the FLIP, since it’s mainly about
>>> lifecycles. By the way, do we have these functionalities in Flink CLI or
>>> REST API already?
>>>
>>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the
>>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m
>>> updating the FLIP arcading to the latest discussions.
>>>
>>> Best,
>>> Paul Lam
>>>
>>> 2022年6月8日 07:31,Jing Ge <ji...@ververica.com> 写道:
>>>
>>> Hi Paul,
>>>
>>> Sorry that I am a little bit too late to join this thread. Thanks for
>>> driving this and starting this informative discussion. The FLIP looks
>>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>>
>>> Have you considered the ETL scenario with Flink SQL, where multiple SQLs
>>> build a DAG for many DAGs?
>>>
>>> 1)
>>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how
>>> to support ETL jobs. Briefly speaking, SQLs that used to build the DAG are
>>> responsible to *produce* data as the result(cube, materialized view, etc.)
>>> for the future consumption by queries. The INSERT INTO SELECT FROM example
>>> in FLIP and CTAS are typical SQL in this case. I would prefer to call them
>>> Jobs instead of Queries.
>>>
>>> 2)
>>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to
>>> support syntax like:
>>>
>>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the
>>> given job_id
>>> SHOW JOBTREES // shows all DAGs
>>> SHOW ANCIENTS <job_id> // shows all parents of the given job_id
>>>
>>> 3)
>>> Could we also support Savepoint housekeeping syntax? We ran into this
>>> issue that a lot of savepoints have been created by customers (via their
>>> apps). It will take extra (hacking) effort to clean it.
>>>
>>> RELEASE SAVEPOINT ALL
>>>
>>> Best regards,
>>> Jing
>>>
>>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
>>> wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>>>> could
>>>> imply that this will actually show the query, but we're returning IDs of
>>>> the running application. At first I was also not very much in favour of
>>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP
>>>> JOBS
>>>>
>>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>>
>>>> Best regards,
>>>>
>>>> Martijn
>>>>
>>>> [1]
>>>>
>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>>
>>>> Op za 4 jun. 2022 om 10:38 schreef Paul Lam <pa...@gmail.com>:
>>>>
>>>> > Hi Godfrey,
>>>> >
>>>> > Sorry for the late reply, I was on vacation.
>>>> >
>>>> > It looks like we have a variety of preferences on the syntax, how
>>>> about we
>>>> > choose the most acceptable one?
>>>> >
>>>> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to
>>>> jobs
>>>> > would be:
>>>> >
>>>> > - SHOW JOBS
>>>> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>>> > `table.job.stop-with-drain`)
>>>> >
>>>> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>>> > JOB`:
>>>> >
>>>> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>>> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>>> > manager remembers)
>>>> > - DROP SAVEPOINT <savepoint_path>
>>>> >
>>>> > cc @Jark @ShengKai @Martijn @Timo .
>>>> >
>>>> > Best,
>>>> > Paul Lam
>>>> >
>>>> >
>>>> > godfrey he <go...@gmail.com> 于2022年5月23日周一 21:34写道:
>>>> >
>>>> >> Hi Paul,
>>>> >>
>>>> >> Thanks for the update.
>>>> >>
>>>> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>> >> (DataStream or SQL) or
>>>> >> clients (SQL client or CLI).
>>>> >>
>>>> >> Is DataStream job a QUERY? I think not.
>>>> >> For a QUERY, the most important concept is the statement. But the
>>>> >> result does not contain this info.
>>>> >> If we need to contain all jobs in the cluster, I think the name
>>>> should
>>>> >> be JOB or PIPELINE.
>>>> >> I learn to SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>>> >>
>>>> >> > SHOW SAVEPOINTS
>>>> >> To list the savepoint for a specific job, we need to specify a
>>>> >> specific pipeline,
>>>> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>>> >>
>>>> >> Best,
>>>> >> Godfrey
>>>> >>
>>>> >> Paul Lam <pa...@gmail.com> 于2022年5月20日周五 11:25写道:
>>>> >> >
>>>> >> > Hi Jark,
>>>> >> >
>>>> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>>> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>>> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>>> >> >
>>>> >> > Another question is, what should be the syntax for ungracefully
>>>> >> > canceling a query? As ShengKai pointed out in an offline discussion,
>>>> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>>> >> > Flink CLI has both stop and cancel, mostly due to historical
>>>> problems.
>>>> >> >
>>>> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>>> >> > that savepoints are owned by users and beyond the lifecycle of a
>>>> Flink
>>>> >> > cluster. For example, a user might take a savepoint at a custom
>>>> path
>>>> >> > that’s different than the default savepoint path, I think
>>>> jobmanager
>>>> >> would
>>>> >> > not remember that, not to mention the jobmanager may be a fresh new
>>>> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”,
>>>> it's
>>>> >> > probably a best-effort one.
>>>> >> >
>>>> >> > WRT savepoint syntax, I’m thinking of the semantic of the
>>>> savepoint.
>>>> >> > Savepoints are aliases for nested transactions in the DB area[1], and
>>>> there’s
>>>> >> > correspondingly global transactions. If we consider Flink jobs as
>>>> >> > global transactions and Flink checkpoints as nested transactions,
>>>> >> > then the savepoint semantics are close, thus I think savepoint
>>>> syntax
>>>> >> > in SQL-standard could be considered. But again, I don’t have a very
>>>> >> > strong preference.
>>>> >> >
>>>> >> > Ping @Timo to get more inputs.
>>>> >> >
>>>> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
>>>> >> >
>>>> >> > Best,
>>>> >> > Paul Lam
>>>> >> >
>>>> >> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
>>>> >> > >
>>>> >> > > Hi Paul,
>>>> >> > >
>>>> >> > > 1) SHOW QUERIES
>>>> >> > > +1 to add finished time, but it would be better to call it
>>>> "end_time"
>>>> >> to
>>>> >> > > keep aligned with names in Web UI.
>>>> >> > >
>>>> >> > > 2) DROP QUERY
>>>> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise,
>>>> how
>>>> >> to
>>>> >> > > stop batch queries?
>>>> >> > > At present, I don't think "DROP" is a suitable keyword for this
>>>> >> statement.
>>>> >> > > From the perspective of users, "DROP" sounds like the query
>>>> should be
>>>> >> > > removed from the
>>>> >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY"
>>>> is
>>>> >> more
>>>> >> > > suitable and
>>>> >> > > compliant with commands of Flink CLI.
>>>> >> > >
>>>> >> > > 3) SHOW SAVEPOINTS
>>>> >> > > I think this statement is needed, otherwise, savepoints are lost
>>>> >> after the
>>>> >> > > SAVEPOINT
>>>> >> > > command is executed. Savepoints can be retrieved from REST API
>>>> >> > > "/jobs/:jobid/checkpoints"
>>>> >> > > with filtering "checkpoint_type"="savepoint". It's also worth
>>>> >> considering
>>>> >> > > providing "SHOW CHECKPOINTS"
>>>> >> > > to list all checkpoints.
>>>> >> > >
>>>> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
>>>> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>>> >> statements
>>>> >> > > now.
>>>> >> > > In other vendors' systems, the parameters of SAVEPOINT and RELEASE
>>>> SAVEPOINT are
>>>> >> both
>>>> >> > > the same savepoint id.
>>>> >> > > However, in our syntax, the first one is query id, and the
>>>> second one
>>>> >> is
>>>> >> > > savepoint path, which is confusing and
>>>> >> > > not consistent. When I came across SHOW SAVEPOINT, I thought
>>>> maybe
>>>> >> they
>>>> >> > > should be in the same syntax set.
>>>> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>>>> SAVEPOINT
>>>> >> > > <sp_path>.
>>>> >> > > That means we don't follow the majority of vendors in SAVEPOINT
>>>> >> commands. I
>>>> >> > > would say the purpose is different in Flink.
>>>> >> > > What are others' opinions on this?
>>>> >> > >
>>>> >> > > Best,
>>>> >> > > Jark
>>>> >> > >
>>>> >> > > [1]:
>>>> >> > >
>>>> >>
>>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>>> >> > >
>>>> >> > >
>>>> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com>
>>>> wrote:
>>>> >> > >
>>>> >> > >> Hi Godfrey,
>>>> >> > >>
>>>> >> > >> Thanks a lot for your inputs!
>>>> >> > >>
>>>> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>>> >> (DataStream
>>>> >> > >> or SQL) or
>>>> >> > >> clients (SQL client or CLI). Under the hood, it’s based on
>>>> >> > >> ClusterClient#listJobs, the
>>>> >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs
>>>> listed
>>>> >> in SQL
>>>> >> > >> client, because
>>>> >> > >> these jobs can be managed via SQL client too.
>>>> >> > >>
>>>> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP.
>>>> But
>>>> >> I’m a
>>>> >> > >> bit afraid that the
>>>> >> > >> rows would be too long.
>>>> >> > >>
>>>> >> > >> WRT ‘DROP QUERY’,
>>>> >> > >>> What's the behavior for batch jobs and the non-running jobs?
>>>> >> > >>
>>>> >> > >>
>>>> >> > >> In general, the behavior would be aligned with Flink CLI.
>>>> Triggering
>>>> >> a
>>>> >> > >> savepoint for
>>>> >> > >> a non-running job would cause errors, and the error message
>>>> would be
>>>> >> > >> printed to
>>>> >> > >> the SQL client. Triggering a savepoint for batch (unbounded)
>>>> jobs in
>>>> >> > >> streaming
>>>> >> > >> execution mode would be the same as for streaming jobs. However,
>>>> for
>>>> >> batch
>>>> >> > >> jobs in
>>>> >> > >> batch execution mode, I think there would be an error, because
>>>> batch
>>>> >> > >> execution
>>>> >> > >> doesn’t support checkpoints currently (please correct me if I’m
>>>> >> wrong).
>>>> >> > >>
>>>> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>>> >> clusterClient/
>>>> >> > >> jobClient doesn’t have such a functionality at the moment,
>>>> neither does
>>>> >> > >> Flink CLI.
>>>> >> > >> Maybe we could make it a follow-up FLIP, which includes the
>>>> >> modifications
>>>> >> > >> to
>>>> >> > >> clusterClient/jobClient and Flink CLI. WDYT?
>>>> >> > >>
>>>> >> > >> Best,
>>>> >> > >> Paul Lam
>>>> >> > >>
>>>> >> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>>>> >> > >>>
>>>> >> > >>> Godfrey
>>>> >> > >>
>>>> >> > >>
>>>> >> >
>>>> >>
>>>> >
>>>>
>>>
>>>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jark Wu <im...@gmail.com>.
Hi Jing,

Regarding JOBTREE (job lineage), I agree with Paul that this is out of the
scope
 of this FLIP and can be discussed in another FLIP.

Job lineage is a big topic that may involve many problems:
1) how to collect and report job entities, attributes, and lineages?
2) how to integrate with data catalogs, e.g. Apache Atlas, DataHub?
3) how does Flink SQL CLI/Gateway know the lineage information and show
jobtree?
4) ...

Best,
Jark

On Wed, 8 Jun 2022 at 20:44, Jark Wu <im...@gmail.com> wrote:

> Hi Paul,
>
> I'm fine with using JOBS. The only concern is that this may conflict with
> displaying more detailed
> information for query (e.g. query content, plan) in the future, e.g. SHOW
> QUERIES EXTENDED in ksqldb[1].
> This is not a big problem as we can introduce SHOW QUERIES in the future
> if necessary.
>
> > STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> `table.job.stop-with-drain`)
> What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
> It might be tedious and error-prone to set configuration before executing
> a statement,
> and the configuration will affect all statements after that.
>
> > CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
> and always use configuration "state.savepoints.dir" as the default
> savepoint dir.
> The concern with using "<savepoint_path>" is that what goes here should be a
> savepoint dir, while the savepoint_path is the returned value.
>
> I'm fine with other changes.
>
> Thanks,
> Jark
>
> [1]:
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>
>
>
>
> On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:
>
>> Hi Jing,
>>
>> Thank you for your inputs!
>>
>> TBH, I haven’t considered the ETL scenario that you mentioned. I think
>> they’re managed just like other jobs in terms of job lifecycles (please
>> correct me if I’m wrong).
>>
>> WRT the SQL statements about SQL lineages, I think it might be a
>> little bit out of the scope of the FLIP, since it’s mainly about
>> lifecycles. By the way, do we have these functionalities in Flink CLI or
>> REST API already?
>>
>> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the
>> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m
>> updating the FLIP according to the latest discussions.
>>
>> Best,
>> Paul Lam
>>
>> On Jun 8, 2022, at 07:31, Jing Ge <ji...@ververica.com> wrote:
>>
>> Hi Paul,
>>
>> Sorry that I am a little bit too late to join this thread. Thanks for
>> driving this and starting this informative discussion. The FLIP looks
>> really interesting. It will help us a lot to manage Flink SQL jobs.
>>
>> Have you considered the ETL scenario with Flink SQL, where multiple SQL
>> jobs build a DAG, with possibly many such DAGs?
>>
>> 1)
>> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to
>> support ETL jobs. Briefly speaking, the SQLs used to build the DAG are
>> responsible for *producing* data as the result (cube, materialized view, etc.)
>> for future consumption by queries. The INSERT INTO SELECT FROM example
>> in FLIP and CTAS are typical SQL in this case. I would prefer to call them
>> Jobs instead of Queries.
>>
>> 2)
>> Speaking of ETL DAG, we might want to see the lineage. Is it possible to
>> support syntax like:
>>
>> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
>> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given
>> job_id
>> SHOW JOBTREES // shows all DAGs
>> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>>
>> 3)
>> Could we also support Savepoint housekeeping syntax? We ran into this
>> issue that a lot of savepoints have been created by customers (via their
>> apps). It will take extra (hacking) effort to clean it.
>>
>> RELEASE SAVEPOINT ALL
>>
>> Best regards,
>> Jing
>>
>> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
>> wrote:
>>
>>> Hi Paul,
>>>
>>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>>> could
>>> imply that this will actually show the query, but we're returning IDs of
>>> the running application. At first I was also not very much in favour of
>>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>>
>>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> [1]
>>>
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>>
>>> On Sat, Jun 4, 2022 at 10:38, Paul Lam <pa...@gmail.com> wrote:
>>>
>>> > Hi Godfrey,
>>> >
>>> > Sorry for the late reply, I was on vacation.
>>> >
>>> > It looks like we have a variety of preferences on the syntax, how
>>> about we
>>> > choose the most acceptable one?
>>> >
>>> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to
>>> jobs
>>> > would be:
>>> >
>>> > - SHOW JOBS
>>> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>>> > `table.job.stop-with-drain`)
>>> >
>>> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>>> > JOB`:
>>> >
>>> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>>> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>>> > manager remembers)
>>> > - DROP SAVEPOINT <savepoint_path>
>>> >
>>> > cc @Jark @ShengKai @Martijn @Timo .
>>> >
>>> > Best,
>>> > Paul Lam
>>> >
>>> >
>>> > godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>>> >
>>> >> Hi Paul,
>>> >>
>>> >> Thanks for the update.
>>> >>
>>> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>> >> (DataStream or SQL) or
>>> >> clients (SQL client or CLI).
>>> >>
>>> >> Is DataStream job a QUERY? I think not.
>>> >> For a QUERY, the most important concept is the statement. But the
>>> >> result does not contain this info.
>>> >> If we need to contain all jobs in the cluster, I think the name should
>>> >> be JOB or PIPELINE.
>>> >> I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>>> >>
>>> >> > SHOW SAVEPOINTS
>>> >> To list the savepoint for a specific job, we need to specify a
>>> >> specific pipeline,
>>> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>> >>
>>> >> Best,
>>> >> Godfrey
>>> >>
>>> >> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>>> >> >
>>> >> > Hi Jark,
>>> >> >
>>> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>>> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
>>> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
>>> >> >
>>> >> > Another question is, what should be the syntax for ungracefully
>>> >> > canceling a query? As ShengKai pointed out in an offline discussion,
>>> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>>> >> > Flink CLI has both stop and cancel, mostly due to historical
>>> problems.
>>> >> >
>>> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>>> >> > that savepoints are owned by users and beyond the lifecycle of a
>>> Flink
>>> >> > cluster. For example, a user might take a savepoint at a custom path
>>> >> > that’s different than the default savepoint path, I think jobmanager
>>> >> would
>>> >> > not remember that, not to mention the jobmanager may be a fresh new
>>> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”,
>>> it's
>>> >> > probably a best-effort one.
>>> >> >
>>> >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>>> >> > Savepoints are aliases for nested transactions in the DB area[1], and
>>> there’s
>>> >> > correspondingly global transactions. If we consider Flink jobs as
>>> >> > global transactions and Flink checkpoints as nested transactions,
>>> >> > then the savepoint semantics are close, thus I think savepoint
>>> syntax
>>> >> > in SQL-standard could be considered. But again, I don’t have a very
>>> >> > strong preference.
>>> >> >
>>> >> > Ping @Timo to get more inputs.
>>> >> >
>>> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
>>> >> >
>>> >> > Best,
>>> >> > Paul Lam
>>> >> >
>>> >> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
>>> >> > >
>>> >> > > Hi Paul,
>>> >> > >
>>> >> > > 1) SHOW QUERIES
>>> >> > > +1 to add finished time, but it would be better to call it
>>> "end_time"
>>> >> to
>>> >> > > keep aligned with names in Web UI.
>>> >> > >
>>> >> > > 2) DROP QUERY
>>> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise,
>>> how
>>> >> to
>>> >> > > stop batch queries?
>>> >> > > At present, I don't think "DROP" is a suitable keyword for this
>>> >> statement.
>>> >> > > From the perspective of users, "DROP" sounds like the query
>>> should be
>>> >> > > removed from the
>>> >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>>> >> more
>>> >> > > suitable and
>>> >> > > compliant with commands of Flink CLI.
>>> >> > >
>>> >> > > 3) SHOW SAVEPOINTS
>>> >> > > I think this statement is needed, otherwise, savepoints are lost
>>> >> after the
>>> >> > > SAVEPOINT
>>> >> > > command is executed. Savepoints can be retrieved from REST API
>>> >> > > "/jobs/:jobid/checkpoints"
>>> >> > > with filtering "checkpoint_type"="savepoint". It's also worth
>>> >> considering
>>> >> > > providing "SHOW CHECKPOINTS"
>>> >> > > to list all checkpoints.
>>> >> > >
>>> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
>>> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>>> >> statements
>>> >> > > now.
>>> >> > > In other vendors' systems, the parameters of SAVEPOINT and RELEASE SAVEPOINT
>>> are
>>> >> both
>>> >> > > the same savepoint id.
>>> >> > > However, in our syntax, the first one is query id, and the second
>>> one
>>> >> is
>>> >> > > savepoint path, which is confusing and
>>> >> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>>> >> they
>>> >> > > should be in the same syntax set.
>>> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>>> SAVEPOINT
>>> >> > > <sp_path>.
>>> >> > > That means we don't follow the majority of vendors in SAVEPOINT
>>> >> commands. I
>>> >> > > would say the purpose is different in Flink.
>>> >> > > What are others' opinions on this?
>>> >> > >
>>> >> > > Best,
>>> >> > > Jark
>>> >> > >
>>> >> > > [1]:
>>> >> > >
>>> >>
>>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>>> >> > >
>>> >> > >
>>> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com>
>>> wrote:
>>> >> > >
>>> >> > >> Hi Godfrey,
>>> >> > >>
>>> >> > >> Thanks a lot for your inputs!
>>> >> > >>
>>> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>>> >> (DataStream
>>> >> > >> or SQL) or
>>> >> > >> clients (SQL client or CLI). Under the hood, it’s based on
>>> >> > >> ClusterClient#listJobs, the
>>> >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>>> listed
>>> >> in SQL
>>> >> > >> client, because
>>> >> > >> these jobs can be managed via SQL client too.
>>> >> > >>
>>> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP.
>>> But
>>> >> I’m a
>>> >> > >> bit afraid that the
>>> >> > >> rows would be too long.
>>> >> > >>
>>> >> > >> WRT ‘DROP QUERY’,
>>> >> > >>> What's the behavior for batch jobs and the non-running jobs?
>>> >> > >>
>>> >> > >>
>>> >> > >> In general, the behavior would be aligned with Flink CLI.
>>> Triggering
>>> >> a
>>> >> > >> savepoint for
>>> >> > >> a non-running job would cause errors, and the error message
>>> would be
>>> >> > >> printed to
>>> >> > >> the SQL client. Triggering a savepoint for batch (unbounded) jobs
>>> in
>>> >> > >> streaming
>>> >> > >> execution mode would be the same as for streaming jobs. However,
>>> for
>>> >> batch
>>> >> > >> jobs in
>>> >> > >> batch execution mode, I think there would be an error, because
>>> batch
>>> >> > >> execution
>>> >> > >> doesn’t support checkpoints currently (please correct me if I’m
>>> >> wrong).
>>> >> > >>
>>> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>>> >> clusterClient/
>>> >> > >> jobClient doesn’t have such a functionality at the moment,
>>> neither does
>>> >> > >> Flink CLI.
>>> >> > >> Maybe we could make it a follow-up FLIP, which includes the
>>> >> modifications
>>> >> > >> to
>>> >> > >> clusterClient/jobClient and Flink CLI. WDYT?
>>> >> > >>
>>> >> > >> Best,
>>> >> > >> Paul Lam
>>> >> > >>
>>> >> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>>> >> > >>>
>>> >> > >>> Godfrey
>>> >> > >>
>>> >> > >>
>>> >> >
>>> >>
>>> >
>>>
>>
>>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jark Wu <im...@gmail.com>.
Hi Paul,

I'm fine with using JOBS. The only concern is that this may conflict with
displaying more detailed
information for query (e.g. query content, plan) in the future, e.g. SHOW
QUERIES EXTENDED in ksqldb[1].
This is not a big problem as we can introduce SHOW QUERIES in the future if
necessary.

> STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
`table.job.stop-with-drain`)
What about STOP JOB <job_id> [WITH SAVEPOINT] [WITH DRAIN] ?
It might be tedious and error-prone to set configuration before executing a
statement,
and the configuration will affect all statements after that.

> CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
We can simplify the statement to "CREATE SAVEPOINT FOR JOB <job_id>",
and always use configuration "state.savepoints.dir" as the default
savepoint dir.
The concern with using "<savepoint_path>" is that what goes here should be a
savepoint dir, while the savepoint_path is the returned value.
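Taken together with the STOP JOB variant above, the statement set under discussion might look like this in an SQL client session. This is only a sketch of the proposed syntax (the job id and savepoint path are made-up examples), not a finalized grammar:

```sql
-- List all jobs in the cluster, SQL and non-SQL alike
SHOW JOBS;

-- Stop a job gracefully, taking a savepoint and draining in-flight data
STOP JOB '228d70913eab60dda85c5e7f78b5782c' WITH SAVEPOINT WITH DRAIN;

-- Trigger a savepoint; the target dir defaults to 'state.savepoints.dir'
CREATE SAVEPOINT FOR JOB '228d70913eab60dda85c5e7f78b5782c';

-- List the savepoints the job manager remembers for a job
SHOW SAVEPOINTS FOR JOB '228d70913eab60dda85c5e7f78b5782c';

-- Remove a savepoint by the path returned from CREATE SAVEPOINT
DROP SAVEPOINT 'hdfs:///flink/savepoints/savepoint-ab12cd';
```

Keeping WITH SAVEPOINT / WITH DRAIN as per-statement clauses avoids the session-wide configuration concern raised above.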

I'm fine with other changes.

Thanks,
Jark

[1]:
https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
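As noted earlier in the thread, savepoints can already be retrieved today from the REST API endpoint /jobs/:jobid/checkpoints by filtering on checkpoint_type. A minimal client-side sketch of that filtering follows; the response shape and field names are illustrative assumptions based on the thread, not a verified schema:

```python
import json

# Illustrative response body for GET /jobs/:jobid/checkpoints.
# The field names mirror those mentioned in the thread; treat the
# exact schema as an assumption to be checked against the REST docs.
SAMPLE_RESPONSE = json.dumps({
    "history": [
        {"id": 14, "status": "COMPLETED",
         "checkpoint_type": "CHECKPOINT", "external_path": None},
        {"id": 15, "status": "COMPLETED",
         "checkpoint_type": "SAVEPOINT",
         "external_path": "hdfs:///flink/savepoints/savepoint-ab12cd"},
    ]
})

def list_savepoints(checkpoints_json: str) -> list:
    """Return external paths of completed savepoints from a
    checkpoints-endpoint response body."""
    history = json.loads(checkpoints_json).get("history", [])
    return [
        cp["external_path"]
        for cp in history
        if cp.get("checkpoint_type", "").upper() == "SAVEPOINT"
        and cp.get("status") == "COMPLETED"
        and cp.get("external_path")
    ]

print(list_savepoints(SAMPLE_RESPONSE))
# prints ['hdfs:///flink/savepoints/savepoint-ab12cd']
```

A SHOW SAVEPOINTS implementation could apply the same filter server-side, with the caveat raised in the thread that the job manager only remembers savepoints it triggered itself.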




On Wed, 8 Jun 2022 at 15:07, Paul Lam <pa...@gmail.com> wrote:

> Hi Jing,
>
> Thank you for your inputs!
>
> TBH, I haven’t considered the ETL scenario that you mentioned. I think
> they’re managed just like other jobs in terms of job lifecycles (please
> correct me if I’m wrong).
>
> WRT the SQL statements about SQL lineages, I think it might be a little
> bit out of the scope of the FLIP, since it’s mainly about lifecycles. By
> the way, do we have these functionalities in Flink CLI or REST API already?
>
> WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the
> community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m
> updating the FLIP according to the latest discussions.
>
> Best,
> Paul Lam
>
> On Jun 8, 2022, at 07:31, Jing Ge <ji...@ververica.com> wrote:
>
> Hi Paul,
>
> Sorry that I am a little bit too late to join this thread. Thanks for
> driving this and starting this informative discussion. The FLIP looks
> really interesting. It will help us a lot to manage Flink SQL jobs.
>
> Have you considered the ETL scenario with Flink SQL, where multiple SQL
> jobs build a DAG, with possibly many such DAGs?
>
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to
> support ETL jobs. Briefly speaking, the SQLs used to build the DAG are
> responsible for *producing* data as the result (cube, materialized view, etc.)
> for future consumption by queries. The INSERT INTO SELECT FROM example
> in FLIP and CTAS are typical SQL in this case. I would prefer to call them
> Jobs instead of Queries.
>
> 2)
> Speaking of ETL DAG, we might want to see the lineage. Is it possible to
> support syntax like:
>
> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given
> job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
>
> 3)
> Could we also support Savepoint housekeeping syntax? We ran into this
> issue that a lot of savepoints have been created by customers (via their
> apps). It will take extra (hacking) effort to clean it.
>
> RELEASE SAVEPOINT ALL
>
> Best regards,
> Jing
>
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
> wrote:
>
>> Hi Paul,
>>
>> I'm still doubting the keyword for the SQL applications. SHOW QUERIES
>> could
>> imply that this will actually show the query, but we're returning IDs of
>> the running application. At first I was also not very much in favour of
>> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
>> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>>
>> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>>
>> Best regards,
>>
>> Martijn
>>
>> [1]
>>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>>
>> On Sat, Jun 4, 2022 at 10:38, Paul Lam <pa...@gmail.com> wrote:
>>
>> > Hi Godfrey,
>> >
>> > Sorry for the late reply, I was on vacation.
>> >
>> > It looks like we have a variety of preferences on the syntax, how about
>> we
>> > choose the most acceptable one?
>> >
>> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to
>> jobs
>> > would be:
>> >
>> > - SHOW JOBS
>> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
>> > `table.job.stop-with-drain`)
>> >
>> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
>> > JOB`:
>> >
>> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
>> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
>> > manager remembers)
>> > - DROP SAVEPOINT <savepoint_path>
>> >
>> > cc @Jark @ShengKai @Martijn @Timo .
>> >
>> > Best,
>> > Paul Lam
>> >
>> >
>> > godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>> >
>> >> Hi Paul,
>> >>
>> >> Thanks for the update.
>> >>
>> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> >> (DataStream or SQL) or
>> >> clients (SQL client or CLI).
>> >>
>> >> Is DataStream job a QUERY? I think not.
>> >> For a QUERY, the most important concept is the statement. But the
>> >> result does not contain this info.
>> >> If we need to contain all jobs in the cluster, I think the name should
>> >> be JOB or PIPELINE.
>> >> I lean towards SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.
>> >>
>> >> > SHOW SAVEPOINTS
>> >> To list the savepoint for a specific job, we need to specify a
>> >> specific pipeline,
>> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>> >>
>> >> Best,
>> >> Godfrey
>> >>
>> >> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>> >> >
>> >> > Hi Jark,
>> >> >
>> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
>> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
>> >> >
>> >> > Another question is, what should be the syntax for ungracefully
>> >> > canceling a query? As ShengKai pointed out in an offline discussion,
>> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>> >> > Flink CLI has both stop and cancel, mostly due to historical
>> problems.
>> >> >
>> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>> >> > that savepoints are owned by users and beyond the lifecycle of a
>> Flink
>> >> > cluster. For example, a user might take a savepoint at a custom path
>> >> > that’s different than the default savepoint path, I think jobmanager
>> >> would
>> >> > not remember that, not to mention the jobmanager may be a fresh new
>> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”,
>> it's
>> >> > probably a best-effort one.
>> >> >
>> >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>> >> > Savepoints are aliases for nested transactions in the DB area[1], and
>> there’s
>> >> > correspondingly global transactions. If we consider Flink jobs as
>> >> > global transactions and Flink checkpoints as nested transactions,
>> >> > then the savepoint semantics are close, thus I think savepoint syntax
>> >> > in SQL-standard could be considered. But again, I don’t have a very
>> >> > strong preference.
>> >> >
>> >> > Ping @Timo to get more inputs.
>> >> >
>> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
>> >> >
>> >> > Best,
>> >> > Paul Lam
>> >> >
>> >> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
>> >> > >
>> >> > > Hi Paul,
>> >> > >
>> >> > > 1) SHOW QUERIES
>> >> > > +1 to add finished time, but it would be better to call it
>> "end_time"
>> >> to
>> >> > > keep aligned with names in Web UI.
>> >> > >
>> >> > > 2) DROP QUERY
>> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise,
>> how
>> >> to
>> >> > > stop batch queries?
>> >> > > At present, I don't think "DROP" is a suitable keyword for this
>> >> statement.
>> >> > > From the perspective of users, "DROP" sounds like the query should
>> be
>> >> > > removed from the
>> >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>> >> more
>> >> > > suitable and
>> >> > > compliant with commands of Flink CLI.
>> >> > >
>> >> > > 3) SHOW SAVEPOINTS
>> >> > > I think this statement is needed, otherwise, savepoints are lost
>> >> after the
>> >> > > SAVEPOINT
>> >> > > command is executed. Savepoints can be retrieved from REST API
>> >> > > "/jobs/:jobid/checkpoints"
>> >> > > with filtering "checkpoint_type"="savepoint". It's also worth
>> >> considering
>> >> > > providing "SHOW CHECKPOINTS"
>> >> > > to list all checkpoints.
>> >> > >
>> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
>> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>> >> statements
>> >> > > now.
>> >> > > In other vendors' systems, the parameters of SAVEPOINT and RELEASE SAVEPOINT
>> are
>> >> both
>> >> > > the same savepoint id.
>> >> > > However, in our syntax, the first one is query id, and the second
>> one
>> >> is
>> >> > > savepoint path, which is confusing and
>> >> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>> >> they
>> >> > > should be in the same syntax set.
>> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
>> SAVEPOINT
>> >> > > <sp_path>.
>> >> > > That means we don't follow the majority of vendors in SAVEPOINT
>> >> commands. I
>> >> > > would say the purpose is different in Flink.
>> >> > > What are others' opinions on this?
>> >> > >
>> >> > > Best,
>> >> > > Jark
>> >> > >
>> >> > > [1]:
>> >> > >
>> >>
>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>> >> > >
>> >> > >
>> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com>
>> wrote:
>> >> > >
>> >> > >> Hi Godfrey,
>> >> > >>
>> >> > >> Thanks a lot for your inputs!
>> >> > >>
>> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> >> (DataStream
>> >> > >> or SQL) or
>> >> > >> clients (SQL client or CLI). Under the hood, it’s based on
>> >> > >> ClusterClient#listJobs, the
>> >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>> >> in SQL
>> >> > >> client, because
>> >> > >> these jobs can be managed via SQL client too.
>> >> > >>
>> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP.
>> But
>> >> I’m a
>> >> > >> bit afraid that the
>> >> > >> rows would be too long.
>> >> > >>
>> >> > >> WRT ‘DROP QUERY’,
>> >> > >>> What's the behavior for batch jobs and the non-running jobs?
>> >> > >>
>> >> > >>
>> >> > >> In general, the behavior would be aligned with Flink CLI.
>> Triggering
>> >> a
>> >> > >> savepoint for
>> >> > >> a non-running job would cause errors, and the error message would
>> be
>> >> > >> printed to
>> >> > >> the SQL client. Triggering a savepoint for batch (unbounded) jobs
>> in
>> >> > >> streaming
>> >> > >> execution mode would be the same as for streaming jobs. However, for
>> >> batch
>> >> > >> jobs in
>> >> > >> batch execution mode, I think there would be an error, because
>> batch
>> >> > >> execution
>> >> > >> doesn’t support checkpoints currently (please correct me if I’m
>> >> wrong).
>> >> > >>
>> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>> >> clusterClient/
>> >> > >> jobClient doesn’t have such a functionality at the moment,
>> neither does
>> >> > >> Flink CLI.
>> >> > >> Maybe we could make it a follow-up FLIP, which includes the
>> >> modifications
>> >> > >> to
>> >> > >> clusterClient/jobClient and Flink CLI. WDYT?
>> >> > >>
>> >> > >> Best,
>> >> > >> Paul Lam
>> >> > >>
>> >> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>> >> > >>>
>> >> > >>> Godfrey
>> >> > >>
>> >> > >>
>> >> >
>> >>
>> >
>>
>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Jing,

Thank you for your inputs!

TBH, I haven’t considered the ETL scenario that you mentioned. I think they’re managed just like other jobs in terms of job lifecycles (please correct me if I’m wrong).

WRT the SQL statements about SQL lineages, I think it might be a little bit out of the scope of the FLIP, since it’s mainly about lifecycles. By the way, do we have these functionalities in Flink CLI or REST API already?

WRT `RELEASE SAVEPOINT ALL`, I’m sorry for the deprecated FLIP docs, the community is more in favor of `DROP SAVEPOINT <savepoint_path>`. I’m updating the FLIP according to the latest discussions.

Best,
Paul Lam

> On Jun 8, 2022, at 07:31, Jing Ge <ji...@ververica.com> wrote:
> 
> Hi Paul,
> 
> Sorry that I am a little bit too late to join this thread. Thanks for driving this and starting this informative discussion. The FLIP looks really interesting. It will help us a lot to manage Flink SQL jobs. 
> 
> Have you considered the ETL scenario with Flink SQL, where multiple SQL jobs build a DAG, with possibly many such DAGs?
> 
> 1)
> +1 for SHOW JOBS. I think sooner or later we will start to discuss how to support ETL jobs. Briefly speaking, the SQLs that are used to build the DAG are responsible for *producing* data as the result (cube, materialized view, etc.) for future consumption by queries. The INSERT INTO SELECT FROM example in the FLIP and CTAS are typical SQL statements in this case. I would prefer to call them Jobs instead of Queries.
> 
> 2)
> Speaking of ETL DAG, we might want to see the lineage. Is it possible to support syntax like:
> 
> SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
> SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given job_id
> SHOW JOBTREES // shows all DAGs
> SHOW ANCESTORS <job_id> // shows all parents of the given job_id
> 
> 3)
> Could we also support Savepoint housekeeping syntax? We ran into this issue that a lot of savepoints have been created by customers (via their apps). It will take extra (hacking) effort to clean them up.
> 
> RELEASE SAVEPOINT ALL   
> 
> Best regards,
> Jing
> 
> On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <martijnvisser@apache.org> wrote:
> Hi Paul,
> 
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
> imply that this will actually show the query, but we're returning IDs of
> the running application. At first I was also not very much in favour of
> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
> 
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
> 
> Best regards,
> 
> Martijn
> 
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
> 
> On Sat, Jun 4, 2022 at 10:38, Paul Lam <paullin3280@gmail.com> wrote:
> 
> > Hi Godfrey,
> >
> > Sorry for the late reply, I was on vacation.
> >
> > It looks like we have a variety of preferences on the syntax, how about we
> > choose the most acceptable one?
> >
> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
> > would be:
> >
> > - SHOW JOBS
> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> > `table.job.stop-with-drain`)
> >
> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
> > JOB`:
> >
> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
> > manager remembers)
> > - DROP SAVEPOINT <savepoint_path>
> >
> > cc @Jark @ShengKai @Martijn @Timo .
> >
> > Best,
> > Paul Lam
> >
> >
> > godfrey he <godfreyhe@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
> >
> >> Hi Paul,
> >>
> >> Thanks for the update.
> >>
> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> (DataStream or SQL) or
> >> clients (SQL client or CLI).
> >>
> >> Is a DataStream job a QUERY? I think not.
> >> For a QUERY, the most important concept is the statement. But the
> >> result does not contain this info.
> >> If we need to include all jobs in the cluster, I think the name should
> >> be JOB or PIPELINE.
> >> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
> >>
> >> > SHOW SAVEPOINTS
> >> To list the savepoint for a specific job, we need to specify a
> >> specific pipeline,
> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
> >>
> >> Best,
> >> Godfrey
> >>
> >> Paul Lam <paullin3280@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
> >> >
> >> > Hi Jark,
> >> >
> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
> >> >
> >> > Another question is, what should be the syntax for ungracefully
> >> > canceling a query? As ShengKai pointed out in an offline discussion,
> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> >> > Flink CLI has both stop and cancel, mostly due to historical problems.
> >> >
> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> >> > that savepoints are owned by users and beyond the lifecycle of a Flink
> >> > cluster. For example, a user might take a savepoint at a custom path
> >> > that’s different than the default savepoint path, I think jobmanager
> >> would
> >> > not remember that, not to mention the jobmanager may be a fresh new
> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> >> > probably a best-effort one.
> >> >
> >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> >> > Savepoints are aliases for nested transactions in the DB area [1], and there’s
> >> > correspondingly global transactions. If we consider Flink jobs as
> >> > global transactions and Flink checkpoints as nested transactions,
> >> > then the savepoint semantics are close, thus I think savepoint syntax
> >> > in SQL-standard could be considered. But again, I don’t have a very
> >> > strong preference.
> >> >
> >> > Ping @Timo to get more inputs.
> >> >
> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > > On May 18, 2022, at 17:48, Jark Wu <imjark@gmail.com> wrote:
> >> > >
> >> > > Hi Paul,
> >> > >
> >> > > 1) SHOW QUERIES
> >> > > +1 to add finished time, but it would be better to call it "end_time"
> >> to
> >> > > keep aligned with names in Web UI.
> >> > >
> >> > > 2) DROP QUERY
> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how
> >> to
> >> > > stop batch queries?
> >> > > At present, I don't think "DROP" is a suitable keyword for this
> >> statement.
> >> > > From the perspective of users, "DROP" sounds like the query should be
> >> > > removed from the
> >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
> >> more
> >> > > suitable and
> >> > > compliant with commands of Flink CLI.
> >> > >
> >> > > 3) SHOW SAVEPOINTS
> >> > > I think this statement is needed, otherwise, savepoints are lost
> >> after the
> >> > > SAVEPOINT
> >> > > command is executed. Savepoints can be retrieved from REST API
> >> > > "/jobs/:jobid/checkpoints"
> >> > > with filtering "checkpoint_type"="savepoint". It's also worth
> >> considering
> >> > > providing "SHOW CHECKPOINTS"
> >> > > to list all checkpoints.
> >> > >
> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
> >> statements
> >> > > now.
> >> > > In other vendors' databases, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
> >> both
> >> > > the same savepoint id.
> >> > > However, in our syntax, the first one is query id, and the second one
> >> is
> >> > > savepoint path, which is confusing and
> >> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe
> >> they
> >> > > should be in the same syntax set.
> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> >> > > <sp_path>.
> >> > > That means we don't follow the majority of vendors in SAVEPOINT
> >> commands. I
> >> > > would say the purpose is different in Flink.
> >> > > What are others' opinions on this?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > [1]:
> >> > >
> >> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >> > >
> >> > >
> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <paullin3280@gmail.com> wrote:
> >> > >
> >> > >> Hi Godfrey,
> >> > >>
> >> > >> Thanks a lot for your inputs!
> >> > >>
> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> (DataStream
> >> > >> or SQL) or
> >> > >> clients (SQL client or CLI). Under the hood, it’s based on
> >> > >> ClusterClient#listJobs, the
> >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
> >> in SQL
> >> > >> client, because
> >> > >> these jobs can be managed via SQL client too.
> >> > >>
> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP. But
> >> I’m a
> >> > >> bit afraid that the
> >> > >> rows would be too long.
> >> > >>
> >> > >> WRT ‘DROP QUERY’,
> >> > >>> What's the behavior for batch jobs and the non-running jobs?
> >> > >>
> >> > >>
> >> > >> In general, the behavior would be aligned with Flink CLI. Triggering
> >> a
> >> > >> savepoint for
> >> > >> a non-running job would cause errors, and the error message would be
> >> > >> printed to
> >> > >> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
> >> > >> streaming
> >> > >> execution mode would be the same with streaming jobs. However, for
> >> batch
> >> > >> jobs in
> >> > >> batch execution mode, I think there would be an error, because batch
> >> > >> execution
> >> > >> doesn’t support checkpoints currently (please correct me if I’m
> >> wrong).
> >> > >>
> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
> >> clusterClient/
> >> > >> jobClient doesn’t have such a functionality at the moment, and neither does
> >> > >> the Flink CLI.
> >> > >> Maybe we could make it a follow-up FLIP, which includes the
> >> modifications
> >> > >> to
> >> > >> clusterClient/jobClient and Flink CLI. WDYT?
> >> > >>
> >> > >> Best,
> >> > >> Paul Lam
> >> > >>
> >> > >>> On May 17, 2022, at 20:34, godfrey he <godfreyhe@gmail.com> wrote:
> >> > >>>
> >> > >>> Godfrey
> >> > >>
> >> > >>
> >> >
> >>
> >


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jing Ge <ji...@ververica.com>.
Hi Paul,

Sorry that I am a little bit too late to join this thread. Thanks for
driving this and starting this informative discussion. The FLIP looks
really interesting. It will help us a lot to manage Flink SQL jobs.

Have you considered the ETL scenario with Flink SQL, where multiple SQLs
build a DAG for many DAGs?

1)
+1 for SHOW JOBS. I think sooner or later we will start to discuss how to
support ETL jobs. Briefly speaking, the SQLs that are used to build the DAG are
responsible for *producing* data as the result (cube, materialized view, etc.)
for future consumption by queries. The INSERT INTO SELECT FROM example
in the FLIP and CTAS are typical SQL statements in this case. I would prefer to call them
Jobs instead of Queries.

2)
Speaking of ETL DAG, we might want to see the lineage. Is it possible to
support syntax like:

SHOW JOBTREE <job_id>  // shows the downstream DAG from the given job_id
SHOW JOBTREE <job_id> FULL // shows the whole DAG that contains the given
job_id
SHOW JOBTREES // shows all DAGs
SHOW ANCESTORS <job_id> // shows all parents of the given job_id
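
The "show all parents" lookup above boils down to a reverse traversal of a job DAG. As a rough illustration — Flink does not track job lineage today, so the edge map and job names below are invented stand-ins for metadata a SQL gateway would have to record at submission time:

```python
# Hypothetical sketch of resolving the ancestors of a job over a lineage
# graph. `edges` maps each job to the jobs it consumes from (its parents);
# a real gateway would have to maintain this map itself, since Flink's
# REST API exposes no lineage information.

def ancestors(edges, job_id):
    """Return the set of all transitive upstream jobs of job_id."""
    seen = set()
    stack = list(edges.get(job_id, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(edges.get(parent, []))
    return seen

# Example DAG: raw -> cleansed -> {cube, view}.
lineage = {
    "cube_job": ["cleansed_job"],
    "view_job": ["cleansed_job"],
    "cleansed_job": ["raw_job"],
    "raw_job": [],
}

print(sorted(ancestors(lineage, "cube_job")))  # ['cleansed_job', 'raw_job']
```

The same traversal run forward over a child map would serve the JOBTREE-style statements.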

3)
Could we also support Savepoint housekeeping syntax? We ran into this issue
that a lot of savepoints have been created by customers (via their apps).
It will take extra (hacking) effort to clean them up.

RELEASE SAVEPOINT ALL
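
A statement like this would ultimately have to do filesystem housekeeping. The sketch below shows one naive policy — delete savepoint directories older than a retention period. The directory layout (`savepoint-*` under a single root) and the age-based policy are my assumptions, and a real implementation would first have to verify that no job still restores from a savepoint before removing it:

```python
# Illustrative housekeeping sketch, not Flink behaviour: prune savepoint
# directories under a root path once they exceed a retention age.
import shutil
import time
from pathlib import Path

def prune_savepoints(root, max_age_secs, now=None):
    """Delete savepoint-* directories under root older than max_age_secs.

    Returns the sorted list of deleted directory names.
    """
    now = time.time() if now is None else now
    deleted = []
    for d in Path(root).glob("savepoint-*"):
        if d.is_dir() and now - d.stat().st_mtime > max_age_secs:
            shutil.rmtree(d)  # irreversible; check references first!
            deleted.append(d.name)
    return sorted(deleted)
```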

Best regards,
Jing

On Tue, Jun 7, 2022 at 2:35 PM Martijn Visser <ma...@apache.org>
wrote:

> Hi Paul,
>
> I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
> imply that this will actually show the query, but we're returning IDs of
> the running application. At first I was also not very much in favour of
> SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
> jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS
>
> Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.
>
> Best regards,
>
> Martijn
>
> [1]
> https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary
>
> On Sat, Jun 4, 2022 at 10:38, Paul Lam <pa...@gmail.com> wrote:
>
> > Hi Godfrey,
> >
> > Sorry for the late reply, I was on vacation.
> >
> > It looks like we have a variety of preferences on the syntax, how about
> we
> > choose the most acceptable one?
> >
> > WRT keyword for SQL jobs, we use JOBS, thus the statements related to
> jobs
> > would be:
> >
> > - SHOW JOBS
> > - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> > `table.job.stop-with-drain`)
> >
> > WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
> > JOB`:
> >
> > - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> > - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
> > manager remembers)
> > - DROP SAVEPOINT <savepoint_path>
> >
> > cc @Jark @ShengKai @Martijn @Timo .
> >
> > Best,
> > Paul Lam
> >
> >
> > godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
> >
> >> Hi Paul,
> >>
> >> Thanks for the update.
> >>
> >> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> (DataStream or SQL) or
> >> clients (SQL client or CLI).
> >>
> >> Is a DataStream job a QUERY? I think not.
> >> For a QUERY, the most important concept is the statement. But the
> >> result does not contain this info.
> >> If we need to include all jobs in the cluster, I think the name should
> >> be JOB or PIPELINE.
> >> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
> >>
> >> > SHOW SAVEPOINTS
> >> To list the savepoint for a specific job, we need to specify a
> >> specific pipeline,
> >> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
> >>
> >> Best,
> >> Godfrey
> >>
> >> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
> >> >
> >> > Hi Jark,
> >> >
> >> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> >> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
> >> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
> >> >
> >> > Another question is, what should be the syntax for ungracefully
> >> > canceling a query? As ShengKai pointed out in an offline discussion,
> >> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> >> > Flink CLI has both stop and cancel, mostly due to historical problems.
> >> >
> >> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> >> > that savepoints are owned by users and beyond the lifecycle of a Flink
> >> > cluster. For example, a user might take a savepoint at a custom path
> >> > that’s different than the default savepoint path, I think jobmanager
> >> would
> >> > not remember that, not to mention the jobmanager may be a fresh new
> >> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> >> > probably a best-effort one.
> >> >
> >> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> >> > Savepoints are aliases for nested transactions in the DB area [1], and
> there’s
> >> > correspondingly global transactions. If we consider Flink jobs as
> >> > global transactions and Flink checkpoints as nested transactions,
> >> > then the savepoint semantics are close, thus I think savepoint syntax
> >> > in SQL-standard could be considered. But again, I don’t have a very
> >> > strong preference.
> >> >
> >> > Ping @Timo to get more inputs.
> >> >
> >> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
> >> > >
> >> > > Hi Paul,
> >> > >
> >> > > 1) SHOW QUERIES
> >> > > +1 to add finished time, but it would be better to call it
> "end_time"
> >> to
> >> > > keep aligned with names in Web UI.
> >> > >
> >> > > 2) DROP QUERY
> >> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how
> >> to
> >> > > stop batch queries?
> >> > > At present, I don't think "DROP" is a suitable keyword for this
> >> statement.
> >> > > From the perspective of users, "DROP" sounds like the query should
> be
> >> > > removed from the
> >> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
> >> more
> >> > > suitable and
> >> > > compliant with commands of Flink CLI.
> >> > >
> >> > > 3) SHOW SAVEPOINTS
> >> > > I think this statement is needed, otherwise, savepoints are lost
> >> after the
> >> > > SAVEPOINT
> >> > > command is executed. Savepoints can be retrieved from REST API
> >> > > "/jobs/:jobid/checkpoints"
> >> > > with filtering "checkpoint_type"="savepoint". It's also worth
> >> considering
> >> > > providing "SHOW CHECKPOINTS"
> >> > > to list all checkpoints.
> >> > >
> >> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> >> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
> >> statements
> >> > > now.
> >> > > In other vendors' databases, the parameters of SAVEPOINT and RELEASE SAVEPOINT
> are
> >> both
> >> > > the same savepoint id.
> >> > > However, in our syntax, the first one is query id, and the second
> one
> >> is
> >> > > savepoint path, which is confusing and
> >> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe
> >> they
> >> > > should be in the same syntax set.
> >> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP
> SAVEPOINT
> >> > > <sp_path>.
> >> > > That means we don't follow the majority of vendors in SAVEPOINT
> >> commands. I
> >> > > would say the purpose is different in Flink.
> >> > > What are others' opinions on this?
> >> > >
> >> > > Best,
> >> > > Jark
> >> > >
> >> > > [1]:
> >> > >
> >>
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >> > >
> >> > >
> >> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com>
> wrote:
> >> > >
> >> > >> Hi Godfrey,
> >> > >>
> >> > >> Thanks a lot for your inputs!
> >> > >>
> >> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> >> (DataStream
> >> > >> or SQL) or
> >> > >> clients (SQL client or CLI). Under the hood, it’s based on
> >> > >> ClusterClient#listJobs, the
> >> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
> >> in SQL
> >> > >> client, because
> >> > >> these jobs can be managed via SQL client too.
> >> > >>
> >> > >> WRT finished time, I think you’re right. Adding it to the FLIP. But
> >> I’m a
> >> > >> bit afraid that the
> >> > >> rows would be too long.
> >> > >>
> >> > >> WRT ‘DROP QUERY’,
> >> > >>> What's the behavior for batch jobs and the non-running jobs?
> >> > >>
> >> > >>
> >> > >> In general, the behavior would be aligned with Flink CLI.
> Triggering
> >> a
> >> > >> savepoint for
> >> > >> a non-running job would cause errors, and the error message would
> be
> >> > >> printed to
> >> > >> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
> >> > >> streaming
> >> > >> execution mode would be the same with streaming jobs. However, for
> >> batch
> >> > >> jobs in
> >> > >> batch execution mode, I think there would be an error, because
> batch
> >> > >> execution
> >> > >> doesn’t support checkpoints currently (please correct me if I’m
> >> wrong).
> >> > >>
> >> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
> >> clusterClient/
> >> > >> jobClient doesn’t have such a functionality at the moment, and neither
> does
> >> > >> the Flink CLI.
> >> > >> Maybe we could make it a follow-up FLIP, which includes the
> >> modifications
> >> > >> to
> >> > >> clusterClient/jobClient and Flink CLI. WDYT?
> >> > >>
> >> > >> Best,
> >> > >> Paul Lam
> >> > >>
> >> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
> >> > >>>
> >> > >>> Godfrey
> >> > >>
> >> > >>
> >> >
> >>
> >
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Martijn Visser <ma...@apache.org>.
Hi Paul,

I'm still doubting the keyword for the SQL applications. SHOW QUERIES could
imply that this will actually show the query, but we're returning IDs of
the running application. At first I was also not very much in favour of
SHOW JOBS since I prefer calling it 'Flink applications' and not 'Flink
jobs', but the glossary [1] made me reconsider. I would +1 SHOW/STOP JOBS

Also +1 for the CREATE/SHOW/DROP SAVEPOINT syntax.

Best regards,

Martijn

[1]
https://nightlies.apache.org/flink/flink-docs-master/docs/concepts/glossary

On Sat, Jun 4, 2022 at 10:38, Paul Lam <pa...@gmail.com> wrote:

> Hi Godfrey,
>
> Sorry for the late reply, I was on vacation.
>
> It looks like we have a variety of preferences on the syntax, how about we
> choose the most acceptable one?
>
> WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
> would be:
>
> - SHOW JOBS
> - STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
> `table.job.stop-with-drain`)
>
> WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR
> JOB`:
>
> - CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
> - SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job
> manager remembers)
> - DROP SAVEPOINT <savepoint_path>
>
> cc @Jark @ShengKai @Martijn @Timo .
>
> Best,
> Paul Lam
>
>
> godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:
>
>> Hi Paul,
>>
>> Thanks for the update.
>>
>> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream or SQL) or
>> clients (SQL client or CLI).
>>
>> Is a DataStream job a QUERY? I think not.
>> For a QUERY, the most important concept is the statement. But the
>> result does not contain this info.
>> If we need to include all jobs in the cluster, I think the name should
>> be JOB or PIPELINE.
>> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>>
>> > SHOW SAVEPOINTS
>> To list the savepoint for a specific job, we need to specify a
>> specific pipeline,
>> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>>
>> Best,
>> Godfrey
>>
>> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
>> >
>> > Hi Jark,
>> >
>> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
>> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
>> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
>> >
>> > Another question is, what should be the syntax for ungracefully
>> > canceling a query? As ShengKai pointed out in an offline discussion,
>> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
>> > Flink CLI has both stop and cancel, mostly due to historical problems.
>> >
>> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
>> > that savepoints are owned by users and beyond the lifecycle of a Flink
>> > cluster. For example, a user might take a savepoint at a custom path
>> > that’s different than the default savepoint path, I think jobmanager
>> would
>> > not remember that, not to mention the jobmanager may be a fresh new
>> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
>> > probably a best-effort one.
>> >
>> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
>> > Savepoints are aliases for nested transactions in the DB area [1], and there’s
>> > correspondingly global transactions. If we consider Flink jobs as
>> > global transactions and Flink checkpoints as nested transactions,
>> > then the savepoint semantics are close, thus I think savepoint syntax
>> > in SQL-standard could be considered. But again, I don’t have a very
>> > strong preference.
>> >
>> > Ping @Timo to get more inputs.
>> >
>> > [1] https://en.wikipedia.org/wiki/Nested_transaction
>> >
>> > Best,
>> > Paul Lam
>> >
>> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
>> > >
>> > > Hi Paul,
>> > >
>> > > 1) SHOW QUERIES
>> > > +1 to add finished time, but it would be better to call it "end_time"
>> to
>> > > keep aligned with names in Web UI.
>> > >
>> > > 2) DROP QUERY
>> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how
>> to
>> > > stop batch queries?
>> > > At present, I don't think "DROP" is a suitable keyword for this
>> statement.
>> > > From the perspective of users, "DROP" sounds like the query should be
>> > > removed from the
>> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is
>> more
>> > > suitable and
>> > > compliant with commands of Flink CLI.
>> > >
>> > > 3) SHOW SAVEPOINTS
>> > > I think this statement is needed, otherwise, savepoints are lost
>> after the
>> > > SAVEPOINT
>> > > command is executed. Savepoints can be retrieved from REST API
>> > > "/jobs/:jobid/checkpoints"
>> > > with filtering "checkpoint_type"="savepoint". It's also worth
>> considering
>> > > providing "SHOW CHECKPOINTS"
>> > > to list all checkpoints.
>> > >
>> > > 4) SAVEPOINT & RELEASE SAVEPOINT
>> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
>> statements
>> > > now.
>> > > In other vendors' databases, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
>> both
>> > > the same savepoint id.
>> > > However, in our syntax, the first one is query id, and the second one
>> is
>> > > savepoint path, which is confusing and
>> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe
>> they
>> > > should be in the same syntax set.
>> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
>> > > <sp_path>.
>> > > That means we don't follow the majority of vendors in SAVEPOINT
>> commands. I
>> > > would say the purpose is different in Flink.
>> > > What are others' opinions on this?
>> > >
>> > > Best,
>> > > Jark
>> > >
>> > > [1]:
>> > >
>> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
>> > >
>> > >
>> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
>> > >
>> > >> Hi Godfrey,
>> > >>
>> > >> Thanks a lot for your inputs!
>> > >>
>> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
>> (DataStream
>> > >> or SQL) or
>> > >> clients (SQL client or CLI). Under the hood, it’s based on
>> > >> ClusterClient#listJobs, the
>> > >> same with Flink CLI. I think it’s okay to have non-SQL jobs listed
>> in SQL
>> > >> client, because
>> > >> these jobs can be managed via SQL client too.
>> > >>
>> > >> WRT finished time, I think you’re right. Adding it to the FLIP. But
>> I’m a
>> > >> bit afraid that the
>> > >> rows would be too long.
>> > >>
>> > >> WRT ‘DROP QUERY’,
>> > >>> What's the behavior for batch jobs and the non-running jobs?
>> > >>
>> > >>
>> > >> In general, the behavior would be aligned with Flink CLI. Triggering
>> a
>> > >> savepoint for
>> > >> a non-running job would cause errors, and the error message would be
>> > >> printed to
>> > >> the SQL client. Triggering a savepoint for batch(unbounded) jobs in
>> > >> streaming
>> > >> execution mode would be the same with streaming jobs. However, for
>> batch
>> > >> jobs in
>> > >> batch execution mode, I think there would be an error, because batch
>> > >> execution
>> > >> doesn’t support checkpoints currently (please correct me if I’m
>> wrong).
>> > >>
>> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink
>> clusterClient/
>> > >> jobClient doesn’t have such a functionality at the moment, and neither does
>> > >> the Flink CLI.
>> > >> Maybe we could make it a follow-up FLIP, which includes the
>> modifications
>> > >> to
>> > >> clusterClient/jobClient and Flink CLI. WDYT?
>> > >>
>> > >> Best,
>> > >> Paul Lam
>> > >>
>> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>> > >>>
>> > >>> Godfrey
>> > >>
>> > >>
>> >
>>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Godfrey,

Sorry for the late reply, I was on vacation.

It looks like we have a variety of preferences on the syntax, how about we
choose the most acceptable one?

WRT keyword for SQL jobs, we use JOBS, thus the statements related to jobs
would be:

- SHOW JOBS
- STOP JOBS <job_id> (with options `table.job.stop-with-savepoint` and
`table.job.stop-with-drain`)

WRT savepoint for SQL jobs, we use the `CREATE/DROP` pattern with `FOR JOB`:

- CREATE SAVEPOINT <savepoint_path> FOR JOB <job_id>
- SHOW SAVEPOINTS FOR JOB <job_id> (show savepoints the current job manager
remembers)
- DROP SAVEPOINT <savepoint_path>
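
For illustration, here is a rough sketch of how a SQL gateway might translate STOP JOBS plus those session options into Flink's REST stop-with-savepoint call (POST /jobs/<job_id>/stop). The option parsing and the fallback to `state.savepoints.dir` are my assumptions, not part of the proposal:

```python
# Hedged sketch: map "STOP JOBS <job_id>" and the proposed session options
# onto the request body of Flink's stop-with-savepoint REST endpoint.
import json

def build_stop_request(job_id, options):
    """Return (rest_path, json_body) for stopping the given job."""
    drain = options.get("table.job.stop-with-drain", "false") == "true"
    with_savepoint = options.get("table.job.stop-with-savepoint", "false") == "true"
    body = {"drain": drain}
    if with_savepoint:
        # Assumed fallback: use the cluster-wide default savepoint directory.
        body["targetDirectory"] = options.get("state.savepoints.dir")
    return f"/jobs/{job_id}/stop", json.dumps(body)

path, body = build_stop_request(
    "a5f1e2d3",
    {"table.job.stop-with-savepoint": "true",
     "state.savepoints.dir": "s3://bucket/savepoints"},
)
```

Without `table.job.stop-with-savepoint`, the same call degenerates to a plain stop with `{"drain": false}`.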

cc @Jark @ShengKai @Martijn @Timo .

Best,
Paul Lam


godfrey he <go...@gmail.com> wrote on Mon, May 23, 2022 at 21:34:

> Hi Paul,
>
> Thanks for the update.
>
> >'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> (DataStream or SQL) or
> clients (SQL client or CLI).
>
> Is a DataStream job a QUERY? I think not.
> For a QUERY, the most important concept is the statement. But the
> result does not contain this info.
> If we need to include all jobs in the cluster, I think the name should
> be JOB or PIPELINE.
> I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] id.
>
> > SHOW SAVEPOINTS
> To list the savepoint for a specific job, we need to specify a
> specific pipeline,
> the syntax should be SHOW SAVEPOINTS FOR PIPELINE id
>
> Best,
> Godfrey
>
> Paul Lam <pa...@gmail.com> wrote on Fri, May 20, 2022 at 11:25:
> >
> > Hi Jark,
> >
> > WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> > part of the reason why I proposed “STOP/CANCEL QUERY” at the
> > beginning. The downside of it is that it’s not ANSI-SQL compatible.
> >
> > Another question is, what should be the syntax for ungracefully
> > canceling a query? As ShengKai pointed out in an offline discussion,
> > “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> > Flink CLI has both stop and cancel, mostly due to historical problems.
> >
> > WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> > that savepoints are owned by users and beyond the lifecycle of a Flink
> > cluster. For example, a user might take a savepoint at a custom path
> > that’s different than the default savepoint path, I think jobmanager
> would
> > not remember that, not to mention the jobmanager may be a fresh new
> > one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> > probably a best-effort one.
> >
> > WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> > Savepoints are aliases for nested transactions in the DB area [1], and there’s
> > correspondingly global transactions. If we consider Flink jobs as
> > global transactions and Flink checkpoints as nested transactions,
> > then the savepoint semantics are close, thus I think savepoint syntax
> > in SQL-standard could be considered. But again, I don’t have a very
> > strong preference.
> >
> > Ping @Timo to get more inputs.
> >
> > [1] https://en.wikipedia.org/wiki/Nested_transaction
> >
> > Best,
> > Paul Lam
> >
> > > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
> > >
> > > Hi Paul,
> > >
> > > 1) SHOW QUERIES
> > > +1 to add finished time, but it would be better to call it "end_time"
> to
> > > keep aligned with names in Web UI.
> > >
> > > 2) DROP QUERY
> > > I think we shouldn't throw exceptions for batch jobs, otherwise, how to
> > > stop batch queries?
> > > At present, I don't think "DROP" is a suitable keyword for this
> statement.
> > > From the perspective of users, "DROP" sounds like the query should be
> > > removed from the
> > > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more
> > > suitable and
> > > compliant with commands of Flink CLI.
> > >
> > > 3) SHOW SAVEPOINTS
> > > I think this statement is needed, otherwise, savepoints are lost after
> the
> > > SAVEPOINT
> > > command is executed. Savepoints can be retrieved from REST API
> > > "/jobs/:jobid/checkpoints"
> > > with filtering "checkpoint_type"="savepoint". It's also worth
> considering
> > > providing "SHOW CHECKPOINTS"
> > > to list all checkpoints.
> > >
> > > 4) SAVEPOINT & RELEASE SAVEPOINT
> > > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT
> statements
> > > now.
> > > In other vendors' databases, the parameters of SAVEPOINT and RELEASE SAVEPOINT are
> both
> > > the same savepoint id.
> > > However, in our syntax, the first one is query id, and the second one
> is
> > > savepoint path, which is confusing and
> > > not consistent. When I came across SHOW SAVEPOINT, I thought maybe they
> > > should be in the same syntax set.
> > > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> > > <sp_path>.
> > > That means we don't follow the majority of vendors in SAVEPOINT
> commands. I
> > > would say the purpose is different in Flink.
> > > What are others' opinions on this?
> > >
> > > Best,
> > > Jark
> > >
> > > [1]:
> > >
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> > >
> > >
> > > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
> > >
> > >> Hi Godfrey,
> > >>
> > >> Thanks a lot for your inputs!
> > >>
> > >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs
> (DataStream
> > >> or SQL) or
> > >> clients (SQL client or CLI). Under the hood, it’s based on
> > >> ClusterClient#listJobs, the
> > >> same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in
> SQL
> > >> client, because
> > >> these jobs can be managed via SQL client too.
> > >>
> > >> WRT finished time, I think you’re right. Adding it to the FLIP. But
> I’m a
> > >> bit afraid that the
> > >> rows would be too long.
> > >>
> > >> WRT ‘DROP QUERY’,
> > >>> What's the behavior for batch jobs and the non-running jobs?
> > >>
> > >>
> > >> In general, the behavior would be aligned with Flink CLI. Triggering a
> > >> savepoint for
> > >> a non-running job would cause errors, and the error message would be
> > >> printed to
> >> the SQL client. Triggering a savepoint for batch (unbounded) jobs in
> >> streaming
> >> execution mode would be the same as for streaming jobs. However, for
> batch
> > >> jobs in
> > >> batch execution mode, I think there would be an error, because batch
> > >> execution
> > >> doesn’t support checkpoints currently (please correct me if I’m
> wrong).
> > >>
> > >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/
> >> jobClient doesn’t have such functionality at the moment, and neither does
> >> the Flink CLI.
> > >> Maybe we could make it a follow-up FLIP, which includes the
> modifications
> > >> to
> > >> clusterClient/jobClient and Flink CLI. WDYT?
> > >>
> > >> Best,
> > >> Paul Lam
> > >>
> > >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
> > >>>
> > >>> Godfrey
> > >>
> > >>
> >
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by godfrey he <go...@gmail.com>.
Hi Paul,

Thanks for the update.

>'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or
clients (SQL client or CLI).

Is a DataStream job a QUERY? I think not.
For a QUERY, the most important concept is the statement. But the
result does not contain this info.
If we need to include all jobs in the cluster, I think the name should
be JOB or PIPELINE.
I lean toward SHOW PIPELINES and STOP PIPELINE [IF RUNNING] <id>.

> SHOW SAVEPOINTS
To list the savepoints for a specific job, we need to specify a
specific pipeline;
the syntax should be SHOW SAVEPOINTS FOR PIPELINE <id>

Best,
Godfrey

On Fri, May 20, 2022 at 11:25, Paul Lam <pa...@gmail.com> wrote:
>
> Hi Jark,
>
> WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
> part of the reason why I proposed “STOP/CANCEL QUERY” at the
> beginning. The downside of it is that it’s not ANSI-SQL compatible.
>
> Another question is, what should be the syntax for ungracefully
> canceling a query? As ShengKai pointed out in an offline discussion,
> “STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
> Flink CLI has both stop and cancel, mostly due to historical problems.
>
> WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
> that savepoints are owned by users and beyond the lifecycle of a Flink
> cluster. For example, a user might take a savepoint at a custom path
> that’s different than the default savepoint path, I think jobmanager would
> not remember that, not to mention the jobmanager may be a fresh new
> one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's
> probably a best-effort one.
>
> WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
> Savepoints are aliases for nested transactions in the DB area[1], and there
> are correspondingly global transactions. If we consider Flink jobs as
> global transactions and Flink checkpoints as nested transactions,
> then the savepoint semantics are close, thus I think savepoint syntax
> in SQL-standard could be considered. But again, I don’t have a very
> strong preference.
>
> Ping @Timo to get more inputs.
>
> [1] https://en.wikipedia.org/wiki/Nested_transaction <https://en.wikipedia.org/wiki/Nested_transaction>
>
> Best,
> Paul Lam
>
> > On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
> >
> > Hi Paul,
> >
> > 1) SHOW QUERIES
> > +1 to add finished time, but it would be better to call it "end_time" to
> > keep aligned with names in Web UI.
> >
> > 2) DROP QUERY
> > I think we shouldn't throw exceptions for batch jobs, otherwise, how to
> > stop batch queries?
> > At present, I don't think "DROP" is a suitable keyword for this statement.
> > From the perspective of users, "DROP" sounds like the query should be
> > removed from the
> > list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more
> > suitable and
> > compliant with commands of Flink CLI.
> >
> > 3) SHOW SAVEPOINTS
> > I think this statement is needed, otherwise, savepoints are lost after the
> > SAVEPOINT
> > command is executed. Savepoints can be retrieved from REST API
> > "/jobs/:jobid/checkpoints"
> > with filtering "checkpoint_type"="savepoint". It's also worth considering
> > providing "SHOW CHECKPOINTS"
> > to list all checkpoints.
> >
> > 4) SAVEPOINT & RELEASE SAVEPOINT
> > I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements
> > now.
> > In other vendors' systems, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both
> > the same savepoint id.
> > However, in our syntax, the first one is query id, and the second one is
> > savepoint path, which is confusing and
> > not consistent. When I came across SHOW SAVEPOINT, I thought maybe they
> > should be in the same syntax set.
> > For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> > <sp_path>.
> > That means we don't follow the majority of vendors in SAVEPOINT commands. I
> > would say the purpose is different in Flink.
> > What are others' opinions on this?
> >
> > Best,
> > Jark
> >
> > [1]:
> > https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> >
> >
> > On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
> >
> >> Hi Godfrey,
> >>
> >> Thanks a lot for your inputs!
> >>
> >> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream
> >> or SQL) or
> >> clients (SQL client or CLI). Under the hood, it’s based on
> >> ClusterClient#listJobs, the
> >> same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in SQL
> >> client, because
> >> these jobs can be managed via SQL client too.
> >>
> >> WRT finished time, I think you’re right. Adding it to the FLIP. But I’m a
> >> bit afraid that the
> >> rows would be too long.
> >>
> >> WRT ‘DROP QUERY’,
> >>> What's the behavior for batch jobs and the non-running jobs?
> >>
> >>
> >> In general, the behavior would be aligned with Flink CLI. Triggering a
> >> savepoint for
> >> a non-running job would cause errors, and the error message would be
> >> printed to
> >> the SQL client. Triggering a savepoint for batch (unbounded) jobs in
> >> streaming
> >> execution mode would be the same as for streaming jobs. However, for batch
> >> jobs in
> >> batch execution mode, I think there would be an error, because batch
> >> execution
> >> doesn’t support checkpoints currently (please correct me if I’m wrong).
> >>
> >> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/
> >> jobClient doesn’t have such functionality at the moment, and neither does
> >> the Flink CLI.
> >> Maybe we could make it a follow-up FLIP, which includes the modifications
> >> to
> >> clusterClient/jobClient and Flink CLI. WDYT?
> >>
> >> Best,
> >> Paul Lam
> >>
> >>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
> >>>
> >>> Godfrey
> >>
> >>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Jark,

WRT “DROP QUERY”, I agree that it’s not very intuitive, and that’s
part of the reason why I proposed “STOP/CANCEL QUERY” at the
beginning. The downside of it is that it’s not ANSI-SQL compatible.

Another question is, what should be the syntax for ungracefully 
canceling a query? As ShengKai pointed out in an offline discussion,
“STOP QUERY” and “CANCEL QUERY” might confuse SQL users.
Flink CLI has both stop and cancel, mostly due to historical problems.

WRT “SHOW SAVEPOINT”, I agree it’s a missing part. My concern is
that savepoints are owned by users and beyond the lifecycle of a Flink
cluster. For example, a user might take a savepoint at a custom path
that’s different than the default savepoint path, I think jobmanager would
not remember that, not to mention the jobmanager may be a fresh new
one after a cluster restart. Thus if we support “SHOW SAVEPOINT”, it's 
probably a best-effort one.

WRT savepoint syntax, I’m thinking of the semantic of the savepoint.
Savepoints are aliases for nested transactions in the DB area[1], and there
are correspondingly global transactions. If we consider Flink jobs as
global transactions and Flink checkpoints as nested transactions,
then the savepoint semantics are close, thus I think savepoint syntax 
in SQL-standard could be considered. But again, I don’t have a very
strong preference.

Ping @Timo to get more inputs.

[1] https://en.wikipedia.org/wiki/Nested_transaction <https://en.wikipedia.org/wiki/Nested_transaction>

Best,
Paul Lam

> On May 18, 2022, at 17:48, Jark Wu <im...@gmail.com> wrote:
> 
> Hi Paul,
> 
> 1) SHOW QUERIES
> +1 to add finished time, but it would be better to call it "end_time" to
> keep aligned with names in Web UI.
> 
> 2) DROP QUERY
> I think we shouldn't throw exceptions for batch jobs, otherwise, how to
> stop batch queries?
> At present, I don't think "DROP" is a suitable keyword for this statement.
> From the perspective of users, "DROP" sounds like the query should be
> removed from the
> list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more
> suitable and
> compliant with commands of Flink CLI.
> 
> 3) SHOW SAVEPOINTS
> I think this statement is needed, otherwise, savepoints are lost after the
> SAVEPOINT
> command is executed. Savepoints can be retrieved from REST API
> "/jobs/:jobid/checkpoints"
> with filtering "checkpoint_type"="savepoint". It's also worth considering
> providing "SHOW CHECKPOINTS"
> to list all checkpoints.
> 
> 4) SAVEPOINT & RELEASE SAVEPOINT
> I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements
> now.
> In other vendors' systems, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both
> the same savepoint id.
> However, in our syntax, the first one is query id, and the second one is
> savepoint path, which is confusing and
> not consistent. When I came across SHOW SAVEPOINT, I thought maybe they
> should be in the same syntax set.
> For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
> <sp_path>.
> That means we don't follow the majority of vendors in SAVEPOINT commands. I
> would say the purpose is different in Flink.
> What are others' opinions on this?
> 
> Best,
> Jark
> 
> [1]:
> https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
> 
> 
> On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:
> 
>> Hi Godfrey,
>> 
>> Thanks a lot for your inputs!
>> 
>> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream
>> or SQL) or
>> clients (SQL client or CLI). Under the hood, it’s based on
>> ClusterClient#listJobs, the
>> same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in SQL
>> client, because
>> these jobs can be managed via SQL client too.
>> 
>> WRT finished time, I think you’re right. Adding it to the FLIP. But I’m a
>> bit afraid that the
>> rows would be too long.
>> 
>> WRT ‘DROP QUERY’,
>>> What's the behavior for batch jobs and the non-running jobs?
>> 
>> 
>> In general, the behavior would be aligned with Flink CLI. Triggering a
>> savepoint for
>> a non-running job would cause errors, and the error message would be
>> printed to
>> the SQL client. Triggering a savepoint for batch (unbounded) jobs in
>> streaming
>> execution mode would be the same as for streaming jobs. However, for batch
>> jobs in
>> batch execution mode, I think there would be an error, because batch
>> execution
>> doesn’t support checkpoints currently (please correct me if I’m wrong).
>> 
>> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/
>> jobClient doesn’t have such functionality at the moment, and neither does
>> the Flink CLI.
>> Maybe we could make it a follow-up FLIP, which includes the modifications
>> to
>> clusterClient/jobClient and Flink CLI. WDYT?
>> 
>> Best,
>> Paul Lam
>> 
>>> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
>>> 
>>> Godfrey
>> 
>> 


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jark Wu <im...@gmail.com>.
Hi Paul,

1) SHOW QUERIES
+1 to add finished time, but it would be better to call it "end_time" to
keep aligned with names in Web UI.

2) DROP QUERY
I think we shouldn't throw exceptions for batch jobs, otherwise, how to
stop batch queries?
At present, I don't think "DROP" is a suitable keyword for this statement.
From the perspective of users, "DROP" sounds like the query should be
removed from the
list of "SHOW QUERIES". However, it doesn't. Maybe "STOP QUERY" is more
suitable and
compliant with commands of Flink CLI.

3) SHOW SAVEPOINTS
I think this statement is needed, otherwise, savepoints are lost after the
SAVEPOINT
command is executed. Savepoints can be retrieved from REST API
"/jobs/:jobid/checkpoints"
with filtering "checkpoint_type"="savepoint". It's also worth considering
providing "SHOW CHECKPOINTS"
to list all checkpoints.

4) SAVEPOINT & RELEASE SAVEPOINT
I'm a little concerned with the SAVEPOINT and RELEASE SAVEPOINT statements
now.
In other vendors' systems, the parameters of SAVEPOINT and RELEASE SAVEPOINT are both
the same savepoint id.
However, in our syntax, the first one is query id, and the second one is
savepoint path, which is confusing and
not consistent. When I came across SHOW SAVEPOINT, I thought maybe they
should be in the same syntax set.
For example, CREATE SAVEPOINT FOR [QUERY] <query_id> & DROP SAVEPOINT
<sp_path>.
That means we don't follow the majority of vendors in SAVEPOINT commands. I
would say the purpose is different in Flink.
What are others' opinions on this?

Best,
Jark

[1]:
https://nightlies.apache.org/flink/flink-docs-master/docs/ops/rest_api/#jobs-jobid-checkpoints
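As a sketch of that filtering step, the reshaping could look like the snippet below; the payload field names ("history", "checkpoint_type", "external_path") are assumptions based on the usual shape of the checkpoints response, not a verified schema:

```python
# Sketch: pick savepoints out of a /jobs/:jobid/checkpoints response.
# The payload shape (history / checkpoint_type / external_path) is an
# assumption based on the REST API discussion above.

def list_savepoints(checkpoints_response):
    # Keep only history entries whose checkpoint_type marks a savepoint.
    return [
        cp["external_path"]
        for cp in checkpoints_response.get("history", [])
        if cp.get("checkpoint_type") == "SAVEPOINT"
    ]

sample = {
    "history": [
        {"id": 1, "checkpoint_type": "CHECKPOINT",
         "external_path": "hdfs:///flink-checkpoints/chk-1"},
        {"id": 2, "checkpoint_type": "SAVEPOINT",
         "external_path": "hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab"},
    ]
}

print(list_savepoints(sample))
# ['hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab']
```

A "SHOW CHECKPOINTS" variant would simply drop the checkpoint_type filter.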


On Wed, 18 May 2022 at 14:43, Paul Lam <pa...@gmail.com> wrote:

> Hi Godfrey,
>
> Thanks a lot for your inputs!
>
> 'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream
> or SQL) or
> clients (SQL client or CLI). Under the hood, it’s based on
> ClusterClient#listJobs, the
> same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in SQL
> client, because
> these jobs can be managed via SQL client too.
>
> WRT finished time, I think you’re right. Adding it to the FLIP. But I’m a
> bit afraid that the
> rows would be too long.
>
> WRT ‘DROP QUERY’,
> > What's the behavior for batch jobs and the non-running jobs?
>
>
> In general, the behavior would be aligned with Flink CLI. Triggering a
> savepoint for
> a non-running job would cause errors, and the error message would be
> printed to
> the SQL client. Triggering a savepoint for batch (unbounded) jobs in
> streaming
> execution mode would be the same as for streaming jobs. However, for batch
> jobs in
> batch execution mode, I think there would be an error, because batch
> execution
> doesn’t support checkpoints currently (please correct me if I’m wrong).
>
> WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/
> jobClient doesn’t have such a functionality at the moment, neither do
> Flink CLI.
> Maybe we could make it a follow-up FLIP, which includes the modifications
> to
> clusterClient/jobClient and Flink CLI. WDYT?
>
> Best,
> Paul Lam
>
> > On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
> >
> > Godfrey
>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Godfrey,

Thanks a lot for your inputs!

'SHOW QUERIES' lists all jobs in the cluster, no limit on APIs (DataStream or SQL) or 
clients (SQL client or CLI). Under the hood, it’s based on ClusterClient#listJobs, the
same as the Flink CLI. I think it’s okay to have non-SQL jobs listed in the SQL client, because
these jobs can be managed via the SQL client too.
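For illustration only, SHOW QUERIES could be a thin reshaping over ClusterClient#listJobs-style job summaries; the input record fields and the column set below are assumptions for the sketch, not the FLIP's final schema:

```python
# Sketch: turn job summaries (as a ClusterClient#listJobs call might yield
# them) into SHOW QUERIES-style rows. Field and column names are assumed.
from datetime import datetime, timezone

def to_show_queries_rows(job_summaries):
    rows = []
    for job in job_summaries:
        # Flink timestamps are epoch milliseconds; render as ISO-8601 UTC.
        start = datetime.fromtimestamp(job["start_time"] / 1000, tz=timezone.utc)
        rows.append((job["job_id"], job["job_name"], job["state"],
                     start.isoformat()))
    return rows

jobs = [{"job_id": "cca7bc", "job_name": "insert-into_sink",
         "state": "RUNNING", "start_time": 1652860800000}]
for row in to_show_queries_rows(jobs):
    print(row)
```

An "end_time" column for finished jobs would be one more assumed field rendered the same way.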

WRT finished time, I think you’re right. Adding it to the FLIP. But I’m a bit afraid that the
rows would be too long.

WRT ‘DROP QUERY’,
> What's the behavior for batch jobs and the non-running jobs?


In general, the behavior would be aligned with Flink CLI. Triggering a savepoint for 
a non-running job would cause errors, and the error message would be printed to
the SQL client. Triggering a savepoint for batch (unbounded) jobs in streaming
execution mode would be the same as for streaming jobs. However, for batch jobs in
batch execution mode, I think there would be an error, because batch execution
doesn’t support checkpoints currently (please correct me if I’m wrong).

WRT ’SHOW SAVEPOINTS’, I’ve thought about it, but Flink clusterClient/
jobClient doesn’t have such functionality at the moment, and neither does the Flink CLI.
Maybe we could make it a follow-up FLIP, which includes the modifications to 
clusterClient/jobClient and Flink CLI. WDYT?

Best,
Paul Lam

> On May 17, 2022, at 20:34, godfrey he <go...@gmail.com> wrote:
> 
> Godfrey


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by godfrey he <go...@gmail.com>.
Hi Paul,

Thanks for driving this, LGTM overall.

I have a few minor comments:

>SHOW QUERIES
I want to clarify the scope of the command: does it show the
queries submitted
via the SQL Client, or all queries in the current cluster (submitted via other clients)?
Are history queries included? What's the behavior for a per-job cluster?

The result should contain a 'finish_time' field, which is friendlier
for batch jobs.

>DROP QUERY '<query_id>'
What's the behavior for batch jobs and the non-running jobs?

>SAVEPOINT '<query_id>'
+1 to align with the SQL standard.
What's the behavior for batch jobs?

SHOW SAVEPOINTS is missing.

* Table API
+1 to introduce the API in Table API

Best,
Godfrey

On Wed, May 11, 2022 at 19:20, Paul Lam <pa...@gmail.com> wrote:
>
> Hi Jark,
>
> Thanks a lot for your opinions and suggestions! Please see my replies inline.
>
> > 1) the display of savepoint_path
>
>
> Agreed. Adding it to the FLIP.
>
> > 2) Please make a decision on multiple options in the FLIP.
>
> Okay. I’ll keep one and move the other to the rejected alternatives section.
>
> > 4) +1 SHOW QUERIES
> > Btw, the displayed column "address" is a little confusing to me.
> > At the first glance, I'm not sure what address it is, JM RPC address? JM REST address? Gateway address?
> > If this is a link to the job's web UI URL, how about calling it "web_url" and display in
> > "http://<hostname>:<port>" format?
> > Besides, how about displaying "startTime" or "uptime" as well?
>
> I’m good with these changes. Updating the FLIP according to your suggestions.
>
> > 5) STOP/CANCEL QUERY vs DROP QUERY
> > I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e., "SHOW/CREATE/DROP".
> > Separating STOP and CANCEL confuses users a lot what are the differences between them.
> > I'm +1 to add the "PURGE" keyword to the DROP QUERY statement, which indicates to stop query without savepoint.
> > Note that, PURGE doesn't mean stop with --drain flag. The drain flag will flush all the registered timers
> > and windows which could lead to incorrect results when the job is resumed. I think the drain flag is rarely used
> > (please correct me if I'm wrong), therefore, I suggest moving this feature into future work when the needs are clear.
>
> I’m +1 to representing ungraceful cancel with PURGE. I think the --drain flag is not used very often, as you said, and we
> could just add a table config option to enable that flag.
>
> > 7) <query_id> and <savepoint_path> should be quoted
> > All the <query_id> and <savepoint_path> should be string literal, otherwise it's hard to parse them.
> > For example, STOP QUERY '<query_id>’.
>
> Good point! Adding it to the FLIP.
>
> > 8) Examples
> > Could you add an example that consists of all the statements to show how to manage the full lifecycle of queries?
> > Including show queries, create savepoint, remove savepoint, stop query with a savepoint, and restart query with savepoint.
>
> Agreed. Adding it to the FLIP as well.
>
> Best,
> Paul Lam
>
> > On May 7, 2022, at 18:22, Jark Wu <im...@gmail.com> wrote:
> >
> > Hi Paul,
> >
> > I think this FLIP has already in a good shape. I just left some additional thoughts:
> >
> > 1) the display of savepoint_path
> > Could the displayed savepoint_path include the scheme part?
> > E.g. `hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab`
> > IIUC, the scheme part is omitted when it's a local filesystem.
> > But the behavior would be clearer if including the scheme part in the design doc.
> >
> > 2) Please make a decision on multiple options in the FLIP.
> > It might give the impression that we will support all the options.
> >
> > 3) +1 SAVEPOINT and RELEASE SAVEPOINT
> > Personally, I also prefer "SAVEPOINT <query_id>" and "RELEASE SAVEPOINT <savepoint_path>"
> > to "CREATE/DROP SAVEPOINT", as they have been used in mature databases.
> >
> > 4) +1 SHOW QUERIES
> > Btw, the displayed column "address" is a little confusing to me.
> > At the first glance, I'm not sure what address it is, JM RPC address? JM REST address? Gateway address?
> > If this is a link to the job's web UI URL, how about calling it "web_url" and display in
> > "http://<hostname>:<port>" format?
> > Besides, how about displaying "startTime" or "uptime" as well?
> >
> > 5) STOP/CANCEL QUERY vs DROP QUERY
> > I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e., "SHOW/CREATE/DROP".
> > Separating STOP and CANCEL confuses users a lot what are the differences between them.
> > I'm +1 to add the "PURGE" keyword to the DROP QUERY statement, which indicates to stop query without savepoint.
> > Note that, PURGE doesn't mean stop with --drain flag. The drain flag will flush all the registered timers
> > and windows which could lead to incorrect results when the job is resumed. I think the drain flag is rarely used
> > (please correct me if I'm wrong), therefore, I suggest moving this feature into future work when the needs are clear.
> >
> > 6) Table API
> > I think it makes sense to support the new statements in Table API.
> > We should try to make the Gateway and CLI simple, so they just forward statements to the underlying TableEnvironment.
> > JAR statements are being re-implemented in Table API as well, see FLIP-214[1].
> >
> > 7) <query_id> and <savepoint_path> should be quoted
> > All the <query_id> and <savepoint_path> should be string literal, otherwise it's hard to parse them.
> > For example, STOP QUERY '<query_id>'.
> >
> > 8) Examples
> > Could you add an example that consists of all the statements to show how to manage the full lifecycle of queries?
> > Including show queries, create savepoint, remove savepoint, stop query with a savepoint, and restart query with savepoint.
> >
> > Best,
> > Jark
> >
> > [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode <https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode>
> >
> >
> > On Fri, 6 May 2022 at 19:13, Martijn Visser <martijnvisser@apache.org <ma...@apache.org>> wrote:
> > Hi Paul,
> >
> > Great that you could find something in the SQL standard! I'll try to read the FLIP once more completely next week to see if I have any more concerns.
> >
> > Best regards,
> >
> > Martijn
> >
> > On Fri, 6 May 2022 at 08:21, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
> > I had a look at SQL-2016 that Martijn mentioned, and found that
> > maybe we could follow the transaction savepoint syntax.
> > SAVEPOINT <savepoint specifier>
> > RELEASE SAVEPOINT <savepoint specifier>
> > These savepoint statements are supported in lots of databases, like
> > Oracle[1], PG[2], MariaDB[3].
> >
> > They’re usually used in the middle of a SQL transaction, so the target
> > would be the current transaction. But if used in a Flink SQL session, we
> > need to add a JOB/QUERY id when creating a savepoint, thus the syntax
> > would be:
> > SAVEPOINT <job/query id> <savepoint path>
> > RELEASE SAVEPOINT <savepoint path>
> > I’m adding it as an alternative in the FLIP.
> >
> > [1] https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm <https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm>
> > [2] https://www.postgresql.org/docs/current/sql-savepoint.html <https://www.postgresql.org/docs/current/sql-savepoint.html>
> > [3] https://mariadb.com/kb/en/savepoint/ <https://mariadb.com/kb/en/savepoint/>
> >
> > Best,
> > Paul Lam
> >
> >> On May 4, 2022, at 16:42, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
> >>
> >> Hi Shengkai,
> >>
> >> Thanks a lot for your input!
> >>
> >> > I just wonder how the users can get the web ui in the application mode.
> >> Therefore, it's better we can list the Web UI using the SHOW statement.
> >> WDYT?
> >>
> >> I think it's a valid approach. I'm adding it to the FLIP.
> >>
> >> > After the investigation, I am fine with the QUERY but the keyword JOB is
> >> also okay to me.
> >>
> >> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
> >> while the former shows the active running queries and the latter shows the
> >> background tasks like schema changes. FYI.
> >>
> >> WRT the questions:
> >>
> >> > 1. Could you add some details about the behaviour with the different
> >> execution.target, e.g. session, application mode?
> >>
> >> IMHO, the difference between different `execution.target` is mostly about
> >> cluster startup, which has little relation with the proposed statements.
> >> These statements rely on the current ClusterClient/JobClient API,
> >> which is deployment mode agnostic. Canceling a job in an application
> >> cluster is the same as in a session cluster.
> >>
> >> BTW, application mode is still in the development progress ATM [3].
> >>
> >> > 2. Considering the SQL Client/Gateway is not limited to submitting the job
> >> to the specified cluster, is it able to list jobs in the other clusters?
> >>
> >> I think multi-cluster support in SQL Client/Gateway should be aligned with
> >> CLI, at least at the early phase. We may use SET  to set a cluster id for a
> >> session, then we have access to the cluster. However,  every SHOW
> >> statement would only involve one cluster.
> >>
> >> Best,
> >> Paul Lam
> >>
> >> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html <https://www.cockroachlabs.com/docs/stable/show-statements.html>
> >> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs <https://www.cockroachlabs.com/docs/v21.2/show-jobs>
> >> [3] https://issues.apache.org/jira/browse/FLINK-26541 <https://issues.apache.org/jira/browse/FLINK-26541>
> >> On Fri, Apr 29, 2022 at 15:36, Shengkai Fang <fskmine@gmail.com <ma...@gmail.com>> wrote:
> >> Hi.
> >>
> >> Thanks for Paul's update.
> >>
> >> > It's better we can also get the infos about the cluster where the job is
> >> > running through the DESCRIBE statement.
> >>
> >> I just wonder how the users can get the web ui in the application mode.
> >> Therefore, it's better we can list the Web UI using the SHOW statement.
> >> WDYT?
> >>
> >>
> >> > QUERY or other keywords.
> >>
> >> I list the statement to manage the lifecycle of the query/dml in other
> >> systems:
> >>
> >> Mysql[1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
> >> to kill the query.
> >>
> >> ```
> >> mysql> SHOW PROCESSLIST;
> >>
> >> mysql> KILL 27;
> >> ```
> >>
> >>
> >> Postgres use the following statements to kill the queries.
> >>
> >> ```
> >> SELECT pg_cancel_backend(<pid of the process>)
> >>
> >> SELECT pg_terminate_backend(<pid of the process>)
> >> ```
> >>
> >> KSQL uses the following commands to control the query lifecycle[4].
> >>
> >> ```
> >> SHOW QUERIES;
> >>
> >> TERMINATE <query id>;
> >>
> >> ```
> >>
> >> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html <https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html>
> >> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/ <https://scaledynamix.com/blog/how-to-kill-mysql-queries/>
> >> [3]
> >> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql <https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql>
> >> [4]
> >> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/ <https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/>
> >> [5]
> >> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/ <https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/>
> >>
> >> After the investigation, I am fine with the QUERY but the keyword JOB is
> >> also okay to me.
> >>
> >> We also have two questions here.
> >>
> >> 1. Could you add some details about the behaviour with the different
> >> execution.target, e.g. session, application mode?
> >>
> >> 2. Considering the SQL Client/Gateway is not limited to submitting the job
> >> to the specified cluster, is it able to list jobs in the other clusters?
> >>
> >>
> >> Best,
> >> Shengkai
> >>
> >> On Thu, Apr 28, 2022 at 17:17, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
> >>
> >> > Hi Martijn,
> >> >
> >> > Thanks a lot for your reply! I agree that the scope may be a bit confusing,
> >> > please let me clarify.
> >> >
> >> > The FLIP aims to add new SQL statements that are supported only in
> >> > sql-client, similar to
> >> > jar statements [1]. Jar statements can be parsed into jar operations, which
> >> > are used only in
> >> > CliClient in sql-client module and cannot be executed by TableEnvironment
> >> > (not available in
> >> > Table API program that contains SQL that you mentioned).
> >> >
> >> > WRT the unchanged CLI client, I mean CliClient instead of the sql-client
> >> > module, which
> >> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
> >> > extends
> >> > the gateway part, and barely touches CliClient and REST server (REST
> >> > endpoint in FLIP-91).
> >> >
> >> > WRT the syntax, I don't have much experience with SQL standards, and I'd
> >> > like to hear
> >> > more opinions from the community. I prefer Hive-style syntax because I
> >> > think many users
> >> > are familiar with Hive, and there're on-going efforts to improve Flink-Hive
> >> > integration [2][3].
> >> > But my preference is not strong, I'm okay with other options too. Do you
> >> > think JOB/Task is
> >> > a good choice, or do you have other preferred keywords?
> >> >
> >> > [1]
> >> >
> >> > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/ <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/>
> >> > [2]
> >> >
> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility <https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility>
> >> > [3]
> >> >
> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint <https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint>
> >> >
> >> > Best,
> >> > Paul Lam
> >> >
> >> > Martijn Visser <martijnvisser@apache.org <ma...@apache.org>> 于2022年4月26日周二 20:14写道:
> >> >
> >> > > Hi Paul,
> >> > >
> >> > > Thanks for creating the FLIP and opening the discussion. I did get a bit
> >> > > confused about the title, being "query lifecycle statements in SQL
> >> > client".
> >> > > This sounds like you want to adopt the SQL client, but you want to expand
> >> > > the SQL syntax with lifecycle statements, which could be used from the
> >> > SQL
> >> > > client, but of course also in a Table API program that contains SQL.
> >> > GIven
> >> > > that you're highlighting the CLI client as unchanged, this adds to more
> >> > > confusion.
> >> > >
> >> > > I am interested if there's anything listed in the SQL 2016 standard on
> >> > > these types of lifecycle statements. I did a quick scan for "SHOW
> >> > QUERIES"
> >> > > but couldn't find it. It would be great if we could stay as close as
> >> > > possible to such syntax. Overall I'm not in favour of using QUERIES as a
> >> > > keyword. I think Flink applications are not queries, but short- or long
> >> > > running applications. Why should we follow Hive's setup and indeed not
> >> > > others such as Snowflake, but also Postgres or MySQL?
> >> > >
> >> > > Best regards,
> >> > >
> >> > > Martijn Visser
> >> > > https://twitter.com/MartijnVisser82 <https://twitter.com/MartijnVisser82>
> >> > > https://github.com/MartijnVisser <https://github.com/MartijnVisser>
> >> > >
> >> > >
> >> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
> >> > >
> >> > > > Hi Shengkai,
> >> > > >
> >> > > > Thanks a lot for your opinions!
> >> > > >
> >> > > > > 1. I think the keyword QUERY may confuse users because the statement
> >> > > also
> >> > > > > works for the DML statement.
> >> > > >
> >> > > > I slightly lean to QUERY, because:
> >> > > >
> >> > > > Hive calls DMLs queries. We could be better aligned with Hive using
> >> > > QUERY,
> >> > > > especially given that we plan to introduce Hive endpoint.
> >> > > > QUERY is a more SQL-like concept and friendly to SQL users.
> >> > > >
> >> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB, but
> >> > not
> >> > > > very good with TASK, as it conflicts with the task concept in Flink
> >> > > runtime.
> >> > > >
> >> > > > We could wait for more feedbacks from the community.
> >> > > >
> >> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> >> > > terminate
> >> > > > > their jobs.
> >> > > >
> >> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
> >> > might
> >> > > > an alternative.
> >> > > >
> >> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
> >> > > >
> >> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax on
> >> > the
> >> > > > FLIP.
> >> > > >
> >> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to get
> >> > more
> >> > > > > detailed job infos.
> >> > > >
> >> > > > That is a more SQL-like approach I think. But considering the
> >> > > > ClusterClient APIs, we can fetch the names and the status along in one
> >> > > > request,
> >> > > > thus it may be more user friendly to return them all in the SHOW
> >> > > > statement?
> >> > > >
> >> > > > > It's better we can also get the infos about the cluster where the job
> >> > > is
> >> > > > > running on through the DESCRIBE statement.
> >> > > >
> >> > > > I think cluster info could be part of session properties instead. WDYT?
> >> > > >
> >> > > > Best,
> >> > > > Paul Lam
> >> > > >
> >> > > > > 2022年4月22日 11:14,Shengkai Fang <fskmine@gmail.com <ma...@gmail.com>> 写道:
> >> > > > >
> >> > > > > Hi Paul
> >> > > > >
> >> > > > > Sorry for the late response. I propose my thoughts here.
> >> > > > >
> >> > > > > 1. I think the keyword QUERY may confuse users because the statement
> >> > > also
> >> > > > > works for the DML statement. I find the Snowflakes[1] supports
> >> > > > >
> >> > > > > - CREATE TASK
> >> > > > > - DROP TASK
> >> > > > > - ALTER TASK
> >> > > > > - SHOW TASKS
> >> > > > > - DESCRIPE TASK
> >> > > > >
> >> > > > > I think we can follow snowflake to use `TASK` as the keyword or use
> >> > the
> >> > > > > keyword `JOB`?
> >> > > > >
> >> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> >> > > terminate
> >> > > > > their jobs.
> >> > > > >
> >> > > > > ```
> >> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcely stop the job with
> >> > > drain
> >> > > > >
> >> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
> >> > > > > ```
> >> > > > >
> >> > > > > Oracle[2] uses the PURGE to clean up the table and users can't not
> >> > > > recover.
> >> > > > > I think it also works for us to terminate the job permanently.
> >> > > > >
> >> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like. Users
> >> > can
> >> > > > use
> >> > > > > the
> >> > > > >
> >> > > > > ```
> >> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
> >> > > > >  SET 'state.savepoints.fomat' = 'native';
> >> > > > >  CREATE SAVEPOINT <job id>;
> >> > > > >
> >> > > > >  DROP SAVEPOINT <path_to_savepoint>;
> >> > > > > ```
> >> > > > >
> >> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to get
> >> > more
> >> > > > > detailed job infos.
> >> > > > >
> >> > > > > ```
> >> > > > >
> >> > > > > SHOW TASKS;
> >> > > > >
> >> > > > >
> >> > > > > +----------------------------------+
> >> > > > > |            job_id                |
> >> > > > > +----------------------------------+
> >> > > > > | 0f6413c33757fbe0277897dd94485f04 |
> >> > > > > +----------------------------------+
> >> > > > >
> >> > > > > DESCRIPE TASK <job id>;
> >> > > > >
> >> > > > > +------------------------
> >> > > > > |  job name   | status  |
> >> > > > > +------------------------
> >> > > > > | insert-sink | running |
> >> > > > > +------------------------
> >> > > > >
> >> > > > > ```
> >> > > > > It's better we can also get the infos about the cluster where the job
> >> > > is
> >> > > > > running on through the DESCRIBE statement.
> >> > > > >
> >> > > > >
> >> > > > > [1]
> >> > > > >
> >> > > >
> >> > >
> >> > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management <https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management>
> >> > > > <
> >> > > >
> >> > >
> >> > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management <https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management>
> >> > > > >
> >> > > > > [2]
> >> > > > >
> >> > > >
> >> > >
> >> > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806 <https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806>
> >> > > > <
> >> > > >
> >> > >
> >> > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806 <https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806>
> >> > > > >
> >> > > > >
> >> > > > > Paul Lam <paullin3280@gmail.com <ma...@gmail.com> <mailto:paullin3280@gmail.com <ma...@gmail.com>>>
> >> > > > 于2022年4月21日周四 10:36写道:
> >> > > > >
> >> > > > >> ping @Timo @Jark @Shengkai
> >> > > > >>
> >> > > > >> Best,
> >> > > > >> Paul Lam
> >> > > > >>
> >> > > > >>> 2022年4月18日 17:12,Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> 写道:
> >> > > > >>>
> >> > > > >>> Hi team,
> >> > > > >>>
> >> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds query
> >> > > > >> lifecycle
> >> > > > >>> statements to SQL client.
> >> > > > >>>
> >> > > > >>> Currently, SQL client supports submitting queries (queries in a
> >> > broad
> >> > > > >> sense,
> >> > > > >>> including DQLs and DMLs) but no further lifecycle statements, like
> >> > > > >> canceling
> >> > > > >>> a query or triggering a savepoint. That makes SQL users have to
> >> > rely
> >> > > on
> >> > > > >>> CLI or REST API to manage theirs queries.
> >> > > > >>>
> >> > > > >>> Thus, I propose to introduce the following statements to fill the
> >> > > gap.
> >> > > > >>> SHOW QUERIES
> >> > > > >>> STOP QUERY <query_id>
> >> > > > >>> CANCEL QUERY <query_id>
> >> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
> >> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
> >> > > > >>> These statement would align SQL client with CLI, providing the full
> >> > > > >> lifecycle
> >> > > > >>> management for queries/jobs.
> >> > > > >>>
> >> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
> >> > > > >>> (For reference, the previous discussion thread see [2].)
> >> > > > >>>
> >> > > > >>> [1]
> >> > > > >>
> >> > > >
> >> > >
> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client <https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client>
> >> > > > >> <
> >> > > > >>
> >> > > >
> >> > >
> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client <https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client>
> >> > > > <
> >> > > >
> >> > >
> >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client <https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client>
> >> > > > >
> >> > > > >>>
> >> > > > >>> [2]
> >> > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb <https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb>
> >> > > <
> >> > > > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb <https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb>> <
> >> > > > >> https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb <https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb> <
> >> > > > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb <https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb>>>
> >> > > > >>>
> >> > > > >>> Best,
> >> > > > >>> Paul Lam
> >> > > >
> >> > > >
> >> > >
> >> >
> >
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
Hi Jark,

Thanks a lot for your opinions and suggestions! Please see my replies inline. 

> 1) the display of savepoint_path


Agreed. Adding it to the FLIP.

> 2) Please make a decision on multiple options in the FLIP.

Okay. I’ll keep one and move the other to the rejected alternatives section.

> 4) +1 SHOW QUERIES
> Btw, the displayed column "address" is a little confusing to me. 
> At the first glance, I'm not sure what address it is, JM RPC address? JM REST address? Gateway address?
> If this is a link to the job's web UI URL, how about calling it "web_url" and display in 
> "http://<hostname>:<port>" format?
> Besides, how about displaying "startTime" or "uptime" as well?

I’m good with these changes. Updating the FLIP according to your suggestions.

> 5) STOP/CANCEL QUERY vs DROP QUERY
> I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e., "SHOW/CREATE/DROP". 
> Separating STOP and CANCEL confuses users a lot what are the differences between them. 
> I'm +1 to add the "PURGE" keyword to the DROP QUERY statement, which indicates to stop query without savepoint. 
> Note that, PURGE doesn't mean stop with --drain flag. The drain flag will flush all the registered timers 
> and windows which could lead to incorrect results when the job is resumed. I think the drain flag is rarely used 
> (please correct me if I'm wrong), therefore, I suggest moving this feature into future work when the needs are clear. 

I’m +1 to representing an ungraceful cancel with PURGE. I think the --drain flag is not used
very often, as you said, and we could just add a table config option to enable that flag.
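To make that concrete, a hedged sketch of how such an option could look (the option name below is purely illustrative and not part of the FLIP):

```
-- Hypothetical option name, for illustration only.
SET 'table.job.stop-with-drain' = 'true';

-- With the option enabled, stopping a query with a savepoint would also
-- flush registered timers and windows (the --drain semantics).
DROP QUERY '228d70913eab60dda85c5e7f78b5782c';
```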

> 7) <query_id> and <savepoint_path> should be quoted
> All the <query_id> and <savepoint_path> should be string literal, otherwise it's hard to parse them.
> For example, STOP QUERY '<query_id>’.

Good point! Adding it to the FLIP.

> 8) Examples
> Could you add an example that consists of all the statements to show how to manage the full lifecycle of queries? 
> Including show queries, create savepoint, remove savepoint, stop query with a savepoint, and restart query with savepoint. 

Agreed. Adding it to the FLIP as well.
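For reference, a sketch of what such an end-to-end example might look like (the job id, savepoint path, and exact keywords are illustrative and still subject to the open syntax discussion):

```
-- List running queries and note the query id.
SHOW QUERIES;

-- Create a savepoint for the query.
SAVEPOINT '228d70913eab60dda85c5e7f78b5782c';

-- Stop the query with a savepoint (PURGE would stop it without one).
DROP QUERY '228d70913eab60dda85c5e7f78b5782c';

-- Restart the query from the savepoint.
SET 'execution.savepoint.path' = 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
INSERT INTO sink_table SELECT * FROM source_table;

-- Remove the savepoint once it is no longer needed.
RELEASE SAVEPOINT 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
```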

Best,
Paul Lam

> 2022年5月7日 18:22,Jark Wu <im...@gmail.com> 写道:
> 
> Hi Paul, 
> 
> I think this FLIP has already in a good shape. I just left some additional thoughts: 
> 
> 1) the display of savepoint_path
> Could the displayed savepoint_path include the scheme part? 
> E.g. `hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab`
> IIUC, the scheme part is omitted when it's a local filesystem. 
> But the behavior would be clearer if including the scheme part in the design doc. 
> 
> 2) Please make a decision on multiple options in the FLIP.
> It might give the impression that we will support all the options. 
> 
> 3) +1 SAVEPOINT and RELEASE SAVEPOINT
> Personally, I also prefer "SAVEPOINT <query_id>" and "RELEASE SAVEPOINT <savepoint_path>" 
> to "CREATE/DROP SAVEPOINT", as they have been used in mature databases.
> 
> 4) +1 SHOW QUERIES
> Btw, the displayed column "address" is a little confusing to me. 
> At the first glance, I'm not sure what address it is, JM RPC address? JM REST address? Gateway address?
> If this is a link to the job's web UI URL, how about calling it "web_url" and display in 
> "http://<hostname>:<port>" format?
> Besides, how about displaying "startTime" or "uptime" as well?
> 
> 5) STOP/CANCEL QUERY vs DROP QUERY
> I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e., "SHOW/CREATE/DROP". 
> Separating STOP and CANCEL confuses users a lot what are the differences between them. 
> I'm +1 to add the "PURGE" keyword to the DROP QUERY statement, which indicates to stop query without savepoint. 
> Note that, PURGE doesn't mean stop with --drain flag. The drain flag will flush all the registered timers 
> and windows which could lead to incorrect results when the job is resumed. I think the drain flag is rarely used 
> (please correct me if I'm wrong), therefore, I suggest moving this feature into future work when the needs are clear. 
> 
> 6) Table API
> I think it makes sense to support the new statements in Table API. 
> We should try to make the Gateway and CLI simple which just forward statement to the underlying TableEnvironemnt. 
> JAR statements are being re-implemented in Table API as well, see FLIP-214[1].
> 
> 7) <query_id> and <savepoint_path> should be quoted
> All the <query_id> and <savepoint_path> should be string literal, otherwise it's hard to parse them.
> For example, STOP QUERY '<query_id>'.
> 
> 8) Examples
> Could you add an example that consists of all the statements to show how to manage the full lifecycle of queries? 
> Including show queries, create savepoint, remove savepoint, stop query with a savepoint, and restart query with savepoint. 
> 
> Best,
> Jark
> 
> [1]: https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode <https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode>
> 
> 
> On Fri, 6 May 2022 at 19:13, Martijn Visser <martijnvisser@apache.org <ma...@apache.org>> wrote:
> Hi Paul,
> 
> Great that you could find something in the SQL standard! I'll try to read the FLIP once more completely next week to see if I have any more concerns.
> 
> Best regards,
> 
> Martijn
> 
> On Fri, 6 May 2022 at 08:21, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
> I had a look at SQL-2016 that Martijn mentioned, and found that
> maybe we could follow the transaction savepoint syntax.
> SAVEPOINT <savepoint specifier>
> RELEASE SAVEPOINT <savepoint specifier>
> These savepoint statements are supported in lots of databases, like 
> Oracle[1], PG[2], MariaDB[3].
> 
They’re usually used in the middle of a SQL transaction, so the target 
would be the current transaction. But if used in a Flink SQL session, we 
need to add a JOB/QUERY id when creating a savepoint, thus the syntax 
would be:
> SAVEPOINT <job/query id> <savepoint path>
> RELEASE SAVEPOINT <savepoint path>
> I’m adding it as an alternative in the FLIP.
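A hedged sketch of the proposal above in use (the job id and path are illustrative):

```
-- Transaction-style syntax adapted to Flink, following the form above:
SAVEPOINT '228d70913eab60dda85c5e7f78b5782c' 'hdfs:///flink-savepoints/savepoint-cca7bc';
RELEASE SAVEPOINT 'hdfs:///flink-savepoints/savepoint-cca7bc';
```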
> 
[1] https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
[2] https://www.postgresql.org/docs/current/sql-savepoint.html
[3] https://mariadb.com/kb/en/savepoint/
> 
> Best,
> Paul Lam
> 
>> 2022年5月4日 16:42,Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> 写道:
>> 
>> Hi Shengkai,
>> 
>> Thanks a lot for your input!
>> 
>> > I just wonder how the users can get the web ui in the application mode.
>> Therefore, it's better we can list the Web UI using the SHOW statement.
>> WDYT?
>> 
>> I think it's a valid approach. I'm adding it to the FLIP.
>> 
>> > After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>> 
>> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
>> while the former shows the active running queries and the latter shows the 
>> background tasks like schema changes. FYI.
>> 
>> WRT the questions:
>> 
>> > 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>> 
>> IMHO, the difference between different `execution.target` is mostly about
>> cluster startup, which has little relation with the proposed statements.
>> These statements rely on the current ClusterClient/JobClient API, 
>> which is deployment mode agnostic. Canceling a job in an application 
>> cluster is the same as in a session cluster. 
>> 
>> BTW, application mode is still in the development progress ATM [3].
>> 
>> > 2. Considering the SQL Client/Gateway is not limited to submitting the job
>> to the specified cluster, is it able to list jobs in the other clusters?
>> 
>> I think multi-cluster support in SQL Client/Gateway should be aligned with
>> CLI, at least at the early phase. We may use SET  to set a cluster id for a 
>> session, then we have access to the cluster. However,  every SHOW 
>> statement would only involve one cluster.
>> 
>> Best,
>> Paul Lam
>> 
>> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html <https://www.cockroachlabs.com/docs/stable/show-statements.html>
>> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs <https://www.cockroachlabs.com/docs/v21.2/show-jobs>
>> [3] https://issues.apache.org/jira/browse/FLINK-26541 <https://issues.apache.org/jira/browse/FLINK-26541>
>> Shengkai Fang <fskmine@gmail.com <ma...@gmail.com>> 于2022年4月29日周五 15:36写道:
>> Hi.
>> 
>> Thanks for Paul's update.
>> 
>> > It's better we can also get the infos about the cluster where the job is
>> > running through the DESCRIBE statement.
>> 
>> I just wonder how the users can get the web ui in the application mode.
>> Therefore, it's better we can list the Web UI using the SHOW statement.
>> WDYT?
>> 
>> 
>> > QUERY or other keywords.
>> 
>> I list the statement to manage the lifecycle of the query/dml in other
>> systems:
>> 
>> Mysql[1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
>> to kill the query.
>> 
>> ```
>> mysql> SHOW PROCESSLIST;
>> 
>> mysql> KILL 27;
>> ```
>> 
>> 
>> Postgres use the following statements to kill the queries.
>> 
>> ```
>> SELECT pg_cancel_backend(<pid of the process>)
>> 
>> SELECT pg_terminate_backend(<pid of the process>)
>> ```
>> 
>> KSQL uses the following commands to control the query lifecycle[4].
>> 
>> ```
>> SHOW QUERIES;
>> 
>> TERMINATE <query id>;
>> 
>> ```
>> 
>> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html <https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html>
>> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/ <https://scaledynamix.com/blog/how-to-kill-mysql-queries/>
>> [3]
>> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql <https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql>
>> [4]
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/ <https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/>
>> [5]
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/ <https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/>
>> 
>> After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>> 
>> We also have two questions here.
>> 
>> 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>> 
>> 2. Considering the SQL Client/Gateway is not limited to submitting the job
>> to the specified cluster, is it able to list jobs in the other clusters?
>> 
>> 
>> Best,
>> Shengkai
>> 
>> Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> 于2022年4月28日周四 17:17写道:
>> 
>> > Hi Martjin,
>> >
>> > Thanks a lot for your reply! I agree that the scope may be a bit confusing,
>> > please let me clarify.
>> >
>> > The FLIP aims to add new SQL statements that are supported only in
>> > sql-client, similar to
>> > jar statements [1]. Jar statements can be parsed into jar operations, which
>> > are used only in
>> > CliClient in sql-client module and cannot be executed by TableEnvironment
>> > (not available in
>> > Table API program that contains SQL that you mentioned).
>> >
>> > WRT the unchanged CLI client, I mean CliClient instead of the sql-client
>> > module, which
>> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
>> > extends
>> > the gateway part, and barely touches CliClient and REST server (REST
>> > endpoint in FLIP-91).
>> >
>> > WRT the syntax, I don't have much experience with SQL standards, and I'd
>> > like to hear
>> > more opinions from the community. I prefer Hive-style syntax because I
>> > think many users
>> > are familiar with Hive, and there're on-going efforts to improve Flink-Hive
>> > integration [2][3].
>> > But my preference is not strong, I'm okay with other options too. Do you
>> > think JOB/Task is
>> > a good choice, or do you have other preferred keywords?
>> >
>> > [1]
>> >
>> > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/ <https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/>
>> > [2]
>> >
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility <https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility>
>> > [3]
>> >
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint <https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint>
>> >
>> > Best,
>> > Paul Lam
>> >
>> > Martijn Visser <martijnvisser@apache.org <ma...@apache.org>> 于2022年4月26日周二 20:14写道:
>> >
>> > > Hi Paul,
>> > >
>> > > Thanks for creating the FLIP and opening the discussion. I did get a bit
>> > > confused about the title, being "query lifecycle statements in SQL
>> > client".
>> > > This sounds like you want to adopt the SQL client, but you want to expand
>> > > the SQL syntax with lifecycle statements, which could be used from the
>> > SQL
>> > > client, but of course also in a Table API program that contains SQL.
>> > GIven
>> > > that you're highlighting the CLI client as unchanged, this adds to more
>> > > confusion.
>> > >
>> > > I am interested if there's anything listed in the SQL 2016 standard on
>> > > these types of lifecycle statements. I did a quick scan for "SHOW
>> > QUERIES"
>> > > but couldn't find it. It would be great if we could stay as close as
>> > > possible to such syntax. Overall I'm not in favour of using QUERIES as a
>> > > keyword. I think Flink applications are not queries, but short- or long
>> > > running applications. Why should we follow Hive's setup and indeed not
>> > > others such as Snowflake, but also Postgres or MySQL?
>> > >
>> > > Best regards,
>> > >
>> > > Martijn Visser
>> > > https://twitter.com/MartijnVisser82 <https://twitter.com/MartijnVisser82>
>> > > https://github.com/MartijnVisser <https://github.com/MartijnVisser>
>> > >
>> > >
>> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> wrote:
>> > >
>> > > > Hi Shengkai,
>> > > >
>> > > > Thanks a lot for your opinions!
>> > > >
>> > > > > 1. I think the keyword QUERY may confuse users because the statement
>> > > also
>> > > > > works for the DML statement.
>> > > >
>> > > > I slightly lean to QUERY, because:
>> > > >
>> > > > Hive calls DMLs queries. We could be better aligned with Hive using
>> > > QUERY,
>> > > > especially given that we plan to introduce Hive endpoint.
>> > > > QUERY is a more SQL-like concept and friendly to SQL users.
>> > > >
>> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB, but
>> > not
>> > > > very good with TASK, as it conflicts with the task concept in Flink
>> > > runtime.
>> > > >
>> > > > We could wait for more feedbacks from the community.
>> > > >
>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>> > > terminate
>> > > > > their jobs.
>> > > >
>> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
>> > might
>> > > > an alternative.
>> > > >
>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
>> > > >
>> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax on
>> > the
>> > > > FLIP.
>> > > >
>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to get
>> > more
>> > > > > detailed job infos.
>> > > >
>> > > > That is a more SQL-like approach I think. But considering the
>> > > > ClusterClient APIs, we can fetch the names and the status along in one
>> > > > request,
>> > > > thus it may be more user friendly to return them all in the SHOW
>> > > > statement?
>> > > >
>> > > > > It's better we can also get the infos about the cluster where the job
>> > > is
>> > > > > running on through the DESCRIBE statement.
>> > > >
>> > > > I think cluster info could be part of session properties instead. WDYT?
>> > > >
>> > > > Best,
>> > > > Paul Lam
>> > > >
>> > > > > 2022年4月22日 11:14,Shengkai Fang <fskmine@gmail.com <ma...@gmail.com>> 写道:
>> > > > >
>> > > > > Hi Paul
>> > > > >
>> > > > > Sorry for the late response. I propose my thoughts here.
>> > > > >
>> > > > > 1. I think the keyword QUERY may confuse users because the statement
>> > > also
>> > > > > works for the DML statement. I find the Snowflakes[1] supports
>> > > > >
>> > > > > - CREATE TASK
>> > > > > - DROP TASK
>> > > > > - ALTER TASK
>> > > > > - SHOW TASKS
>> > > > > - DESCRIPE TASK
>> > > > >
>> > > > > I think we can follow snowflake to use `TASK` as the keyword or use
>> > the
>> > > > > keyword `JOB`?
>> > > > >
>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>> > > terminate
>> > > > > their jobs.
>> > > > >
>> > > > > ```
>> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcely stop the job with
>> > > drain
>> > > > >
>> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
>> > > > > ```
>> > > > >
>> > > > > Oracle[2] uses the PURGE to clean up the table and users can't not
>> > > > recover.
>> > > > > I think it also works for us to terminate the job permanently.
>> > > > >
>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like. Users
>> > can
>> > > > use
>> > > > > the
>> > > > >
>> > > > > ```
>> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
>> > > > >  SET 'state.savepoints.fomat' = 'native';
>> > > > >  CREATE SAVEPOINT <job id>;
>> > > > >
>> > > > >  DROP SAVEPOINT <path_to_savepoint>;
>> > > > > ```
>> > > > >
>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to get
>> > more
>> > > > > detailed job infos.
>> > > > >
>> > > > > ```
>> > > > >
>> > > > > SHOW TASKS;
>> > > > >
>> > > > >
>> > > > > +----------------------------------+
>> > > > > |            job_id                |
>> > > > > +----------------------------------+
>> > > > > | 0f6413c33757fbe0277897dd94485f04 |
>> > > > > +----------------------------------+
>> > > > >
>> > > > > DESCRIPE TASK <job id>;
>> > > > >
>> > > > > +------------------------
>> > > > > |  job name   | status  |
>> > > > > +------------------------
>> > > > > | insert-sink | running |
>> > > > > +------------------------
>> > > > >
>> > > > > ```
>> > > > > It's better we can also get the infos about the cluster where the job
>> > > is
>> > > > > running on through the DESCRIBE statement.
>> > > > >
>> > > > >
>> > > > > [1]
>> > > > >
>> > > >
>> > >
>> > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management <https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management>
>> > > > <
>> > > >
>> > >
>> > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management <https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management>
>> > > > >
>> > > > > [2]
>> > > > >
>> > > >
>> > >
>> > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806 <https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806>
>> > > > <
>> > > >
>> > >
>> > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806 <https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806>
>> > > > >
>> > > > >
>> > > > > Paul Lam <paullin3280@gmail.com <ma...@gmail.com> <mailto:paullin3280@gmail.com <ma...@gmail.com>>>
>> > > > 于2022年4月21日周四 10:36写道:
>> > > > >
>> > > > >> ping @Timo @Jark @Shengkai
>> > > > >>
>> > > > >> Best,
>> > > > >> Paul Lam
>> > > > >>
>> > > > >>> 2022年4月18日 17:12,Paul Lam <paullin3280@gmail.com <ma...@gmail.com>> 写道:
>> > > > >>>
>> > > > >>> Hi team,
>> > > > >>>
>> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds query
>> > > > >> lifecycle
>> > > > >>> statements to SQL client.
>> > > > >>>
>> > > > >>> Currently, SQL client supports submitting queries (queries in a
>> > broad
>> > > > >> sense,
>> > > > >>> including DQLs and DMLs) but no further lifecycle statements, like
>> > > > >> canceling
>> > > > >>> a query or triggering a savepoint. That makes SQL users have to
>> > rely
>> > > on
>> > > > >>> CLI or REST API to manage theirs queries.
>> > > > >>>
>> > > > >>> Thus, I propose to introduce the following statements to fill the
>> > > gap.
>> > > > >>> SHOW QUERIES
>> > > > >>> STOP QUERY <query_id>
>> > > > >>> CANCEL QUERY <query_id>
>> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
>> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
>> > > > >>> These statement would align SQL client with CLI, providing the full
>> > > > >> lifecycle
>> > > > >>> management for queries/jobs.
>> > > > >>>
>> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
>> > > > >>> (For reference, the previous discussion thread see [2].)
>> > > > >>>
>> > > > >>> [1]
>> > > > >>
>> > > >
>> > >
>> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
>> > > > >>>
>> > > > >>> [2]
>> > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
>> > > > >>>
>> > > > >>> Best,
>> > > > >>> Paul Lam
>> > > >
>> > > >
>> > >
>> >
> 


Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Jark Wu <im...@gmail.com>.
Hi Paul,

I think this FLIP is already in good shape. I just left some additional
thoughts:

*1) the display of savepoint_path*
Could the displayed savepoint_path include the scheme part?
E.g. `hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab`
IIUC, the scheme part is omitted when it's a local filesystem.
But the behavior would be clearer if the scheme part were included in the
design doc.

*2) Please make a decision among the multiple options in the FLIP.*
Otherwise it might give the impression that we will support all the options.

*3) +1 SAVEPOINT and RELEASE SAVEPOINT*
Personally, I also prefer "SAVEPOINT <query_id>" and "RELEASE SAVEPOINT
<savepoint_path>"
to "CREATE/DROP SAVEPOINT", as they have been used in mature databases.

*4) +1 SHOW QUERIES*
Btw, the displayed column "address" is a little confusing to me.
At first glance, I'm not sure what address it is: JM RPC address? JM
REST address? Gateway address?
If this is a link to the job's web UI URL, how about calling it "web_url"
and display in
"http://<hostname>:<port>" format?
Besides, how about displaying "startTime" or "uptime" as well?
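To illustrate this suggestion, a SHOW QUERIES result might look like the
sketch below (the column names, the job id, and the values are illustrative
only, not the final design):

```
SHOW QUERIES;

+----------------------------------+-------------+---------+--------------------+--------+
|             query_id             | query_name  | status  |      web_url       | uptime |
+----------------------------------+-------------+---------+--------------------+--------+
| 0f6413c33757fbe0277897dd94485f04 | insert-sink | RUNNING | http://host-a:8081 | 2h 31m |
+----------------------------------+-------------+---------+--------------------+--------+
```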

*5) STOP/CANCEL QUERY vs DROP QUERY*
I'm +1 to DROP, because it's more compliant with SQL standard naming, i.e.,
"SHOW/CREATE/DROP".
Separating STOP and CANCEL confuses users about the differences
between them.
I'm +1 to add the "PURGE" keyword to the DROP QUERY statement, which
indicates stopping the query without a savepoint.
Note that PURGE doesn't mean stopping with the --drain flag. The drain flag
will flush all the registered timers and windows, which could lead to
incorrect results when the job is resumed. I think the drain flag is rarely
used (please correct me if I'm wrong), so I suggest moving this feature
into future work when the needs are clear.
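To make the two termination modes concrete, a rough sketch of the proposed
usage (the job id is made up, and the syntax is still under discussion):

```
-- stop the query gracefully, taking a savepoint first
DROP QUERY '0f6413c33757fbe0277897dd94485f04';

-- terminate the query immediately, without a savepoint
DROP QUERY '0f6413c33757fbe0277897dd94485f04' PURGE;
```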

*6) Table API*
I think it makes sense to support the new statements in Table API.
We should try to keep the Gateway and CLI simple, just forwarding
statements to the underlying TableEnvironment.
JAR statements are being re-implemented in Table API as well, see
FLIP-214[1].

*7) <query_id> and <savepoint_path> should be quoted*
All the <query_id> and <savepoint_path> should be string literals; otherwise
they are hard to parse.
For example, STOP QUERY '<query_id>'.

*8) Examples*
Could you add an example that covers all the statements to show how to
manage the full lifecycle of queries?
Including show queries, create savepoint, remove savepoint, stop query with
a savepoint, and restart query with savepoint.
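As a starting point, such an end-to-end example might look like the sketch
below. The statement names follow the options discussed in this thread and
are not final, and the job id and savepoint path are made up:

```
-- list the running queries and pick one
SHOW QUERIES;

-- take a savepoint of the running query
CREATE SAVEPOINT '0f6413c33757fbe0277897dd94485f04';

-- stop the query with a final savepoint
DROP QUERY '0f6413c33757fbe0277897dd94485f04';

-- restart the query from the savepoint
SET 'execution.savepoint.path' = 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
INSERT INTO sink_table SELECT * FROM source_table;

-- remove the savepoint once it is no longer needed
DROP SAVEPOINT 'hdfs:///flink-savepoints/savepoint-cca7bc-bb1e257f0dab';
```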

Best,
Jark

[1]:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-214+Support+Advanced+Function+DDL?src=contextnavpagetreemode


On Fri, 6 May 2022 at 19:13, Martijn Visser <ma...@apache.org>
wrote:

> Hi Paul,
>
> Great that you could find something in the SQL standard! I'll try to read
> the FLIP once more completely next week to see if I have any more concerns.
>
> Best regards,
>
> Martijn
>
> On Fri, 6 May 2022 at 08:21, Paul Lam <pa...@gmail.com> wrote:
>
>> I had a look at SQL-2016 that Martijn mentioned, and found that
>> maybe we could follow the transaction savepoint syntax.
>>
>>    - SAVEPOINT <savepoint specifier>
>>    - RELEASE SAVEPOINT <savepoint specifier>
>>
>> These savepoint statements are supported in lots of databases, like
>> Oracle[1], PG[2], MariaDB[3].
>>
>> They’re usually used in the middle of a SQL transaction, so the target
>> would be the current transaction. But if used in Flink SQL session, we
>> need to add a JOB/QUERY id when creating a savepoint, thus the syntax
>> would be:
>>
>>    - SAVEPOINT <job/query id> <savepoint path>
>>    - RELEASE SAVEPOINT <savepoint path>
>>
>> I’m adding it as an alternative in the FLIP.
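>> A rough usage sketch of this alternative (the job id and savepoint path
>> are made up for illustration):
>>
>> ```
>> SAVEPOINT '0f6413c33757fbe0277897dd94485f04' 'hdfs:///flink-savepoints/sp-1';
>> RELEASE SAVEPOINT 'hdfs:///flink-savepoints/sp-1';
>> ```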
>>
>> [1]
>> https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
>> [2] https://www.postgresql.org/docs/current/sql-savepoint.html
>> [3] https://mariadb.com/kb/en/savepoint/
>>
>> Best,
>> Paul Lam
>>
>> 2022年5月4日 16:42,Paul Lam <pa...@gmail.com> 写道:
>>
>> Hi Shengkai,
>>
>> Thanks a lot for your input!
>>
>> > I just wonder how the users can get the web ui in the application mode.
>> Therefore, it's better we can list the Web UI using the SHOW statement.
>> WDYT?
>>
>> I think it's a valid approach. I'm adding it to the FLIP.
>>
>> > After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>>
>> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
>> while the former shows the active running queries and the latter shows
>> the
>> background tasks like schema changes. FYI.
>>
>> WRT the questions:
>>
>> > 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>>
>> IMHO, the difference between different `execution.target` is mostly about
>> cluster startup, which has little relation with the proposed statements.
>> These statements rely on the current ClusterClient/JobClient API,
>> which is deployment mode agnostic. Canceling a job in an application
>> cluster is the same as in a session cluster.
>>
>> BTW, application mode is still under development ATM [3].
>>
>> > 2. Considering the SQL Client/Gateway is not limited to submitting the
>> job
>> to the specified cluster, is it able to list jobs in the other clusters?
>>
>> I think multi-cluster support in SQL Client/Gateway should be aligned with
>> CLI, at least at the early phase. We may use SET to set a cluster id for a
>> session, then we have access to the cluster. However, every SHOW
>> statement would only involve one cluster.
>>
>> Best,
>> Paul Lam
>>
>> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html
>> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
>> [3] https://issues.apache.org/jira/browse/FLINK-26541
>>
>> Shengkai Fang <fs...@gmail.com> 于2022年4月29日周五 15:36写道:
>>
>>> Hi.
>>>
>>> Thanks for Paul's update.
>>>
>>> > It's better we can also get the infos about the cluster where the job
>>> is
>>> > running through the DESCRIBE statement.
>>>
>>> I just wonder how the users can get the web ui in the application mode.
>>> Therefore, it's better we can list the Web UI using the SHOW statement.
>>> WDYT?
>>>
>>>
>>> > QUERY or other keywords.
>>>
>>> I list the statement to manage the lifecycle of the query/dml in other
>>> systems:
>>>
>>> MySQL[1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
>>> to kill the query.
>>>
>>> ```
>>> mysql> SHOW PROCESSLIST;
>>>
>>> mysql> KILL 27;
>>> ```
>>>
>>>
>>> Postgres uses the following statements to kill queries.
>>>
>>> ```
>>> SELECT pg_cancel_backend(<pid of the process>)
>>>
>>> SELECT pg_terminate_backend(<pid of the process>)
>>> ```
>>>
>>> KSQL uses the following commands to control the query lifecycle[4].
>>>
>>> ```
>>> SHOW QUERIES;
>>>
>>> TERMINATE <query id>;
>>>
>>> ```
>>>
>>> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
>>> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
>>> [3]
>>>
>>> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
>>> [4]
>>>
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>>> [5]
>>>
>>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
>>>
>>> After the investigation, I am fine with the QUERY but the keyword JOB is
>>> also okay to me.
>>>
>>> We also have two questions here.
>>>
>>> 1. Could you add some details about the behaviour with the different
>>> execution.target, e.g. session, application mode?
>>>
>>> 2. Considering the SQL Client/Gateway is not limited to submitting the
>>> job
>>> to the specified cluster, is it able to list jobs in the other clusters?
>>>
>>>
>>> Best,
>>> Shengkai
>>>
>>> Paul Lam <pa...@gmail.com> 于2022年4月28日周四 17:17写道:
>>>
>>> > Hi Martijn,
>>> >
>>> > Thanks a lot for your reply! I agree that the scope may be a bit
>>> confusing,
>>> > please let me clarify.
>>> >
>>> > The FLIP aims to add new SQL statements that are supported only in
>>> > sql-client, similar to
>>> > jar statements [1]. Jar statements can be parsed into jar operations,
>>> which
>>> > are used only in
>>> > CliClient in sql-client module and cannot be executed by
>>> TableEnvironment
>>> > (not available in
>>> > Table API program that contains SQL that you mentioned).
>>> >
>>> > WRT the unchanged CLI client, I mean CliClient instead of the
>>> sql-client
>>> > module, which
>>> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
>>> > extends
>>> > the gateway part, and barely touches CliClient and REST server (REST
>>> > endpoint in FLIP-91).
>>> >
>>> > WRT the syntax, I don't have much experience with SQL standards, and
>>> I'd
>>> > like to hear
>>> > more opinions from the community. I prefer Hive-style syntax because I
>>> > think many users
>>> > are familiar with Hive, and there're on-going efforts to improve
>>> Flink-Hive
>>> > integration [2][3].
>>> > But my preference is not strong, I'm okay with other options too. Do
>>> you
>>> > think JOB/Task is
>>> > a good choice, or do you have other preferred keywords?
>>> >
>>> > [1]
>>> >
>>> >
>>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/
>>> > [2]
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
>>> > [3]
>>> >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint
>>> >
>>> > Best,
>>> > Paul Lam
>>> >
>>> > Martijn Visser <ma...@apache.org> 于2022年4月26日周二 20:14写道:
>>> >
>>> > > Hi Paul,
>>> > >
>>> > > Thanks for creating the FLIP and opening the discussion. I did get a
>>> bit
>>> > > confused about the title, being "query lifecycle statements in SQL
>>> > client".
>>> > > This sounds like you want to adopt the SQL client, but you want to
>>> expand
>>> > > the SQL syntax with lifecycle statements, which could be used from
>>> the
>>> > SQL
>>> > > client, but of course also in a Table API program that contains SQL.
>>> > Given
>>> > > that you're highlighting the CLI client as unchanged, this adds to
>>> more
>>> > > confusion.
>>> > >
>>> > > I am interested if there's anything listed in the SQL 2016 standard
>>> on
>>> > > these types of lifecycle statements. I did a quick scan for "SHOW
>>> > QUERIES"
>>> > > but couldn't find it. It would be great if we could stay as close as
>>> > > possible to such syntax. Overall I'm not in favour of using QUERIES
>>> as a
>>> > > keyword. I think Flink applications are not queries, but short- or
>>> long
>>> > > running applications. Why should we follow Hive's setup and indeed
>>> not
>>> > > others such as Snowflake, but also Postgres or MySQL?
>>> > >
>>> > > Best regards,
>>> > >
>>> > > Martijn Visser
>>> > > https://twitter.com/MartijnVisser82
>>> > > https://github.com/MartijnVisser
>>> > >
>>> > >
>>> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <pa...@gmail.com>
>>> wrote:
>>> > >
>>> > > > Hi Shengkai,
>>> > > >
>>> > > > Thanks a lot for your opinions!
>>> > > >
>>> > > > > 1. I think the keyword QUERY may confuse users because the
>>> statement
>>> > > also
>>> > > > > works for the DML statement.
>>> > > >
>>> > > > I slightly lean to QUERY, because:
>>> > > >
>>> > > > Hive calls DMLs queries. We could be better aligned with Hive using
>>> > > QUERY,
>>> > > > especially given that we plan to introduce Hive endpoint.
>>> > > > QUERY is a more SQL-like concept and friendly to SQL users.
>>> > > >
>>> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB,
>>> but
>>> > not
>>> > > > very good with TASK, as it conflicts with the task concept in Flink
>>> > > runtime.
>>> > > >
>>> > > > We could wait for more feedback from the community.
>>> > > >
>>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>>> > > terminate
>>> > > > > their jobs.
>>> > > >
>>> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
>>> > might
>>> > > > be an alternative.
>>> > > >
>>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
>>> > > >
>>> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax
>>> on
>>> > the
>>> > > > FLIP.
>>> > > >
>>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to
>>> get
>>> > more
>>> > > > > detailed job infos.
>>> > > >
>>> > > > That is a more SQL-like approach I think. But considering the
>>> > > > ClusterClient APIs, we can fetch the names and the status along in
>>> one
>>> > > > request,
>>> > > > thus it may be more user friendly to return them all in the SHOW
>>> > > > statement?
>>> > > >
>>> > > > > It's better we can also get the infos about the cluster where
>>> the job
>>> > > is
>>> > > > > running on through the DESCRIBE statement.
>>> > > >
>>> > > > I think cluster info could be part of session properties instead.
>>> WDYT?
>>> > > >
>>> > > > Best,
>>> > > > Paul Lam
>>> > > >
>>> > > > > 2022年4月22日 11:14,Shengkai Fang <fs...@gmail.com> 写道:
>>> > > > >
>>> > > > > Hi Paul
>>> > > > >
>>> > > > > Sorry for the late response. I propose my thoughts here.
>>> > > > >
>>> > > > > 1. I think the keyword QUERY may confuse users because the
>>> statement
>>> > > also
>>> > > > > works for the DML statement. I find that Snowflake[1] supports
>>> > > > >
>>> > > > > - CREATE TASK
>>> > > > > - DROP TASK
>>> > > > > - ALTER TASK
>>> > > > > - SHOW TASKS
>>> > > > > - DESCRIBE TASK
>>> > > > >
>>> > > > > I think we can follow Snowflake to use `TASK` as the keyword or
>>> use
>>> > the
>>> > > > > keyword `JOB`?
>>> > > > >
>>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>>> > > terminate
>>> > > > > their jobs.
>>> > > > >
>>> > > > > ```
>>> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcely stop the job
>>> with
>>> > > drain
>>> > > > >
>>> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
>>> > > > > ```
>>> > > > >
>>> > > > > Oracle[2] uses the PURGE to clean up the table and users can't
>>> not
>>> > > > recover.
>>> > > > > I think it also works for us to terminate the job permanently.
>>> > > > >
>>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
>>> Users
>>> > can
>>> > > > use
>>> > > > > the
>>> > > > >
>>> > > > > ```
>>> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
>>> > > > >  SET 'state.savepoints.format' = 'native';
>>> > > > >  CREATE SAVEPOINT <job id>;
>>> > > > >
>>> > > > >  DROP SAVEPOINT <path_to_savepoint>;
>>> > > > > ```
>>> > > > >
>>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to
>>> get
>>> > more
>>> > > > > detailed job infos.
>>> > > > >
>>> > > > > ```
>>> > > > >
>>> > > > > SHOW TASKS;
>>> > > > >
>>> > > > >
>>> > > > > +----------------------------------+
>>> > > > > |            job_id                |
>>> > > > > +----------------------------------+
>>> > > > > | 0f6413c33757fbe0277897dd94485f04 |
>>> > > > > +----------------------------------+
>>> > > > >
>>> > > > > DESCRIBE TASK <job id>;
>>> > > > >
>>> > > > > +------------------------
>>> > > > > |  job name   | status  |
>>> > > > > +------------------------
>>> > > > > | insert-sink | running |
>>> > > > > +------------------------
>>> > > > >
>>> > > > > ```
>>> > > > > It's better we can also get the infos about the cluster where
>>> the job
>>> > > is
>>> > > > > running on through the DESCRIBE statement.
>>> > > > >
>>> > > > >
>>> > > > > [1]
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
>>> > > > > [2]
>>> > > > >
>>> > > >
>>> > >
>>> >
>>> https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
>>> > > > >
>>> > > > > Paul Lam <paullin3280@gmail.com>
>>> > > > 于2022年4月21日周四 10:36写道:
>>> > > > >
>>> > > > >> ping @Timo @Jark @Shengkai
>>> > > > >>
>>> > > > >> Best,
>>> > > > >> Paul Lam
>>> > > > >>
>>> > > > >>> 2022年4月18日 17:12,Paul Lam <pa...@gmail.com> 写道:
>>> > > > >>>
>>> > > > >>> Hi team,
>>> > > > >>>
>>> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds
>>> query
>>> > > > >> lifecycle
>>> > > > >>> statements to SQL client.
>>> > > > >>>
>>> > > > >>> Currently, SQL client supports submitting queries (queries in a
>>> > broad
>>> > > > >> sense,
>>> > > > >>> including DQLs and DMLs) but no further lifecycle statements,
>>> like
>>> > > > >> canceling
>>> > > > >>> a query or triggering a savepoint. That makes SQL users have to
>>> > rely
>>> > > on
>>> > > > >>> CLI or REST API to manage their queries.
>>> > > > >>>
>>> > > > >>> Thus, I propose to introduce the following statements to fill
>>> the
>>> > > gap.
>>> > > > >>> SHOW QUERIES
>>> > > > >>> STOP QUERY <query_id>
>>> > > > >>> CANCEL QUERY <query_id>
>>> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
>>> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
>>> > > > >>> These statements would align SQL client with CLI, providing the
>>> full
>>> > > > >> lifecycle
>>> > > > >>> management for queries/jobs.
>>> > > > >>>
>>> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
>>> > > > >>> (For reference, the previous discussion thread see [2].)
>>> > > > >>>
>>> > > > >>> [1]
>>> > > > >>
>>> > > >
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
>>> > > > >>>
>>> > > > >>> [2]
>>> > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
>>> > > > >>>
>>> > > > >>> Best,
>>> > > > >>> Paul Lam
>>> > > >
>>> > > >
>>> > >
>>> >
>>>
>>
>>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Martijn Visser <ma...@apache.org>.
Hi Paul,

Great that you could find something in the SQL standard! I'll try to read
the FLIP once more completely next week to see if I have any more concerns.

Best regards,

Martijn

On Fri, 6 May 2022 at 08:21, Paul Lam <pa...@gmail.com> wrote:

> I had a look at SQL-2016 that Martijn mentioned, and found that
> maybe we could follow the transaction savepoint syntax.
>
>    - SAVEPOINT <savepoint specifier>
>    - RELEASE SAVEPOINT <savepoint specifier>
>
> These savepoint statements are supported in lots of databases, like
> Oracle[1], PG[2], MariaDB[3].
>
> They’re usually used in the middle of a SQL transaction, so the target
> would be the current transaction. But if used in Flink SQL session, we
> need to add a JOB/QUERY id when creating a savepoint, thus the syntax
> would be:
>
>    - SAVEPOINT <job/query id> <savepoint path>
>    - RELEASE SAVEPOINT <savepoint path>
>
> I’m adding it as an alternative in the FLIP.
>
> [1]
> https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
> [2] https://www.postgresql.org/docs/current/sql-savepoint.html
> [3] https://mariadb.com/kb/en/savepoint/
>
> Best,
> Paul Lam
>
> 2022年5月4日 16:42,Paul Lam <pa...@gmail.com> 写道:
>
> Hi Shengkai,
>
> Thanks a lot for your input!
>
> > I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?
>
> I think it's a valid approach. I'm adding it to the FLIP.
>
> > After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.
>
> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
> while the former shows the active running queries and the latter shows the
> background tasks like schema changes. FYI.
>
> WRT the questions:
>
> > 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?
>
> IMHO, the difference between different `execution.target` is mostly about
> cluster startup, which has little relation with the proposed statements.
> These statements rely on the current ClusterClient/JobClient API,
> which is deployment mode agnostic. Canceling a job in an application
> cluster is the same as in a session cluster.
>
> BTW, application mode is still under development ATM [3].
>
> > 2. Considering the SQL Client/Gateway is not limited to submitting the
> job
> to the specified cluster, is it able to list jobs in the other clusters?
>
> I think multi-cluster support in SQL Client/Gateway should be aligned with
> CLI, at least at the early phase. We may use SET to set a cluster id for a
> session, then we have access to the cluster. However, every SHOW
> statement would only involve one cluster.
>
> Best,
> Paul Lam
>
> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html
> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
> [3] https://issues.apache.org/jira/browse/FLINK-26541
>
> Shengkai Fang <fs...@gmail.com> 于2022年4月29日周五 15:36写道:
>
>> Hi.
>>
>> Thanks for Paul's update.
>>
>> > It's better we can also get the infos about the cluster where the job is
>> > running through the DESCRIBE statement.
>>
>> I just wonder how the users can get the web ui in the application mode.
>> Therefore, it's better we can list the Web UI using the SHOW statement.
>> WDYT?
>>
>>
>> > QUERY or other keywords.
>>
>> I list the statement to manage the lifecycle of the query/dml in other
>> systems:
>>
>> MySQL[1] allows users to SHOW [FULL] PROCESSLIST and use the KILL command
>> to kill the query.
>>
>> ```
>> mysql> SHOW PROCESSLIST;
>>
>> mysql> KILL 27;
>> ```
>>
>>
>> Postgres uses the following statements to kill queries.
>>
>> ```
>> SELECT pg_cancel_backend(<pid of the process>)
>>
>> SELECT pg_terminate_backend(<pid of the process>)
>> ```
>>
>> KSQL uses the following commands to control the query lifecycle[4].
>>
>> ```
>> SHOW QUERIES;
>>
>> TERMINATE <query id>;
>>
>> ```
>>
>> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
>> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
>> [3]
>>
>> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
>> [4]
>>
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
>> [5]
>>
>> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
>>
>> After the investigation, I am fine with the QUERY but the keyword JOB is
>> also okay to me.
>>
>> We also have two questions here.
>>
>> 1. Could you add some details about the behaviour with the different
>> execution.target, e.g. session, application mode?
>>
>> 2. Considering the SQL Client/Gateway is not limited to submitting the job
>> to the specified cluster, is it able to list jobs in the other clusters?
>>
>>
>> Best,
>> Shengkai
>>
>> Paul Lam <pa...@gmail.com> 于2022年4月28日周四 17:17写道:
>>
>> > Hi Martijn,
>> >
>> > Thanks a lot for your reply! I agree that the scope may be a bit
>> confusing,
>> > please let me clarify.
>> >
>> > The FLIP aims to add new SQL statements that are supported only in
>> > sql-client, similar to
>> > jar statements [1]. Jar statements can be parsed into jar operations,
>> which
>> > are used only in
>> > CliClient in sql-client module and cannot be executed by
>> TableEnvironment
>> > (not available in
>> > Table API program that contains SQL that you mentioned).
>> >
>> > WRT the unchanged CLI client, I mean CliClient instead of the sql-client
>> > module, which
>> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
>> > extends
>> > the gateway part, and barely touches CliClient and REST server (REST
>> > endpoint in FLIP-91).
>> >
>> > WRT the syntax, I don't have much experience with SQL standards, and I'd
>> > like to hear
>> > more opinions from the community. I prefer Hive-style syntax because I
>> > think many users
>> > are familiar with Hive, and there're on-going efforts to improve
>> Flink-Hive
>> > integration [2][3].
>> > But my preference is not strong, I'm okay with other options too. Do you
>> > think JOB/Task is
>> > a good choice, or do you have other preferred keywords?
>> >
>> > [1]
>> >
>> >
>> https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/
>> > [2]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
>> > [3]
>> >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint
>> >
>> > Best,
>> > Paul Lam
>> >
>> > Martijn Visser <ma...@apache.org> 于2022年4月26日周二 20:14写道:
>> >
>> > > Hi Paul,
>> > >
>> > > Thanks for creating the FLIP and opening the discussion. I did get a
>> bit
>> > > confused about the title, being "query lifecycle statements in SQL
>> > client".
>> > > This sounds like you want to adopt the SQL client, but you want to
>> expand
>> > > the SQL syntax with lifecycle statements, which could be used from the
>> > SQL
>> > > client, but of course also in a Table API program that contains SQL.
>> > Given
>> > > that you're highlighting the CLI client as unchanged, this adds to
>> more
>> > > confusion.
>> > >
>> > > I am interested if there's anything listed in the SQL 2016 standard on
>> > > these types of lifecycle statements. I did a quick scan for "SHOW
>> > QUERIES"
>> > > but couldn't find it. It would be great if we could stay as close as
>> > > possible to such syntax. Overall I'm not in favour of using QUERIES
>> as a
>> > > keyword. I think Flink applications are not queries, but short- or
>> long
>> > > running applications. Why should we follow Hive's setup and indeed not
>> > > others such as Snowflake, but also Postgres or MySQL?
>> > >
>> > > Best regards,
>> > >
>> > > Martijn Visser
>> > > https://twitter.com/MartijnVisser82
>> > > https://github.com/MartijnVisser
>> > >
>> > >
>> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <pa...@gmail.com> wrote:
>> > >
>> > > > Hi Shengkai,
>> > > >
>> > > > Thanks a lot for your opinions!
>> > > >
>> > > > > 1. I think the keyword QUERY may confuse users because the
>> statement
>> > > also
>> > > > > works for the DML statement.
>> > > >
>> > > > I slightly lean to QUERY, because:
>> > > >
>> > > > Hive calls DMLs queries. We could be better aligned with Hive using
>> > > QUERY,
>> > > > especially given that we plan to introduce Hive endpoint.
>> > > > QUERY is a more SQL-like concept and friendly to SQL users.
>> > > >
>> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB,
>> but
>> > not
>> > > > very good with TASK, as it conflicts with the task concept in Flink
>> > > runtime.
>> > > >
>> > > > We could wait for more feedback from the community.
>> > > >
>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>> > > terminate
>> > > > > their jobs.
>> > > >
>> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
>> > might
>> > > > be an alternative.
>> > > >
>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
>> > > >
>> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax on
>> > the
>> > > > FLIP.
>> > > >
>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIBE to get
>> > more
>> > > > > detailed job infos.
>> > > >
>> > > > That is a more SQL-like approach I think. But considering the
>> > > > ClusterClient APIs, we can fetch the names and the status along in
>> one
>> > > > request,
>> > > > thus it may be more user friendly to return them all in the SHOW
>> > > > statement?
>> > > >
>> > > > > It's better we can also get the infos about the cluster where the
>> job
>> > > is
>> > > > > running on through the DESCRIBE statement.
>> > > >
>> > > > I think cluster info could be part of session properties instead.
>> WDYT?
>> > > >
>> > > > Best,
>> > > > Paul Lam
>> > > >
>> > > > > 2022年4月22日 11:14,Shengkai Fang <fs...@gmail.com> 写道:
>> > > > >
>> > > > > Hi Paul
>> > > > >
>> > > > > Sorry for the late response. I propose my thoughts here.
>> > > > >
>> > > > > 1. I think the keyword QUERY may confuse users because the
>> statement
>> > > also
>> > > > > works for the DML statement. I find that Snowflake[1] supports
>> > > > >
>> > > > > - CREATE TASK
>> > > > > - DROP TASK
>> > > > > - ALTER TASK
>> > > > > - SHOW TASKS
>> > > > > - DESCRIBE TASK
>> > > > >
>> > > > > I think we can follow Snowflake to use `TASK` as the keyword or
>> use
>> > the
>> > > > > keyword `JOB`?
>> > > > >
>> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
>> > > terminate
>> > > > > their jobs.
>> > > > >
>> > > > > ```
>> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcibly stop the job with
>> > > drain
>> > > > >
>> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
>> > > > > ```
>> > > > >
>> > > > > Oracle[2] uses the PURGE to clean up the table and users cannot
>> > > > recover.
>> > > > > I think it also works for us to terminate the job permanently.
>> > > > >
>> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
>> Users
>> > can
>> > > > use
>> > > > > the
>> > > > >
>> > > > > ```
>> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
>> > > > >  SET 'state.savepoints.format' = 'native';
>> > > > >  CREATE SAVEPOINT <job id>;
>> > > > >
>> > > > >  DROP SAVEPOINT <path_to_savepoint>;
>> > > > > ```
>> > > > >
>> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIBE to get
>> > more
>> > > > > detailed job infos.
>> > > > >
>> > > > > ```
>> > > > >
>> > > > > SHOW TASKS;
>> > > > >
>> > > > >
>> > > > > +----------------------------------+
>> > > > > |            job_id                |
>> > > > > +----------------------------------+
>> > > > > | 0f6413c33757fbe0277897dd94485f04 |
>> > > > > +----------------------------------+
>> > > > >
>> > > > > DESCRIBE TASK <job id>;
>> > > > >
>> > > > > +------------------------
>> > > > > |  job name   | status  |
>> > > > > +------------------------
>> > > > > | insert-sink | running |
>> > > > > +------------------------
>> > > > >
>> > > > > ```
>> > > > > It's better we can also get the infos about the cluster where the
>> job
>> > > is
>> > > > > running on through the DESCRIBE statement.
>> > > > >
>> > > > >
>> > > > > [1]
>> > > > >
>> > > >
>> > >
>> >
>> https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
>> > > > <
>> > > >
>> > >
>> >
>> https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
>> > > > >
>> > > > > [2]
>> > > > >
>> > > >
>> > >
>> >
>> https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
>> > > > <
>> > > >
>> > >
>> >
>> https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
>> > > > >
>> > > > >
>> > > > > Paul Lam <paullin3280@gmail.com <ma...@gmail.com>>
>> > > > 于2022年4月21日周四 10:36写道:
>> > > > >
>> > > > >> ping @Timo @Jark @Shengkai
>> > > > >>
>> > > > >> Best,
>> > > > >> Paul Lam
>> > > > >>
>> > > > >>> 2022年4月18日 17:12,Paul Lam <pa...@gmail.com> 写道:
>> > > > >>>
>> > > > >>> Hi team,
>> > > > >>>
>> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds
>> query
>> > > > >> lifecycle
>> > > > >>> statements to SQL client.
>> > > > >>>
>> > > > >>> Currently, SQL client supports submitting queries (queries in a
>> > broad
>> > > > >> sense,
>> > > > >>> including DQLs and DMLs) but no further lifecycle statements,
>> like
>> > > > >> canceling
>> > > > >>> a query or triggering a savepoint. That makes SQL users have to
>> > rely
>> > > on
>> > > > >>> CLI or REST API to manage theirs queries.
>> > > > >>>
>> > > > >>> Thus, I propose to introduce the following statements to fill
>> the
>> > > gap.
>> > > > >>> SHOW QUERIES
>> > > > >>> STOP QUERY <query_id>
>> > > > >>> CANCEL QUERY <query_id>
>> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
>> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
>> > > > >>> These statement would align SQL client with CLI, providing the
>> full
>> > > > >> lifecycle
>> > > > >>> management for queries/jobs.
>> > > > >>>
>> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
>> > > > >>> (For reference, the previous discussion thread see [2].)
>> > > > >>>
>> > > > >>> [1]
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
>> > > > >> <
>> > > > >>
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client
>> > > > <
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222:+Support+full+query+lifecycle+statements+in+SQL+client
>> > > > >
>> > > > >>>
>> > > > >>> [2]
>> > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
>> > > <
>> > > > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb> <
>> > > > >> https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
>> <
>> > > > https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb>>
>> > > > >>>
>> > > > >>> Best,
>> > > > >>> Paul Lam
>> > > >
>> > > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] FLIP-222: Support full query lifecycle statements in SQL client

Posted by Paul Lam <pa...@gmail.com>.
I had a look at the SQL:2016 standard that Martijn mentioned, and found that
we could perhaps follow the transaction savepoint syntax:
SAVEPOINT <savepoint specifier>
RELEASE SAVEPOINT <savepoint specifier>
These savepoint statements are supported in lots of databases, like 
Oracle[1], PG[2], MariaDB[3].

They’re usually used in the middle of a SQL transaction, so the target 
would be the current transaction. But if used in a Flink SQL session, we 
would need to add a JOB/QUERY id when creating a savepoint, so the syntax 
would be:
SAVEPOINT <job/query id> <savepoint path>
RELEASE SAVEPOINT <savepoint path>
I’m adding it as an alternative in the FLIP.
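For illustration only (not part of the FLIP yet), a session using this
alternative syntax might look like the following; the job id and savepoint
path are made up:

```
Flink SQL> SAVEPOINT '0f6413c33757fbe0277897dd94485f04' 'hdfs:///flink/savepoints';
Flink SQL> RELEASE SAVEPOINT 'hdfs:///flink/savepoints/savepoint-0f6413-1a2b3c';
```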

[1] https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_10001.htm
[2] https://www.postgresql.org/docs/current/sql-savepoint.html
[3] https://mariadb.com/kb/en/savepoint/

Best,
Paul Lam

> On May 4, 2022, at 16:42, Paul Lam <pa...@gmail.com> wrote:
> 
> Hi Shengkai,
> 
> Thanks a lot for your input!
> 
> > I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?
> 
> I think it's a valid approach. I'm adding it to the FLIP.
> 
> > After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.
> 
> In addition, CockroachDB has both SHOW QUERIES [1] and SHOW JOBS [2],
> while the former shows the active running queries and the latter shows the 
> background tasks like schema changes. FYI.
> 
> WRT the questions:
> 
> > 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?
> 
> IMHO, the difference between different `execution.target` is mostly about
> cluster startup, which has little relation with the proposed statements.
> These statements rely on the current ClusterClient/JobClient API, 
> which is deployment mode agnostic. Canceling a job in an application 
> cluster is the same as in a session cluster. 
> 
> BTW, application mode is still in the development progress ATM [3].
> 
> > 2. Considering the SQL Client/Gateway is not limited to submitting the job
> to the specified cluster, is it able to list jobs in the other clusters?
> 
> I think multi-cluster support in SQL Client/Gateway should be aligned with
> CLI, at least at the early phase. We may use SET  to set a cluster id for a 
> session, then we have access to the cluster. However,  every SHOW 
> statement would only involve one cluster.
> 
> Best,
> Paul Lam
> 
> [1] https://www.cockroachlabs.com/docs/stable/show-statements.html
> [2] https://www.cockroachlabs.com/docs/v21.2/show-jobs
> [3] https://issues.apache.org/jira/browse/FLINK-26541
> Shengkai Fang <fskmine@gmail.com> wrote on Fri, Apr 29, 2022 at 15:36:
> Hi.
> 
> Thanks for Paul's update.
> 
> > It's better we can also get the infos about the cluster where the job is
> > running through the DESCRIBE statement.
> 
> I just wonder how the users can get the web ui in the application mode.
> Therefore, it's better we can list the Web UI using the SHOW statement.
> WDYT?
> 
> 
> > QUERY or other keywords.
> 
> I list the statement to manage the lifecycle of the query/dml in other
> systems:
> 
> MySQL[1] allows users to run SHOW [FULL] PROCESSLIST and use the KILL command
> to kill a query.
> 
> ```
> mysql> SHOW PROCESSLIST;
> 
> mysql> KILL 27;
> ```
> 
> 
> Postgres uses the following statements to kill queries:
> 
> ```
> SELECT pg_cancel_backend(<pid of the process>)
> 
> SELECT pg_terminate_backend(<pid of the process>)
> ```
> 
> KSQL uses the following commands to control the query lifecycle[4].
> 
> ```
> SHOW QUERIES;
> 
> TERMINATE <query id>;
> 
> ```
> 
> [1] https://dev.mysql.com/doc/refman/8.0/en/show-processlist.html
> [2] https://scaledynamix.com/blog/how-to-kill-mysql-queries/
> [3]
> https://stackoverflow.com/questions/35319597/how-to-stop-kill-a-query-in-postgresql
> [4]
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/show-queries/
> [5]
> https://docs.ksqldb.io/en/latest/developer-guide/ksqldb-reference/terminate/
> 
> After the investigation, I am fine with the QUERY but the keyword JOB is
> also okay to me.
> 
> We also have two questions here.
> 
> 1. Could you add some details about the behaviour with the different
> execution.target, e.g. session, application mode?
> 
> 2. Considering the SQL Client/Gateway is not limited to submitting the job
> to the specified cluster, is it able to list jobs in the other clusters?
> 
> 
> Best,
> Shengkai
> 
> Paul Lam <paullin3280@gmail.com> wrote on Thu, Apr 28, 2022 at 17:17:
> 
> > Hi Martijn,
> >
> > Thanks a lot for your reply! I agree that the scope may be a bit confusing,
> > please let me clarify.
> >
> > The FLIP aims to add new SQL statements that are supported only in
> > sql-client, similar to
> > jar statements [1]. Jar statements can be parsed into jar operations, which
> > are used only in
> > CliClient in sql-client module and cannot be executed by TableEnvironment
> > (not available in
> > Table API program that contains SQL that you mentioned).
> >
> > WRT the unchanged CLI client, I mean CliClient instead of the sql-client
> > module, which
> > currently contains the gateway codes (e.g. Executor). The FLIP mainly
> > extends
> > the gateway part, and barely touches CliClient and REST server (REST
> > endpoint in FLIP-91).
> >
> > WRT the syntax, I don't have much experience with SQL standards, and I'd
> > like to hear
> > more opinions from the community. I prefer Hive-style syntax because I
> > think many users
> > are familiar with Hive, and there're on-going efforts to improve Flink-Hive
> > integration [2][3].
> > But my preference is not strong, I'm okay with other options too. Do you
> > think JOB/Task is
> > a good choice, or do you have other preferred keywords?
> >
> > [1]
> > https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/jar/
> > [2]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-152%3A+Hive+Query+Syntax+Compatibility
> > [3]
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-223%3A+Support+HiveServer2+Endpoint
> >
> > Best,
> > Paul Lam
> >
> Martijn Visser <martijnvisser@apache.org> wrote on Tue, Apr 26, 2022 at 20:14:
> >
> > > Hi Paul,
> > >
> > > Thanks for creating the FLIP and opening the discussion. I did get a bit
> > > confused about the title, being "query lifecycle statements in SQL
> > client".
> > > This sounds like you want to adopt the SQL client, but you want to expand
> > > the SQL syntax with lifecycle statements, which could be used from the
> > SQL
> > > client, but of course also in a Table API program that contains SQL.
> > GIven
> > > that you're highlighting the CLI client as unchanged, this adds to more
> > > confusion.
> > >
> > > I am interested if there's anything listed in the SQL 2016 standard on
> > > these types of lifecycle statements. I did a quick scan for "SHOW
> > QUERIES"
> > > but couldn't find it. It would be great if we could stay as close as
> > > possible to such syntax. Overall I'm not in favour of using QUERIES as a
> > > keyword. I think Flink applications are not queries, but short- or long
> > > running applications. Why should we follow Hive's setup and indeed not
> > > others such as Snowflake, but also Postgres or MySQL?
> > >
> > > Best regards,
> > >
> > > Martijn Visser
> > > https://twitter.com/MartijnVisser82
> > > https://github.com/MartijnVisser
> > >
> > >
> > > On Fri, 22 Apr 2022 at 12:06, Paul Lam <paullin3280@gmail.com> wrote:
> > >
> > > > Hi Shengkai,
> > > >
> > > > Thanks a lot for your opinions!
> > > >
> > > > > 1. I think the keyword QUERY may confuse users because the statement
> > > also
> > > > > works for the DML statement.
> > > >
> > > > I slightly lean to QUERY, because:
> > > >
> > > > Hive calls DMLs queries. We could be better aligned with Hive using
> > > QUERY,
> > > > especially given that we plan to introduce Hive endpoint.
> > > > QUERY is a more SQL-like concept and friendly to SQL users.
> > > >
> > > > In general, my preference: QUERY > JOB > TASK. I’m okay with JOB, but
> > not
> > > > very good with TASK, as it conflicts with the task concept in Flink
> > > runtime.
> > > >
> > > > We could wait for more feedbacks from the community.
> > > >
> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> > > terminate
> > > > > their jobs.
> > > >
> > > > Agreed. I’m okay with DROP. And if we want to align with Hive, KILL
> > > > might be an alternative.
> > > >
> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like.
> > > >
> > > > Agreed. It’s more SQL-like and intuitive. I’m updating the syntax on
> > the
> > > > FLIP.
> > > >
> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIPE to get
> > more
> > > > > detailed job infos.
> > > >
> > > > That is a more SQL-like approach I think. But considering the
> > > > ClusterClient APIs, we can fetch the names and the status along in one
> > > > request,
> > > > thus it may be more user friendly to return them all in the SHOW
> > > > statement?
> > > >
> > > > > It's better we can also get the infos about the cluster where the job
> > > is
> > > > > running on through the DESCRIBE statement.
> > > >
> > > > I think cluster info could be part of session properties instead. WDYT?
> > > >
> > > > Best,
> > > > Paul Lam
> > > >
> > > > > On Apr 22, 2022, at 11:14, Shengkai Fang <fskmine@gmail.com> wrote:
> > > > >
> > > > > Hi Paul
> > > > >
> > > > > Sorry for the late response. I propose my thoughts here.
> > > > >
> > > > > 1. I think the keyword QUERY may confuse users because the statement
> > > also
> > > > > works for the DML statement. I find Snowflake[1] supports
> > > > >
> > > > > - CREATE TASK
> > > > > - DROP TASK
> > > > > - ALTER TASK
> > > > > - SHOW TASKS
> > > > > - DESCRIBE TASK
> > > > >
> > > > > I think we can follow snowflake to use `TASK` as the keyword or use
> > the
> > > > > keyword `JOB`?
> > > > >
> > > > > 2. STOP/CANCEL is not very straightforward for the SQL users to
> > > terminate
> > > > > their jobs.
> > > > >
> > > > > ```
> > > > > DROP TASK [IF EXISTS] <job id> PURGE; -- Forcibly stop the job with
> > > > > drain
> > > > >
> > > > > DROP TASK [IF EXISTS] <job id>; -- Stop the task with savepoints
> > > > > ```
> > > > >
> > > > > Oracle[2] uses PURGE to clean up the table so that users can't
> > > > > recover it.
> > > > > I think it also works for us to terminate the job permanently.
> > > > >
> > > > > 3. I think CREATE/DROP SAVEPOINTS statement is more SQL-like. Users
> > can
> > > > use
> > > > > the
> > > > >
> > > > > ```
> > > > >  SET 'state.savepoints.dir' = '<path_to_savepoint>';
> > > > >  SET 'state.savepoints.format' = 'native';
> > > > >  CREATE SAVEPOINT <job id>;
> > > > >
> > > > >  DROP SAVEPOINT <path_to_savepoint>;
> > > > > ```
> > > > >
> > > > > 4. SHOW TASKS can just list the job id and use the DESCRIBE to get
> > more
> > > > > detailed job infos.
> > > > >
> > > > > ```
> > > > >
> > > > > SHOW TASKS;
> > > > >
> > > > >
> > > > > +----------------------------------+
> > > > > |            job_id                |
> > > > > +----------------------------------+
> > > > > | 0f6413c33757fbe0277897dd94485f04 |
> > > > > +----------------------------------+
> > > > >
> > > > > DESCRIBE TASK <job id>;
> > > > >
> > > > > +------------------------
> > > > > |  job name   | status  |
> > > > > +------------------------
> > > > > | insert-sink | running |
> > > > > +------------------------
> > > > >
> > > > > ```
> > > > > It's better we can also get the infos about the cluster where the job
> > > is
> > > > > running on through the DESCRIBE statement.
> > > > >
> > > > >
> > > > > [1]
> > > > > https://docs.snowflake.com/en/sql-reference/ddl-pipeline.html#task-management
> > > > > [2]
> > > > > https://docs.oracle.com/cd/E11882_01/server.112/e41084/statements_9003.htm#SQLRF01806
> > > > >
> > > > >
> > > > > Paul Lam <paullin3280@gmail.com> wrote on Thu, Apr 21, 2022 at 10:36:
> > > > >
> > > > >> ping @Timo @Jark @Shengkai
> > > > >>
> > > > >> Best,
> > > > >> Paul Lam
> > > > >>
> > > > >>> On Apr 18, 2022, at 17:12, Paul Lam <paullin3280@gmail.com> wrote:
> > > > >>>
> > > > >>> Hi team,
> > > > >>>
> > > > >>> I’d like to start a discussion about FLIP-222 [1], which adds query
> > > > >> lifecycle
> > > > >>> statements to SQL client.
> > > > >>>
> > > > >>> Currently, SQL client supports submitting queries (queries in a
> > broad
> > > > >> sense,
> > > > >>> including DQLs and DMLs) but no further lifecycle statements, like
> > > > >> canceling
> > > > >>> a query or triggering a savepoint. That makes SQL users have to
> > > > >>> rely on the CLI or REST API to manage their queries.
> > > > >>>
> > > > >>> Thus, I propose to introduce the following statements to fill the
> > > gap.
> > > > >>> SHOW QUERIES
> > > > >>> STOP QUERY <query_id>
> > > > >>> CANCEL QUERY <query_id>
> > > > >>> TRIGGER SAVEPOINT <savepoint_path>
> > > > >>> DISPOSE SAVEPOINT <savepoint_path>
> > > > >>> These statements would align the SQL client with the CLI, providing the full
> > > > >> lifecycle
> > > > >>> management for queries/jobs.
> > > > >>>
> > > > >>> Please see the FLIP page[1] for more details. Thanks a lot!
> > > > >>> (For reference, the previous discussion thread see [2].)
> > > > >>>
> > > > >>> [1]
> > > > >>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-222%3A+Support+full+query+lifecycle+statements+in+SQL+client
> > > > >>>
> > > > >>> [2]
> > > > >>> https://lists.apache.org/thread/wr47ng0m2hdybjkrwjlk9ftwg403odqb
> > > > >>>
> > > > >>> Best,
> > > > >>> Paul Lam
> > > >
> > > >
> > >
> >