You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@ignite.apache.org by Юрий <ju...@gmail.com> on 2018/12/04 09:06:05 UTC

Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

Hey Igniters!

I continue working on IEP-29
<https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring>
in
part related to expose view with currently running queries.

I checked how we track running queries right now and see that it's
complicated and spread by few classes  process and we don't track all
running queries. Also there are internal queries which tracked within users
queries and can't be distinguished of user queries , e.g. map queries for
map-reduce queries or DML operation which required first step as select to
modify data.

My proposal is extract logic for working with running queries information
to separate class, like RunningQueryManager. The class will track running
queries and will be single point to retrieve information about running
queries. Currently to keep information about running queries uses
GridRunningQueryInfo class. As of now it can't provide useful information
to distinguish internal queries and users query, so the class need to
extend to keep information about type of query and id of initial user query
for internal query to be able identify it.

New RunningQueryManager should be used for all place where currently we
have tracking of running queries and added for all places which not covered
yet, it's mostly DDL and DML operations.

After implement the proposed change we can simple expose SQL view for all
running queries on a local node.
Collecting information from all node in a cluster currently out of scope
the change and will be described for discussing later.

Is there any objections for described proposal?

чт, 29 нояб. 2018 г. в 09:03, Юрий <ju...@gmail.com>:

> Hi Alex,
>
> I've just started implement of the view. Thanks for the your efforts!
>
> ср, 28 нояб. 2018 г. в 19:00, Alex Plehanov <pl...@gmail.com>:
>
>> Yuriy,
>>
>> If you have plans to implement running queries view in the nearest future,
>> I already have implemented draft for local node queries some time ago [1].
>> Maybe it will help to implement a view for whole cluster queries.
>>
>> [1]:
>>
>> https://github.com/alex-plekhanov/ignite/commit/6231668646a2b0f848373eb4e9dc38d127603e43
>>
>>
>> ср, 28 нояб. 2018 г. в 17:34, Vladimir Ozerov <vo...@gridgain.com>:
>>
>> > Denis
>> >
>> > I would wait for running queries view first.
>> >
>> > ср, 28 нояб. 2018 г. в 1:57, Denis Magda <dm...@apache.org>:
>> >
>> > > Vladimir,
>> > >
>> > > Please see inline
>> > >
>> > > On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov <vozerov@gridgain.com
>> >
>> > > wrote:
>> > >
>> > > > Denis,
>> > > >
>> > > > I partially agree with you. But there are several problem with
>> syntax
>> > > > proposed by you:
>> > > > 1) This is harder to implement technically - more parsing logic to
>> > > > implement. Ok, this is our internal problem, users do not care
>> about it
>> > > > 2) User will have to consult to docs in any case
>> > > >
>> > >
>> > > Two of these are not a big deal. We just need to invest more time for
>> > > development and during the design phase so that people need to consult
>> > the
>> > > docs rarely.
>> > >
>> > >
>> > > > 3) "nodeId" is not really node ID. For Ignite users node ID is
>> UUID. In
>> > > our
>> > > > case this is node order, and we intentionally avoided any naming
>> here.
>> > > >
>> > >
>> > > Let's use a more loose name such as "node".
>> > >
>> > >
>> > > > Query is just identified by a string, no more than that
>> > > > 4) Proposed syntax is more verbose and open ways for misuse. E.g.
>> what
>> > is
>> > > > "KILL QUERY WHERE queryId=1234"?
>> > > >
>> > > > I am not 100% satisfied with both variants, but the first one looks
>> > > simpler
>> > > > to me. Remember, that user will not guess query ID. Instead, he will
>> > get
>> > > > the list of running queries with some other syntax. What we need to
>> > > > understand for now is how this syntax will look like. I think that
>> we
>> > > > should implement getting list of running queries, and only then
>> start
>> > > > working on cancellation.
>> > > >
>> > >
>> > > That's a good point. Syntax of both running and killing queires
>> commands
>> > > should be tightly coupled. We're going to name a column if running
>> > queries
>> > > IDs somehow anyway and that name might be resued in the WHERE clause
>> of
>> > > KILL.
>> > >
>> > > Should we discuss the syntax in a separate thread?
>> > >
>> > > --
>> > > Denis
>> > >
>> > > >
>> > > > Vladimir.
>> > > >
>> > > >
>> > > > On Mon, Nov 19, 2018 at 7:02 PM Denis Mekhanikov <
>> > dmekhanikov@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Guys,
>> > > > >
>> > > > > Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me.
>> > > > > I'm going to look up in documentation, which parameter goes first
>> in
>> > > this
>> > > > > query every time I use it.
>> > > > > I like the syntax, that Igor suggested more.
>> > > > > Will it be better if we make *nodeId* and *queryId *named
>> properties?
>> > > > >
>> > > > > Something like this:
>> > > > > KILL QUERY WHERE nodeId=25 and queryId=1234
>> > > > >
>> > > > > Denis
>> > > > >
>> > > > > пт, 16 нояб. 2018 г. в 14:12, Юрий <ju...@gmail.com>:
>> > > > >
>> > > > > > I fully agree with last sentences and can start to implement
>> this
>> > > part.
>> > > > > >
>> > > > > > Guys, thanks for your productive participate at discussion.
>> > > > > >
>> > > > > > пт, 16 нояб. 2018 г. в 2:53, Denis Magda <dm...@apache.org>:
>> > > > > >
>> > > > > > > Vladimir,
>> > > > > > >
>> > > > > > > Thanks, make perfect sense to me.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov <
>> > > > vozerov@gridgain.com
>> > > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Denis,
>> > > > > > > >
>> > > > > > > > The idea is that QueryDetailMetrics will be exposed through
>> > > > separate
>> > > > > > > > "historical" SQL view in addition to current API. So we are
>> on
>> > > the
>> > > > > same
>> > > > > > > > page here.
>> > > > > > > >
>> > > > > > > > As far as query ID I do not see any easy way to operate on a
>> > > single
>> > > > > > > integer
>> > > > > > > > value (even 64bit). This is distributed system - we do not
>> want
>> > > to
>> > > > > have
>> > > > > > > > coordination between nodes to get query ID. And
>> coordination is
>> > > the
>> > > > > > only
>> > > > > > > > possible way to get sexy "long". Instead, I would propose to
>> > form
>> > > > ID
>> > > > > > from
>> > > > > > > > node order and query counter within node. This will be (int,
>> > > long)
>> > > > > > pair.
>> > > > > > > > For use convenience we may convert it to a single string,
>> e.g.
>> > > > > > > > "[node_order].[query_counter]". Then the syntax would be:
>> > > > > > > >
>> > > > > > > > KILL QUERY '25.1234'; // Kill query 1234 on node 25
>> > > > > > > > KILL QUERY '25.*;     // Kill all queries on the node 25
>> > > > > > > >
>> > > > > > > > Makes sense?
>> > > > > > > >
>> > > > > > > > Vladimir.
>> > > > > > > >
>> > > > > > > > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda <
>> dmagda@apache.org
>> > >
>> > > > > wrote:
>> > > > > > > >
>> > > > > > > > > Yury,
>> > > > > > > > >
>> > > > > > > > > As I understand you mean that the view should contains
>> both
>> > > > running
>> > > > > > and
>> > > > > > > > > > finished queries. If be honest for the view I was going
>> to
>> > > use
>> > > > > just
>> > > > > > > > > queries
>> > > > > > > > > > running right now. For finished queries I thought about
>> > > another
>> > > > > > view
>> > > > > > > > with
>> > > > > > > > > > another set of fields which should include I/O related
>> > ones.
>> > > Is
>> > > > > it
>> > > > > > > > works?
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Got you, so if only running queries are there then your
>> > initial
>> > > > > > > proposal
>> > > > > > > > > makes total sense. Not sure we need a view of the finished
>> > > > queries.
>> > > > > > It
>> > > > > > > > will
>> > > > > > > > > be possible to analyze them through the updated
>> > DetailedMetrics
>> > > > > > > approach,
>> > > > > > > > > won't it?
>> > > > > > > > >
>> > > > > > > > > For "KILL QUERY node_id query_id"  node_id required as
>> part
>> > of
>> > > > > unique
>> > > > > > > key
>> > > > > > > > > > of query and help understand Ignite which node start the
>> > > > > > distributed
>> > > > > > > > > query.
>> > > > > > > > > > Use both parameters will allow cheap generate unique key
>> > > across
>> > > > > all
>> > > > > > > > > nodes.
>> > > > > > > > > > Node which started a query can cancel it on all nodes
>> > > > participate
>> > > > > > > > nodes.
>> > > > > > > > > > So, to stop any queries initially we need just send the
>> > > cancel
>> > > > > > > request
>> > > > > > > > to
>> > > > > > > > > > node who started the query. This mechanism is already in
>> > > > Ignite.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > Can we locate node_id behind the scenes if the user
>> supplies
>> > > > > query_id
>> > > > > > > > only?
>> > > > > > > > > A query record in the view already contains query_id and
>> > > node_id
>> > > > > and
>> > > > > > it
>> > > > > > > > > sounds like an extra work for the user to fill in all the
>> > > details
>> > > > > for
>> > > > > > > us.
>> > > > > > > > > Embed node_id into query_id if you'd like to avoid extra
>> > > network
>> > > > > hops
>> > > > > > > for
>> > > > > > > > > query_id to node_id mapping.
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > > Denis
>> > > > > > > > >
>> > > > > > > > > On Wed, Nov 14, 2018 at 1:04 AM Юрий <
>> > > > jury.gerzhedowich@gmail.com>
>> > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Denis,
>> > > > > > > > > >
>> > > > > > > > > > Under the hood 'time' will be as startTime, but for
>> system
>> > > > view I
>> > > > > > > > planned
>> > > > > > > > > > use duration which will be simple calculated as now -
>> > > > startTime.
>> > > > > > So,
>> > > > > > > > > there
>> > > > > > > > > > is't a performance issue.
>> > > > > > > > > > As I understand you mean that the view should contains
>> both
>> > > > > running
>> > > > > > > and
>> > > > > > > > > > finished queries. If be honest for the view I was going
>> to
>> > > use
>> > > > > just
>> > > > > > > > > queries
>> > > > > > > > > > running right now. For finished queries I thought about
>> > > another
>> > > > > > view
>> > > > > > > > with
>> > > > > > > > > > another set of fields which should include I/O related
>> > ones.
>> > > Is
>> > > > > it
>> > > > > > > > works?
>> > > > > > > > > >
>> > > > > > > > > > For "KILL QUERY node_id query_id"  node_id required as
>> part
>> > > of
>> > > > > > unique
>> > > > > > > > key
>> > > > > > > > > > of query and help understand Ignite which node start the
>> > > > > > distributed
>> > > > > > > > > query.
>> > > > > > > > > > Use both parameters will allow cheap generate unique key
>> > > across
>> > > > > all
>> > > > > > > > > nodes.
>> > > > > > > > > > Node which started a query can cancel it on all nodes
>> > > > participate
>> > > > > > > > nodes.
>> > > > > > > > > > So, to stop any queries initially we need just send the
>> > > cancel
>> > > > > > > request
>> > > > > > > > to
>> > > > > > > > > > node who started the query. This mechanism is already in
>> > > > Ignite.
>> > > > > > > > > >
>> > > > > > > > > > Native SQL APIs will automatically support the futures
>> > after
>> > > > > > > > implementing
>> > > > > > > > > > for thin clients. So we are good here.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > вт, 13 нояб. 2018 г. в 18:52, Denis Magda <
>> > dmagda@apache.org
>> > > >:
>> > > > > > > > > >
>> > > > > > > > > > > Yury,
>> > > > > > > > > > >
>> > > > > > > > > > > Please consider the following:
>> > > > > > > > > > >
>> > > > > > > > > > >    - If we record the duration instead of startTime,
>> then
>> > > the
>> > > > > > > former
>> > > > > > > > > has
>> > > > > > > > > > to
>> > > > > > > > > > >    be updated frequently - sounds like a performance
>> red
>> > > > flag.
>> > > > > > > Should
>> > > > > > > > > we
>> > > > > > > > > > > store
>> > > > > > > > > > >    startTime and endTime instead? This way a query
>> record
>> > > > will
>> > > > > be
>> > > > > > > > > updated
>> > > > > > > > > > >    twice - when the query is started and terminated.
>> > > > > > > > > > >    - In the IEP you've mentioned I/O related fields
>> that
>> > > > should
>> > > > > > > help
>> > > > > > > > to
>> > > > > > > > > > >    grasp why a query runs that slow. Should they be
>> > stored
>> > > in
>> > > > > > this
>> > > > > > > > > view?
>> > > > > > > > > > >    - "KILL QUERY query_id" is more than enough. Let's
>> not
>> > > add
>> > > > > > > > "node_id"
>> > > > > > > > > > >    unless it's absolutely required. Our queries are
>> > > > distributed
>> > > > > > and
>> > > > > > > > > > > executed
>> > > > > > > > > > >    across several nodes that's why the node_id
>> parameter
>> > is
>> > > > > > > > redundant.
>> > > > > > > > > > >    - This API needs to be supported across all our
>> > > > interfaces.
>> > > > > We
>> > > > > > > can
>> > > > > > > > > > start
>> > > > > > > > > > >    with JDBC/ODBC and thin clients and then support
>> for
>> > the
>> > > > > > native
>> > > > > > > > SQL
>> > > > > > > > > > APIs
>> > > > > > > > > > >    (Java, Net, C++)
>> > > > > > > > > > >    - Please share examples of SELECTs in the IEP that
>> > would
>> > > > > show
>> > > > > > > how
>> > > > > > > > to
>> > > > > > > > > > >    find long running queries, queries that cause a
>> lot of
>> > > I/O
>> > > > > > > > troubles.
>> > > > > > > > > > >
>> > > > > > > > > > > --
>> > > > > > > > > > > Denis
>> > > > > > > > > > >
>> > > > > > > > > > > On Tue, Nov 13, 2018 at 1:15 AM Юрий <
>> > > > > > jury.gerzhedowich@gmail.com>
>> > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Igniters,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Some comments for my original email's.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The proposal related to part of IEP-29
>> > > > > > > > > > > > <
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
>> > > > > > > > > > > > >
>> > > > > > > > > > > > .
>> > > > > > > > > > > >
>> > > > > > > > > > > > What purpose are we pursuing of the proposal?
>> > > > > > > > > > > > We want to be able check which queries running right
>> > now
>> > > > > > through
>> > > > > > > > thin
>> > > > > > > > > > > > clients. Get some information related to the queries
>> > and
>> > > be
>> > > > > > able
>> > > > > > > to
>> > > > > > > > > > > cancel
>> > > > > > > > > > > > a query if it required for some reasons.
>> > > > > > > > > > > > So, we need interface to get a running queries. For
>> the
>> > > > goal
>> > > > > we
>> > > > > > > > > propose
>> > > > > > > > > > > > running_queries system view. The view contains
>> unique
>> > > query
>> > > > > > > > > identifier
>> > > > > > > > > > > > which need to pass to kill query command to cancel
>> the
>> > > > query.
>> > > > > > > > > > > >
>> > > > > > > > > > > > What do you think about fields of the running
>> queries
>> > > view?
>> > > > > May
>> > > > > > > be
>> > > > > > > > > some
>> > > > > > > > > > > > useful fields we could easy add to the view.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Also let's discuss syntax of cancellation of query.
>> I
>> > > > propose
>> > > > > > to
>> > > > > > > > use
>> > > > > > > > > > > MySQL
>> > > > > > > > > > > > like syntax as easy to understand and shorter then
>> > Oracle
>> > > > and
>> > > > > > > > > Postgres
>> > > > > > > > > > > > syntax ( detailed information in IEP-29
>> > > > > > > > > > > > <
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
>> > > > > > > > > > > > >
>> > > > > > > > > > > > ).
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > пн, 12 нояб. 2018 г. в 19:28, Юрий <
>> > > > > > jury.gerzhedowich@gmail.com
>> > > > > > > >:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Igniters,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Below is a proposed design for thin client SQL
>> > > management
>> > > > > and
>> > > > > > > > > > > monitoring
>> > > > > > > > > > > > > to cancel a queries.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 1) Ignite expose system SQL view with name
>> > > > > *running_queries*
>> > > > > > > > > > > > > proposed columns: *node_id, query_id, sql,
>> > schema_name,
>> > > > > > > > > > connection_id,
>> > > > > > > > > > > > > duration*.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > node_id - initial node of request
>> > > > > > > > > > > > > query_id - unique id of query on node
>> > > > > > > > > > > > > sql - text of query
>> > > > > > > > > > > > > schema name - name of sql schema
>> > > > > > > > > > > > > connection_id - id of client connection from
>> > > > > > > > > > > > ClientListenerConnectionContext
>> > > > > > > > > > > > > class
>> > > > > > > > > > > > > duration - duration in millisecond from start of
>> > query
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Ignite will gather info about running queries from
>> > each
>> > > > of
>> > > > > > > nodes
>> > > > > > > > > and
>> > > > > > > > > > > > > collect it during user query. We already have
>> most of
>> > > the
>> > > > > > > > > information
>> > > > > > > > > > > at
>> > > > > > > > > > > > GridRunningQueryInfo
>> > > > > > > > > > > > > on each of nodes.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Instead of duration we can use start_time, but I
>> > think
>> > > > > > duration
>> > > > > > > > > will
>> > > > > > > > > > be
>> > > > > > > > > > > > > simple to use due to it not depend on a timezone.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 2) Propose to use following syntax to kill a
>> running
>> > > > query:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > *KILL QUERY node_Id query_id*
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Both parameters node_id and query_id can be get
>> > through
>> > > > > > > > > > running_queries
>> > > > > > > > > > > > > system view.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > When a node receive such request it can be run
>> > locally
>> > > in
>> > > > > > case
>> > > > > > > > node
>> > > > > > > > > > > have
>> > > > > > > > > > > > > given node_id or send message to node with given
>> id.
>> > > > > Because
>> > > > > > > node
>> > > > > > > > > > have
>> > > > > > > > > > > > > information about local running queries then can
>> > cancel
>> > > > it
>> > > > > -
>> > > > > > it
>> > > > > > > > > > already
>> > > > > > > > > > > > > implemented in
>> > > > GridReduceQueryExecutor.cancelQueries(qryId)
>> > > > > > > > method.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Comments are welcome.
>> > > > > > > > > > > > > --
>> > > > > > > > > > > > > Живи с улыбкой! :D
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > --
>> > > > > > > > > > > > Живи с улыбкой! :D
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > --
>> > > > > > > > > > Живи с улыбкой! :D
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Живи с улыбкой! :D
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>
> --
> Живи с улыбкой! :D
>


-- 
Живи с улыбкой! :D

Re: proposed design for thin client SQL management and monitoring (view running queries and kill it)

Posted by Vladimir Ozerov <vo...@gridgain.com>.

Sounds good to me.

On Tue, Dec 4, 2018 at 12:06 PM Юрий <ju...@gmail.com> wrote:

> Hey Igniters!
>
> I continue working on IEP-29
> <
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> >
> in
> part related to expose view with currently running queries.
>
> I checked how we track running queries right now and see that it's
> complicated and spread by few classes  process and we don't track all
> running queries. Also there are internal queries which tracked within users
> queries and can't be distinguished of user queries , e.g. map queries for
> map-reduce queries or DML operation which required first step as select to
> modify data.
>
> My proposal is extract logic for working with running queries information
> to separate class, like RunningQueryManager. The class will track running
> queries and will be single point to retrieve information about running
> queries. Currently to keep information about running queries uses
> GridRunningQueryInfo class. As of now it can't provide useful information
> to distinguish internal queries and users query, so the class need to
> extend to keep information about type of query and id of initial user query
> for internal query to be able identify it.
>
> New RunningQueryManager should be used for all place where currently we
> have tracking of running queries and added for all places which not covered
> yet, it's mostly DDL and DML operations.
>
> After implement the proposed change we can simple expose SQL view for all
> running queries on a local node.
> Collecting information from all node in a cluster currently out of scope
> the change and will be described for discussing later.
>
> Is there any objections for described proposal?
>
> чт, 29 нояб. 2018 г. в 09:03, Юрий <ju...@gmail.com>:
>
> > Hi Alex,
> >
> > I've just started implement of the view. Thanks for the your efforts!
> >
> > ср, 28 нояб. 2018 г. в 19:00, Alex Plehanov <pl...@gmail.com>:
> >
> >> Yuriy,
> >>
> >> If you have plans to implement running queries view in the nearest
> future,
> >> I already have implemented draft for local node queries some time ago
> [1].
> >> Maybe it will help to implement a view for whole cluster queries.
> >>
> >> [1]:
> >>
> >>
> https://github.com/alex-plekhanov/ignite/commit/6231668646a2b0f848373eb4e9dc38d127603e43
> >>
> >>
> >> ср, 28 нояб. 2018 г. в 17:34, Vladimir Ozerov <vo...@gridgain.com>:
> >>
> >> > Denis
> >> >
> >> > I would wait for running queries view first.
> >> >
> >> > ср, 28 нояб. 2018 г. в 1:57, Denis Magda <dm...@apache.org>:
> >> >
> >> > > Vladimir,
> >> > >
> >> > > Please see inline
> >> > >
> >> > > On Mon, Nov 19, 2018 at 8:23 AM Vladimir Ozerov <
> vozerov@gridgain.com
> >> >
> >> > > wrote:
> >> > >
> >> > > > Denis,
> >> > > >
> >> > > > I partially agree with you. But there are several problem with
> >> syntax
> >> > > > proposed by you:
> >> > > > 1) This is harder to implement technically - more parsing logic to
> >> > > > implement. Ok, this is our internal problem, users do not care
> >> about it
> >> > > > 2) User will have to consult to docs in any case
> >> > > >
> >> > >
> >> > > Two of these are not a big deal. We just need to invest more time
> for
> >> > > development and during the design phase so that people need to
> consult
> >> > the
> >> > > docs rarely.
> >> > >
> >> > >
> >> > > > 3) "nodeId" is not really node ID. For Ignite users node ID is
> >> UUID. In
> >> > > our
> >> > > > case this is node order, and we intentionally avoided any naming
> >> here.
> >> > > >
> >> > >
> >> > > Let's use a more loose name such as "node".
> >> > >
> >> > >
> >> > > > Query is just identified by a string, no more than that
> >> > > > 4) Proposed syntax is more verbose and open ways for misuse. E.g.
> >> what
> >> > is
> >> > > > "KILL QUERY WHERE queryId=1234"?
> >> > > >
> >> > > > I am not 100% satisfied with both variants, but the first one
> looks
> >> > > simpler
> >> > > > to me. Remember, that user will not guess query ID. Instead, he
> will
> >> > get
> >> > > > the list of running queries with some other syntax. What we need
> to
> >> > > > understand for now is how this syntax will look like. I think that
> >> we
> >> > > > should implement getting list of running queries, and only then
> >> start
> >> > > > working on cancellation.
> >> > > >
> >> > >
> >> > > That's a good point. Syntax of both running and killing queires
> >> commands
> >> > > should be tightly coupled. We're going to name a column if running
> >> > queries
> >> > > IDs somehow anyway and that name might be resued in the WHERE clause
> >> of
> >> > > KILL.
> >> > >
> >> > > Should we discuss the syntax in a separate thread?
> >> > >
> >> > > --
> >> > > Denis
> >> > >
> >> > > >
> >> > > > Vladimir.
> >> > > >
> >> > > >
> >> > > > On Mon, Nov 19, 2018 at 7:02 PM Denis Mekhanikov <
> >> > dmekhanikov@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > Guys,
> >> > > > >
> >> > > > > Syntax like *KILL QUERY '25.1234'* look a bit cryptic to me.
> >> > > > > I'm going to look up in documentation, which parameter goes
> first
> >> in
> >> > > this
> >> > > > > query every time I use it.
> >> > > > > I like the syntax, that Igor suggested more.
> >> > > > > Will it be better if we make *nodeId* and *queryId *named
> >> properties?
> >> > > > >
> >> > > > > Something like this:
> >> > > > > KILL QUERY WHERE nodeId=25 and queryId=1234
> >> > > > >
> >> > > > > Denis
> >> > > > >
> >> > > > > пт, 16 нояб. 2018 г. в 14:12, Юрий <jury.gerzhedowich@gmail.com
> >:
> >> > > > >
> >> > > > > > I fully agree with last sentences and can start to implement
> >> this
> >> > > part.
> >> > > > > >
> >> > > > > > Guys, thanks for your productive participate at discussion.
> >> > > > > >
> >> > > > > > пт, 16 нояб. 2018 г. в 2:53, Denis Magda <dm...@apache.org>:
> >> > > > > >
> >> > > > > > > Vladimir,
> >> > > > > > >
> >> > > > > > > Thanks, make perfect sense to me.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Thu, Nov 15, 2018 at 12:18 AM Vladimir Ozerov <
> >> > > > vozerov@gridgain.com
> >> > > > > >
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Denis,
> >> > > > > > > >
> >> > > > > > > > The idea is that QueryDetailMetrics will be exposed
> through
> >> > > > separate
> >> > > > > > > > "historical" SQL view in addition to current API. So we
> are
> >> on
> >> > > the
> >> > > > > same
> >> > > > > > > > page here.
> >> > > > > > > >
> >> > > > > > > > As far as query ID I do not see any easy way to operate
> on a
> >> > > single
> >> > > > > > > integer
> >> > > > > > > > value (even 64bit). This is distributed system - we do not
> >> want
> >> > > to
> >> > > > > have
> >> > > > > > > > coordination between nodes to get query ID. And
> >> coordination is
> >> > > the
> >> > > > > > only
> >> > > > > > > > possible way to get sexy "long". Instead, I would propose
> to
> >> > form
> >> > > > ID
> >> > > > > > from
> >> > > > > > > > node order and query counter within node. This will be
> (int,
> >> > > long)
> >> > > > > > pair.
> >> > > > > > > > For use convenience we may convert it to a single string,
> >> e.g.
> >> > > > > > > > "[node_order].[query_counter]". Then the syntax would be:
> >> > > > > > > >
> >> > > > > > > > KILL QUERY '25.1234'; // Kill query 1234 on node 25
> >> > > > > > > > KILL QUERY '25.*;     // Kill all queries on the node 25
> >> > > > > > > >
> >> > > > > > > > Makes sense?
> >> > > > > > > >
> >> > > > > > > > Vladimir.
> >> > > > > > > >
> >> > > > > > > > On Wed, Nov 14, 2018 at 1:25 PM Denis Magda <
> >> dmagda@apache.org
> >> > >
> >> > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Yury,
> >> > > > > > > > >
> >> > > > > > > > > As I understand you mean that the view should contains
> >> both
> >> > > > running
> >> > > > > > and
> >> > > > > > > > > > finished queries. If be honest for the view I was
> going
> >> to
> >> > > use
> >> > > > > just
> >> > > > > > > > > queries
> >> > > > > > > > > > running right now. For finished queries I thought
> about
> >> > > another
> >> > > > > > view
> >> > > > > > > > with
> >> > > > > > > > > > another set of fields which should include I/O related
> >> > ones.
> >> > > Is
> >> > > > > it
> >> > > > > > > > works?
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Got you, so if only running queries are there then your
> >> > initial
> >> > > > > > > proposal
> >> > > > > > > > > makes total sense. Not sure we need a view of the
> finished
> >> > > > queries.
> >> > > > > > It
> >> > > > > > > > will
> >> > > > > > > > > be possible to analyze them through the updated
> >> > DetailedMetrics
> >> > > > > > > approach,
> >> > > > > > > > > won't it?
> >> > > > > > > > >
> >> > > > > > > > > For "KILL QUERY node_id query_id"  node_id required as
> >> part
> >> > of
> >> > > > > unique
> >> > > > > > > key
> >> > > > > > > > > > of query and help understand Ignite which node start
> the
> >> > > > > > distributed
> >> > > > > > > > > query.
> >> > > > > > > > > > Use both parameters will allow cheap generate unique
> key
> >> > > across
> >> > > > > all
> >> > > > > > > > > nodes.
> >> > > > > > > > > > Node which started a query can cancel it on all nodes
> >> > > > participate
> >> > > > > > > > nodes.
> >> > > > > > > > > > So, to stop any queries initially we need just send
> the
> >> > > cancel
> >> > > > > > > request
> >> > > > > > > > to
> >> > > > > > > > > > node who started the query. This mechanism is already
> in
> >> > > > Ignite.
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > Can we locate node_id behind the scenes if the user
> >> supplies
> >> > > > > query_id
> >> > > > > > > > only?
> >> > > > > > > > > A query record in the view already contains query_id and
> >> > > node_id
> >> > > > > and
> >> > > > > > it
> >> > > > > > > > > sounds like an extra work for the user to fill in all
> the
> >> > > details
> >> > > > > for
> >> > > > > > > us.
> >> > > > > > > > > Embed node_id into query_id if you'd like to avoid extra
> >> > > network
> >> > > > > hops
> >> > > > > > > for
> >> > > > > > > > > query_id to node_id mapping.
> >> > > > > > > > >
> >> > > > > > > > > --
> >> > > > > > > > > Denis
> >> > > > > > > > >
> >> > > > > > > > > On Wed, Nov 14, 2018 at 1:04 AM Юрий <
> >> > > > jury.gerzhedowich@gmail.com>
> >> > > > > > > > wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Denis,
> >> > > > > > > > > >
> >> > > > > > > > > > Under the hood 'time' will be as startTime, but for
> >> system
> >> > > > view I
> >> > > > > > > > planned
> >> > > > > > > > > > use duration which will be simple calculated as now -
> >> > > > startTime.
> >> > > > > > So,
> >> > > > > > > > > there
> >> > > > > > > > > > is't a performance issue.
> >> > > > > > > > > > As I understand you mean that the view should contains
> >> both
> >> > > > > running
> >> > > > > > > and
> >> > > > > > > > > > finished queries. If be honest for the view I was
> going
> >> to
> >> > > use
> >> > > > > just
> >> > > > > > > > > queries
> >> > > > > > > > > > running right now. For finished queries I thought
> about
> >> > > another
> >> > > > > > view
> >> > > > > > > > with
> >> > > > > > > > > > another set of fields which should include I/O related
> >> > ones.
> >> > > Is
> >> > > > > it
> >> > > > > > > > works?
> >> > > > > > > > > >
> >> > > > > > > > > > For "KILL QUERY node_id query_id"  node_id required as
> >> part
> >> > > of
> >> > > > > > unique
> >> > > > > > > > key
> >> > > > > > > > > > of query and help understand Ignite which node start
> the
> >> > > > > > distributed
> >> > > > > > > > > query.
> >> > > > > > > > > > Use both parameters will allow cheap generate unique
> key
> >> > > across
> >> > > > > all
> >> > > > > > > > > nodes.
> >> > > > > > > > > > Node which started a query can cancel it on all nodes
> >> > > > participate
> >> > > > > > > > nodes.
> >> > > > > > > > > > So, to stop any queries initially we need just send
> the
> >> > > cancel
> >> > > > > > > request
> >> > > > > > > > to
> >> > > > > > > > > > node who started the query. This mechanism is already
> in
> >> > > > Ignite.
> >> > > > > > > > > >
> >> > > > > > > > > > Native SQL APIs will automatically support the futures
> >> > after
> >> > > > > > > > implementing
> >> > > > > > > > > > for thin clients. So we are good here.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > вт, 13 нояб. 2018 г. в 18:52, Denis Magda <
> >> > dmagda@apache.org
> >> > > >:
> >> > > > > > > > > >
> >> > > > > > > > > > > Yury,
> >> > > > > > > > > > >
> >> > > > > > > > > > > Please consider the following:
> >> > > > > > > > > > >
> >> > > > > > > > > > >    - If we record the duration instead of startTime,
> >> then
> >> > > the
> >> > > > > > > former
> >> > > > > > > > > has
> >> > > > > > > > > > to
> >> > > > > > > > > > >    be updated frequently - sounds like a performance
> >> red
> >> > > > flag.
> >> > > > > > > Should
> >> > > > > > > > > we
> >> > > > > > > > > > > store
> >> > > > > > > > > > >    startTime and endTime instead? This way a query
> >> record
> >> > > > will
> >> > > > > be
> >> > > > > > > > > updated
> >> > > > > > > > > > >    twice - when the query is started and terminated.
> >> > > > > > > > > > >    - In the IEP you've mentioned I/O related fields
> >> that
> >> > > > should
> >> > > > > > > help
> >> > > > > > > > to
> >> > > > > > > > > > >    grasp why a query runs that slow. Should they be
> >> > stored
> >> > > in
> >> > > > > > this
> >> > > > > > > > > view?
> >> > > > > > > > > > >    - "KILL QUERY query_id" is more than enough.
> Let's
> >> not
> >> > > add
> >> > > > > > > > "node_id"
> >> > > > > > > > > > >    unless it's absolutely required. Our queries are
> >> > > > distributed
> >> > > > > > and
> >> > > > > > > > > > > executed
> >> > > > > > > > > > >    across several nodes that's why the node_id
> >> parameter
> >> > is
> >> > > > > > > > redundant.
> >> > > > > > > > > > >    - This API needs to be supported across all our
> >> > > > interfaces.
> >> > > > > We
> >> > > > > > > can
> >> > > > > > > > > > start
> >> > > > > > > > > > >    with JDBC/ODBC and thin clients and then support
> >> for
> >> > the
> >> > > > > > native
> >> > > > > > > > SQL
> >> > > > > > > > > > APIs
> >> > > > > > > > > > >    (Java, Net, C++)
> >> > > > > > > > > > >    - Please share examples of SELECTs in the IEP
> that
> >> > would
> >> > > > > show
> >> > > > > > > how
> >> > > > > > > > to
> >> > > > > > > > > > >    find long running queries, queries that cause a
> >> lot of
> >> > > I/O
> >> > > > > > > > troubles.
> >> > > > > > > > > > >
> >> > > > > > > > > > > --
> >> > > > > > > > > > > Denis
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Tue, Nov 13, 2018 at 1:15 AM Юрий <
> >> > > > > > jury.gerzhedowich@gmail.com>
> >> > > > > > > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Igniters,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Some comments for my original email's.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > The proposal related to part of IEP-29
> >> > > > > > > > > > > > <
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > .
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > What purpose are we pursuing of the proposal?
> >> > > > > > > > > > > > We want to be able check which queries running
> right
> >> > now
> >> > > > > > through
> >> > > > > > > > thin
> >> > > > > > > > > > > > clients. Get some information related to the
> queries
> >> > and
> >> > > be
> >> > > > > > able
> >> > > > > > > to
> >> > > > > > > > > > > cancel
> >> > > > > > > > > > > > a query if it required for some reasons.
> >> > > > > > > > > > > > So, we need interface to get a running queries.
> For
> >> the
> >> > > > goal
> >> > > > > we
> >> > > > > > > > > propose
> >> > > > > > > > > > > > running_queries system view. The view contains
> >> unique
> >> > > query
> >> > > > > > > > > identifier
> >> > > > > > > > > > > > which need to pass to kill query command to cancel
> >> the
> >> > > > query.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > What do you think about fields of the running
> >> queries
> >> > > view?
> >> > > > > May
> >> > > > > > > be
> >> > > > > > > > > some
> >> > > > > > > > > > > > useful fields we could easy add to the view.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Also let's discuss syntax of cancellation of
> query.
> >> I
> >> > > > propose
> >> > > > > > to
> >> > > > > > > > use
> >> > > > > > > > > > > MySQL
> >> > > > > > > > > > > > like syntax as easy to understand and shorter then
> >> > Oracle
> >> > > > and
> >> > > > > > > > > Postgres
> >> > > > > > > > > > > > syntax ( detailed information in IEP-29
> >> > > > > > > > > > > > <
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-29%3A+SQL+management+and+monitoring
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > ).
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > пн, 12 нояб. 2018 г. в 19:28, Юрий <
> >> > > > > > jury.gerzhedowich@gmail.com
> >> > > > > > > >:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Igniters,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Below is a proposed design for thin client SQL
> >> > > management
> >> > > > > and
> >> > > > > > > > > > > monitoring
> >> > > > > > > > > > > > > to cancel a queries.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > 1) Ignite expose system SQL view with name
> >> > > > > *running_queries*
> >> > > > > > > > > > > > > proposed columns: *node_id, query_id, sql,
> >> > schema_name,
> >> > > > > > > > > > connection_id,
> >> > > > > > > > > > > > > duration*.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > node_id - initial node of request
> >> > > > > > > > > > > > > query_id - unique id of query on node
> >> > > > > > > > > > > > > sql - text of query
> >> > > > > > > > > > > > > schema name - name of sql schema
> >> > > > > > > > > > > > > connection_id - id of client connection from
> >> > > > > > > > > > > > ClientListenerConnectionContext
> >> > > > > > > > > > > > > class
> >> > > > > > > > > > > > > duration - duration in millisecond from start of
> >> > query
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Ignite will gather info about running queries
> from
> >> > each
> >> > > > of
> >> > > > > > > nodes
> >> > > > > > > > > and
> >> > > > > > > > > > > > > collect it during user query. We already have
> >> most of
> >> > > the
> >> > > > > > > > > information
> >> > > > > > > > > > > at
> >> > > > > > > > > > > > GridRunningQueryInfo
> >> > > > > > > > > > > > > on each of nodes.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Instead of duration we can use start_time, but I
> >> > think
> >> > > > > > duration
> >> > > > > > > > > will
> >> > > > > > > > > > be
> >> > > > > > > > > > > > > simple to use due to it not depend on a
> timezone.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > 2) Propose to use following syntax to kill a
> >> running
> >> > > > query:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > *KILL QUERY node_Id query_id*
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Both parameters node_id and query_id can be get
> >> > through
> >> > > > > > > > > > running_queries
> >> > > > > > > > > > > > > system view.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > When a node receive such request it can be run
> >> > locally
> >> > > in
> >> > > > > > case
> >> > > > > > > > node
> >> > > > > > > > > > > have
> >> > > > > > > > > > > > > given node_id or send message to node with given
> >> id.
> >> > > > > Because
> >> > > > > > > node
> >> > > > > > > > > > have
> >> > > > > > > > > > > > > information about local running queries then can
> >> > cancel
> >> > > > it
> >> > > > > -
> >> > > > > > it
> >> > > > > > > > > > already
> >> > > > > > > > > > > > > implemented in
> >> > > > GridReduceQueryExecutor.cancelQueries(qryId)
> >> > > > > > > > method.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Comments are welcome.
> >> > > > > > > > > > > > > --
> >> > > > > > > > > > > > > Живи с улыбкой! :D
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > --
> >> > > > > > > > > > > > Живи с улыбкой! :D
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > --
> >> > > > > > > > > > Живи с улыбкой! :D
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Живи с улыбкой! :D
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
> > --
> > Живи с улыбкой! :D
> >
>
>
> --
> Живи с улыбкой! :D
>