Posted to dev@ignite.apache.org by Nikolay Izhikov <ni...@apache.org> on 2019/09/06 12:19:50 UTC

Re: [IEP-35] Monitoring & Profiling. Phase 2

Hello, Igniters.

IEP-35 "Monitoring & Profiling", Phase 2 is ready for review [1].
Please join the review!

I've implemented:

* Monitoring list engine.
* The following lists are implemented:
    * Cache list
    * Cache group list
    * Compute task list
    * Service list.

Engine details:

* `MonitoringList` added to store list data.
* Base interface `MonitoringRow` for list data created.
* Corresponding method added to `MetricExporterSpi`
* `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
support list export.
* JMX, SQL and other column-oriented SPIs use
`MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
* An implementation of `MonitoringRowAttributeWalker` for a specific
`MonitoringRow` can be generated with `MonitoringRowAttributeWalkerGenerator`.
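
To illustrate the walker idea, here is a minimal sketch. It is not the code from the PR: `Row`, `CacheRow`, `AttributeVisitor`, `AttributeWalker` and `CacheRowWalker` are hypothetical stand-ins for `MonitoringRow`, a concrete row class, and a generated walker. The point is only that a walker written (or generated) per row class can visit attributes in a fixed order without reflection.

```java
import java.util.ArrayList;
import java.util.List;

public class WalkerSketch {
    /** Hypothetical marker interface, standing in for MonitoringRow. */
    public interface Row { }

    /** Hypothetical row describing a cache. */
    public static final class CacheRow implements Row {
        final String name;
        final int groupId;

        public CacheRow(String name, int groupId) {
            this.name = name;
            this.groupId = groupId;
        }
    }

    /** Hypothetical callback an exporter (JMX, SQL, log) would implement. */
    public interface AttributeVisitor {
        void accept(int idx, String name, Class<?> type, Object val);
    }

    /** Hypothetical walker contract: attribute count plus ordered traversal. */
    public interface AttributeWalker<R extends Row> {
        int count();

        void visitAll(R row, AttributeVisitor visitor);
    }

    /** The kind of class a generator could emit: plain field reads, fixed order. */
    public static final class CacheRowWalker implements AttributeWalker<CacheRow> {
        @Override public int count() {
            return 2;
        }

        @Override public void visitAll(CacheRow row, AttributeVisitor visitor) {
            visitor.accept(0, "name", String.class, row.name);
            visitor.accept(1, "groupId", int.class, row.groupId);
        }
    }

    /** Toy "exporter": renders every attribute of a row as index:name=value. */
    public static List<String> export(CacheRow row) {
        List<String> out = new ArrayList<>();

        new CacheRowWalker().visitAll(row,
            (idx, name, type, val) -> out.add(idx + ":" + name + "=" + val));

        return out;
    }

    public static void main(String[] args) {
        System.out.println(export(new CacheRow("default", 42)));
    }
}
```

An exporter written against the walker interface does not need to know the row class, which is why a single walker per row type can serve JMX, SQL and log exporters alike.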

I have also prepared a follow-up PR [2].
It implements the following lists:

* SQL tables
* SQL indexes
* SQL schemas
* SQL queries
* Continuous queries
* Text queries
* Transactions
* Cluster nodes
* Client connections (JDBC, ODBC, Thin)

[1] https://github.com/apache/ignite/pull/6845
[2] https://github.com/apache/ignite/pull/6790



On Mon, Jun 10, 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:

> Hello, Igniters.
>
> Since Phase 1 will be merged into master soon, I've created the ticket [1]
> for Phase 2.
>
> Scope of Phase 2 (copied from the ticket):
>
> Ability to collect lists of internal objects that Ignite manages.
> Examples of such objects:
>
>   * Caches
>   * Queries (including continuous queries)
>   * Services
>   * Compute tasks
>   * Distributed Data Structures
>   * etc...
>
>
> 1. Fields for each list (that doesn't currently exist in Ignite) will be
> discussed in separate tickets.
> 2. Metric Exporters (optionally) can support list export.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-11905
>
>
> On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > Ticket for IEP.Phase1 created -
> https://issues.apache.org/jira/browse/IGNITE-11848
> >
> >
> > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > Hello, Igniters.
> > >
> > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > >
> > > Issues to address:
> > >
> > > 1. Study the experience of the following libraries and tools:
> > >     * OpenTracing
> > >     * OpenCensus
> > >     * DropWizard
> > >
> > > 2. Support a histogram sensor: a sensor that counts values falling
> into predefined segments.
> > >
> > > 3. Use more widely used naming (like in OpenCensus?)
> > >
> > > 4. Consider the usage of OpenCensus as a default implementation for
> local metric storage.
> > >
> > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > >
> > > 6. Some metrics should be part of the public API and others should not (they may be
> changed/removed in a release without warning).
> > >
> > > My plan for Phase #1 is the following:
> > >
> > > 1. Address the issues.
> > > 2. Prepare public API
> > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> with it.
> > > 4. Prepare a PR with lists of each user API.
> > > 5. Collect feedback on #4.
> > > 6. Design a log exposer. Consider the usage of JFR format or some
> other widely used, tool compatible format.
> > >
> > > [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > >
> > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > Hello, Maxim.
> > > >
> > > > > How will be recorded throughput sensor values which will require
> an interval for the rate calculations?
> > > >
> > > > I answered this question in the IEP, under "Design principles":
> > > >
> > > > ```
> > > > Sensors should contain only raw values. No aggregation of numeric
> metrics on Ignite side.
> > > > Min, max, avg and other functions are the matter of an external
> monitoring system.
> > > > ```
> > > >
> > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > where S(t) is the sensor value in some point of time t.
> > > >
> > > > Seems, throughput calculation is a responsibility of an external
> system.
> > > >
> > > > What do you think?
> > > >
> > > > > It seems to me that we can add an additional parameter of
> `sensitivityLevel` to provide for the user a flexible sensor control (e.g.,
> INFO, WARN, NOTICE, DEBUG).
> > > >
> > > > For now, I think that all sensors and lists will be very (very!)
> lightweight.
> > > > So, we should be able to disable/enable them, for sure.
> > > >
> > > > But we should be able to turn the whole Ignite subsystem off and on
> > > > in case we have strong performance limitations for a particular
> workload.
> > > >
> > > > So, we have two "levels" of monitoring: INFO and DEBUG (for
> profiling: IEP-35 - Phase 3).
> > > > For example, AFAIK we can't disable the current SQL system views (why
> should we?)
> > > >
> > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > Hello Nikolay,
> > > > >
> > > > > I've looked through your PRs changes.
> > > > >
> > > > > > Sensors
> > > > >
> > > > > How will be recorded throughput sensor values which will require an
> > > > > interval for the rate calculations? Do we have such an example? For
> > > > > instance, getAllocationRate() or getEvictionRate(). These metrics
> are
> > > > > out of the scope of current PoC and IEP as they are not related to
> the
> > > > > user metrics, but it is a good example of a particular metric type.
> > > > >
> > > > > It seems to me that we can add an additional parameter of
> > > > > `sensitivityLevel` to provide for the user a flexible sensor
> control
> > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > >
> > > > > It also seems that for the sensors getValue() the completely
> > > > > functional java approach can be used. Am I right?
> > > > >
> > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org>
> wrote:
> > > > > >
> > > > > > Hello, Vyacheslav.
> > > > > >
> > > > > > Thanks for the feedback!
> > > > > >
> > > > > > > HttpExposer with Jetty's dependencies should be detached from
> the core module.
> > > > > >
> > > > > > Agreed. Module hierarchy is the essence of the next steps.
> > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can
> discuss.
> > > > > >
> > > > > > > I like your approach with 'wrapper' for monitored objects,
> like don't like using 'ServiceConfiguration' directly as a monitored object
> for services
> > > > > >
> > > > > > Agreed in general.
> > > > > > Seems, choosing the right data to expose is a matter of
> separate discussion for each Ignite entity.
> > > > > > I've planned to file tickets for each entity so anyone
> interested can share their vision there.
> > > > > >
> > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > >
> > > > > > I'm not sure that *every* sensor should have a directly associated
> timestamp.
> > > > > > Seems, we should support sensors without a timestamp, at least for current
> monitoring numbers.
> > > > > >
> > > > > > > Also, it'd be great to have an ability to store a list of a
> fixed size of last N sensors
> > > > > >
> > > > > > What use-cases do you know for such sensors?
> > > > > > We have plans to support fixed size lists to show "Last N SQL
> queries" or similar data.
> > > > > > Essentially, a sensor is just a single value with the name and
> known meaning.
> > > > > >
> > > > > > > It'd be great if you provide a more extended test to show the
> work of the system.
> > > > > >
> > > > > > Sorry for that :)
> > > > > > When you run 'MonitoringSelfTest' you should open
> http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > I provide this info in gist -
> https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > >
> > > > > > I will extend this test to print results to console in the next
> iterations - stay tuned :)
> > > > > >
> > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > Hi, Nikolay,
> > > > > > >
> > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > >
> > > > > > > It would be better to implement it as a separate module, I
> can't say
> > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > believe that HttpExposer with Jetty's dependencies should be
> detached
> > > > > > > from the core module.
> > > > > > >
> > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > 'ComputeTaskInfo' in PR, and don't like using
> 'ServiceConfiguration'
> > > > > > > directly as a monitored object for services. I believe we
> shouldn't
> > > > > > > mix approaches. It'd be better always use some kind of
> container with
> > > > > > > monitored object's information to work with such data.
> > > > > > >
> > > > > > > In my opinion, each sensor should have a timestamp. Usually
> monitoring
> > > > > > > systems aggregate data and build graphics according to sensors
> > > > > > > timestamp.
> > > > > > >
> > > > > > > Also, it'd be great to have an ability to store a list of a
> fixed size
> > > > > > > of last N sensors, not to miss them without pushing to an
> external
> > > > > > > monitoring system.
> > > > > > >
> > > > > > > It'd be great if you provide a more extended test to show the
> work of
> > > > > > > the system. Everybody who looks to PR needs to run the test
> and get
> > > > > > > the info manually to see the completeness of sensors, this
> might be
> > > > > > > simplified by proper test.
> > > > > > >
> > > > > > > Thank you!
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <
> nizhikov@apache.org> wrote:
> > > > > > > >
> > > > > > > > Hello, Igniters.
> > > > > > > >
> > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > PR can be found here -
> https://github.com/apache/ignite/pull/6510
> > > > > > > >
> > > > > > > > I've done following changes:
> > > > > > > >
> > > > > > > >         1. `GridMonitoringManager`  [2] - simple
> implementation of manager to store all monitoring info
> > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer
> implementation that can respond with JSON from
> http://localhost:8080/ignite/monitoring. JSON content can be viewed in
> gist [4]
> > > > > > > >         3. Compute task start and finish monitoring in
> "compute" list [5]
> > > > > > > >         4. Service registrations are monitored in "service"
> list - [6]
> > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using
> `GridMonitoringManager` [7]
> > > > > > > >
> > > > > > > > Design principles, monitoring subsystem details and new
> Ignite entities can be found in IEP [1].
> > > > > > > >
> > > > > > > > My next steps will be:
> > > > > > > >
> > > > > > > >         1. Implementation of JMX exposer
> > > > > > > >         2. Registration of all "lists" and "sensor groups"
> as a SQL System view.
> > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs
> (described in IEP).
> > > > > > > >         4. Rewrite existing JMX metrics using
> GridMonitoringManager.
> > > > > > > >
> > > > > > > > Please share your thoughts.
> > > > > > > >
> > > > > > > > Part of JSON file:
> > > > > > > > ```
> > > > > > > >     "COMPUTE": {
> > > > > > > >       "tasks": {
> > > > > > > >         "name": "tasks",
> > > > > > > >         "rows": [
> > > > > > > >           {
> > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > >             "sessionId":
> "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > >             "data": {
> > > > > > > >               "taskClasName":
> "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > >               "startTime": 1556287337944,
> > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > >               "execName": null
> > > > > > > >             },
> > > > > > > >             "name": "anotherBroadcast"
> > > > > > > >           }
> > > > > > > > ```
> > > > > > > >
> > > > > > > > [1]
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > [2]
> https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > [3]
> https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > [4]
> https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > [5]
> https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > [6]
> https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > [7]
> https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > >
> > > > > > >
> > > > > > >
>

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Denis Magda <dm...@apache.org>.
Outstanding! Let's brag about this publicly :)
https://twitter.com/ApacheIgnite/status/1176280416627486720?s=20

-
Denis


On Mon, Sep 23, 2019 at 3:08 AM Nikolay Izhikov <ni...@apache.org> wrote:

> Hello, Team.
>
>
> The system view engine is merged to master.
>
> Thanks, everyone, for the feedback and help with it!
> Alex Plekhanov, Andrey Gura, Vyacheslav Daradur, Alex Gonchruk, your
> feedback helps a lot!
>
> But, Monitoring & Profiling. Phase 2 continues.
> I will contribute system views for almost all internal objects of Ignite,
> shortly.
> You can take a look at umbrella ticket for it, if you want:
>
> https://issues.apache.org/jira/browse/IGNITE-11905
>
> I also have plans for a Phase 3 and 4:
>
> * Tracing of Ignite internal processes.
> * Watch Dog for a user-provided code.
>
> Stay tuned :)
>
>
> On Fri, 20/09/2019 at 14:22 +0300, Nikolay Izhikov wrote:
> > Hello, Alex.
> >
> > Good catch, thank you.
> >
> > I will enable the JMX and SQL exporters for system views by
> default.
> >
> > On Wed, 18/09/2019 at 16:09 +0300, Alex Plehanov wrote:
> > > One more point to discuss: Wouldn't it be better to have system
> > > views enabled by default?
> > > To enable views an admin must restart the node; sometimes that's an issue.
> > > Views cost almost nothing in terms of performance until they are
> explicitly
> > > requested, so is there a reason to disable views by default?
> > >
> > > On Tue, Sep 17, 2019 at 12:48, Alexey Goncharuk <
> alexey.goncharuk@gmail.com> wrote:
> > >
> > > > Folks,
> > > >
> > > > I honestly tried to follow the discussion, but I think that I lost
> the
> > > > point of the debate. Should we try to exploit the newly introduced
> slack to
> > > > discuss the change and then send a follow-up here?
> > > >
> > > > --AG
> > > >
>

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
Hello, Team.


The system view engine is merged to master.

Thanks, everyone, for the feedback and help with it!
Alex Plekhanov, Andrey Gura, Vyacheslav Daradur, Alex Gonchruk, your feedback helps a lot!

But, Monitoring & Profiling. Phase 2 continues.
I will contribute system views for almost all internal objects of Ignite, shortly.
You can take a look at umbrella ticket for it, if you want:

https://issues.apache.org/jira/browse/IGNITE-11905

I also have plans for a Phase 3 and 4:

* Tracing of Ignite internal processes.
* Watch Dog for a user-provided code.

Stay tuned :)


On Fri, 20/09/2019 at 14:22 +0300, Nikolay Izhikov wrote:
> Hello, Alex.
> 
> Good catch, thank you.
> 
> I will enable the JMX and SQL exporters for system views by default.
> 
> On Wed, 18/09/2019 at 16:09 +0300, Alex Plehanov wrote:
> > One more point to discuss: Wouldn't it be better to have system
> > views enabled by default?
> > To enable views an admin must restart the node; sometimes that's an issue.
> > Views cost almost nothing in terms of performance until they are explicitly
> > requested, so is there a reason to disable views by default?
> > 
> > On Tue, Sep 17, 2019 at 12:48, Alexey Goncharuk <al...@gmail.com> wrote:
> > 
> > > Folks,
> > > 
> > > I honestly tried to follow the discussion, but I think that I lost the
> > > point of the debate. Should we try to exploit the newly introduced slack to
> > > discuss the change and then send a follow-up here?
> > > 
> > > --AG
> > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
Hello, Alex.

Good catch, thank you.

I will enable the JMX and SQL exporters for system views by default.

On Wed, 18/09/2019 at 16:09 +0300, Alex Plehanov wrote:
> One more point to discuss: Wouldn't it be better to have system
> views enabled by default?
> To enable views an admin must restart the node; sometimes that's an issue.
> Views cost almost nothing in terms of performance until they are explicitly
> requested, so is there a reason to disable views by default?
> 
> On Tue, Sep 17, 2019 at 12:48, Alexey Goncharuk <al...@gmail.com> wrote:
> 
> > Folks,
> > 
> > I honestly tried to follow the discussion, but I think that I lost the
> > point of the debate. Should we try to exploit the newly introduced slack to
> > discuss the change and then send a follow-up here?
> > 
> > --AG
> > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
Alexey,

Yes, system views must be enabled by default and must not have any
enable/disable features. As I said earlier, system views are an on-demand
feature and don't consume any resources until requested.
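
A minimal sketch of what "on demand" means here, assuming a view that only wraps a live `ConcurrentMap` (the `MapBackedView` class and its methods are hypothetical names for illustration, not the classes from the PR). Nothing is copied or computed when the map changes; rows are built only when a caller asks for them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

public class OnDemandViewSketch {
    /** Hypothetical view: keeps a reference to the live map and a row factory. */
    public static final class MapBackedView<K, V, R> {
        private final Map<K, V> data;
        private final Function<V, R> rowFactory;

        public MapBackedView(Map<K, V> data, Function<V, R> rowFactory) {
            this.data = data;
            this.rowFactory = rowFactory;
        }

        /** Cheap: delegates to the underlying map. */
        public int size() {
            return data.size();
        }

        /** Rows are created only here, when somebody requests the view content. */
        public List<R> rows() {
            List<R> res = new ArrayList<>();

            for (V val : data.values())
                res.add(rowFactory.apply(val));

            return res;
        }
    }

    /** Returns {rows built before the first request, rows built after it}. */
    public static int[] demo() {
        Map<String, String> caches = new ConcurrentHashMap<>();
        AtomicInteger built = new AtomicInteger();

        MapBackedView<String, String, String> view = new MapBackedView<>(caches, name -> {
            built.incrementAndGet(); // Count how many rows were actually built.

            return "cache=" + name;
        });

        // The "system" keeps working with its own map; the view is not involved.
        caches.put("a", "orders");
        caches.put("b", "users");

        int before = built.get();

        view.rows(); // First (and only) request for the view content.

        return new int[] {before, built.get()};
    }

    public static void main(String[] args) {
        int[] res = demo();

        System.out.println("rows built before request: " + res[0] + ", after: " + res[1]);
    }
}
```

Under these assumptions the only cost of a registered view that nobody queries is one object holding two references.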

On Wed, Sep 18, 2019 at 4:10 PM Alex Plehanov <pl...@gmail.com> wrote:
>
> One more point to discuss: Wouldn't it be better to have system
> views enabled by default?
> To enable views an admin must restart the node; sometimes that's an issue.
> Views cost almost nothing in terms of performance until they are explicitly
> requested, so is there a reason to disable views by default?
>
> On Tue, Sep 17, 2019 at 12:48, Alexey Goncharuk <al...@gmail.com> wrote:
>
> > Folks,
> >
> > I honestly tried to follow the discussion, but I think that I lost the
> > point of the debate. Should we try to exploit the newly introduced slack to
> > discuss the change and then send a follow-up here?
> >
> > --AG
> >

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Alex Plehanov <pl...@gmail.com>.
One more point to discuss: Wouldn't it be better to have system
views enabled by default?
To enable views an admin must restart the node; sometimes that's an issue.
Views cost almost nothing in terms of performance until they are explicitly
requested, so is there a reason to disable views by default?

On Tue, Sep 17, 2019 at 12:48, Alexey Goncharuk <al...@gmail.com> wrote:

> Folks,
>
> I honestly tried to follow the discussion, but I think that I lost the
> point of the debate. Should we try to exploit the newly introduced slack to
> discuss the change and then send a follow-up here?
>
> --AG
>

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Alexey Goncharuk <al...@gmail.com>.
Folks,

I honestly tried to follow the discussion, but I think that I lost the
point of the debate. Should we try to exploit the newly introduced slack to
discuss the change and then send a follow-up here?

--AG

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
> From my point of view is still task of current issue

What exactly is the task of the current issue?
We have fully ready `Walkers` for the current system views.
They are correct and easy to use and maintain.

The current PR is already relatively big.
I don't think we should overcomplicate it with the autocompile code.

> But views and metrics are not the same entities.

OK, this is the main difference in our view.

My arguments are the following:

1. We export information about the node state to the Ignite administrator.
2. Metrics and system view columns are different representations of the same data.
   In one case it's more convenient to look at a system view, in another to build a pretty chart from metric values.
3. With a single manager we get a cleaner solution.

So, I think one manager is enough for both entities.

Is there anyone else who wants to participate in this discussion?

On Mon, 16/09/2019 at 18:01 +0300, Andrey Gura wrote:
> > I think, we should improve Ignite step by step.
> 
> From my point of view it is still a task of the current issue - system views,
> that is exactly one step.
> 
> > > What is part of exported data?
> > Some SPI implementation(JMX, SQL view) exports both metrics and views.
> > Some exports only metrics.
> > It's fine to me.
> > What do you think?
> 
> As I already wrote earlier, metrics and views are different things and
> they should not be exported by the same SPIs. This approach allows us to
> strictly specify SPI responsibility and avoids problems in reasoning
> about it. This SPI exports metrics and views while another exports only
> metrics. Why should I care about it? All I want is to get an SPI that exports
> only the data that I actually need.
> 
> > I don't propose that.
> > What I propose is to have one manager for the same entities.
> 
> But views and metrics are not the same entities.
> 
> > Please, don't overact my words.
> 
> Just some hyperbole, no more.
> 
> On Mon, Sep 16, 2019 at 4:46 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > 
> > > Why? It is still part of the same task. Master branch should not see
> > > intermediate changes.
> > 
> > I don't propose any "intermediate" feature.
> > We would have full feature of "system views" after merge.
> > 
> > I think, we should improve Ignite step by step.
> > Why should we postpone the merge of "system views" because you prefer the auto-compile approach to the auto-generate one?
> > The product will stay the same.
> > Nothing will change from the public API point of view or ongoing development complexity.
> > 
> > It's a "nice to have" feature.
> > And we will have it, shortly.
> > I'll take care of it.
> > 
> > > What is part of exported data?
> > 
> > Some SPI implementations (JMX, SQL view) export both metrics and views.
> > Some export only metrics.
> > 
> > It's fine to me.
> > 
> > What do you think?
> > 
> > > First, views and metrics are entities from different worlds/domains.
> > 
> > For me, they are entities of one domain.
> > They reflect the current state of the node.
> > 
> > > We can live with one manager for absolutely all entities in the system
> > > but we don't do it, right? :)
> > 
> > I don't propose that.
> > What I propose is to have one manager for the same entities.
> > 
> > Please don't exaggerate my words.
> > 
> > On Mon, 16/09/2019 at 16:24 +0300, Andrey Gura wrote:
> > > > > Views have wider meaning than metrics.
> > > > 
> > > > Yes! I agree, that's why I wrote 'extension' :)
> > > 
> > > No, no, no. Wider meaning isn't equal to extension :)
> > > 
> > > > > IMO using the same code at
> > > > > runtime for view generation is better approach.
> > > > 
> > > > OK for me.
> > > > Let's do it in another ticket?
> > > > I will create one.
> > > 
> > > Why? It is still part of the same task. Master branch should not see
> > > intermediate changes.
> > > 
> > > > Seems, it's OK if some SPI implementation supports only part of exported data.
> > > 
> > > What is part of exported data? I understand why we have to export
> > > metrics but definitely have no idea why views should be exported
> > > through any special SPI.
> > > 
> > > > Are we use "lists" or "view" term? :)
> > > 
> > > Views for our task. I mean lists in general sense.
> > > 
> > > > > We can have single manager for metrics and views.
> > > > > Why do we need one more manager in the system?
> > > > > We can live without it.
> > > 
> > > First, views and metrics are entities from different worlds/domains.
> > > Second, we will have less conflicts on GridMetricManager because we
> > > are still working on metrics and views concurrently.
> > > We can live with one manager for absolutely all entities in the system
> > > but we don't do it, right? :)
> > > 
> > > On Mon, Sep 16, 2019 at 2:52 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > 
> > > > > Views have wider meaning than metrics.
> > > > 
> > > > Yes! I agree, that's why I wrote 'extension' :)
> > > > 
> > > > > IMO using the same code at
> > > > > runtime for view generation is better approach.
> > > > 
> > > > OK for me.
> > > > Let's do it in another ticket?
> > > > I will create one.
> > > > 
> > > > > What are the real-life use cases for exporting views?
> > > > > Is there any database which exports some lists to somewhere?
> > > > > Especially on push based model, not on demand.
> > > > 
> > > > I don't know of any such DBMS.
> > > > Seems, it's OK if some SPI implementation supports only part of the exported data.
> > > > 
> > > > Are we using the term "lists" or "views"? :)
> > > > 
> > > > My point is:
> > > > 
> > > > We can have single manager for metrics and views.
> > > > Why do we need one more manager in the system?
> > > > We can live without it.
> > > > 
> > > > On Mon, 16/09/2019 at 13:53 +0300, Andrey Gura wrote:
> > > > > Hi,
> > > > > 
> > > > > > > I think also that GridMetricManager is a bad candidate for lists (system views) management.
> > > > > > 
> > > > > > For me, it seems that views and metrics are extensions of one another.
> > > > > > If the user wants to know some instant values (cache put count, cache get latency) they use metrics,
> > > > > > and if they want to know the list of running SQL queries they take a look at views.
> > > > > 
> > > > > Views are about system state and they answer the questions "what
> > > > > entities exist in the system (caches)?" or "what processes are
> > > > > being executed by the system (tx, queries)?"
> > > > > Metrics are about system behavior in some retrospective. They answer
> > > > > the question "how does the system behave?"
> > > > > 
> > > > > Views have wider meaning than metrics.
> > > > > 
> > > > > > > Code generation for walkers is also redundant.
> > > > > > 
> > > > > > If you don't want to, you don't have to use it.
> > > > > > I find it pretty useful during development.
> > > > > 
> > > > > I'm not talking about somebody's wishes ) Moreover, if it depends on
> > > > > wishes it can potentially lead to misuse. IMO using the same code at
> > > > > runtime for view generation is a better approach.
> > > > > 
> > > > > > > I really don't understand why we should export system views content
> > > > > > > (especially periodically). Real life use case is take view content on
> > > > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > > > no need any exporters.
> > > > > > 
> > > > > > What if we want to export lists to log or via http, etc?
> > > > > 
> > > > > If we have a public API for views then we can use REST for access
> > > > > to this API. Also you can use the public API directly. What are the
> > > > > real-life use cases for exporting views? Is there any database which
> > > > > exports some lists to somewhere? Especially on a push-based model, not
> > > > > on demand.
> > > > > 
> > > > > On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > 
> > > > > > Hello, Andrey.
> > > > > > 
> > > > > > > I really don't like name MonitoringList. First of all because it isn't
> > > > > > > about monitoring at all while can be useful for monitoring purposes.
> > > > > > > We already have SQL system views and I think that system view is good
> > > > > > > candidate for naming of new entity.
> > > > > > 
> > > > > > SystemView is OK for me.
> > > > > > I will rename the entity in the PR.
> > > > > > 
> > > > > > > I think also that GridMetricManager is a bad candidate for lists (system views) management.
> > > > > > 
> > > > > > For me, it seems that views and metrics are extensions of one another.
> > > > > > If the user wants to know some instant values (cache put count, cache get latency) they use metrics,
> > > > > > and if they want to know the list of running SQL queries they take a look at views.
> > > > > > 
> > > > > > > There is no any interaction with lists on hot path of code flow
> > > > > > > and there is no any performance impact.
> > > > > > 
> > > > > > OK, let's remove it.
> > > > > > 
> > > > > > > Code generation for walkers is also redundant.
> > > > > > 
> > > > > > If you don't want to, you don't have to use it.
> > > > > > I find it pretty useful during development.
> > > > > > 
> > > > > > > I really don't understand why we should export system views content
> > > > > > > (especially periodically). Real life use case is take view content on
> > > > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > > > no need any exporters.
> > > > > > 
> > > > > > What if we want to export lists to log or via http, etc?
> > > > > > 
> > > > > > > Also it would be great to involve more people to this discussion.
> > > > > > 
> > > > > > Any feedback is welcome!
> > > > > > 
> > > > > > 
> > > > > > On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > > > > > > Nikolay,
> > > > > > > 
> > > > > > > thanks a lot for clarification! I added some comments to Upsource review [1].
> > > > > > > 
> > > > > > > Here I want to discuss some high-level issues.
> > > > > > > 
> > > > > > > 1. Naming
> > > > > > > 
> > > > > > > "There are only two hard things in Computer Science: cache
> > > > > > > invalidation and naming things."
> > > > > > > -- Phil Karlton
> > > > > > > 
> > > > > > > > I really don't like the name MonitoringList. First of all because it isn't
> > > > > > > > about monitoring at all, though it can be useful for monitoring purposes.
> > > > > > > > 
> > > > > > > > We already have SQL system views and I think that system view is a good
> > > > > > > > candidate for naming the new entity. As a result we will have consistent
> > > > > > > > naming which better describes the domain.
> > > > > > > > 
> > > > > > > > I also think that GridMetricManager is a bad candidate for lists (system
> > > > > > > > views) management, because it isn't about metrics. Maybe a new
> > > > > > > > SystemViewManager will better fit this purpose.
> > > > > > > 
> > > > > > > 2. Management
> > > > > > > 
> > > > > > > > Lists (aka system views) have a life cycle now. I believe that it is
> > > > > > > > redundant functionality. There is no reason for enabling/disabling
> > > > > > > > lists. There is no interaction with lists on the hot path of the code flow
> > > > > > > > and there is no performance impact.
> > > > > > > > 
> > > > > > > > So lists management can be reduced to list creation and registration
> > > > > > > > operations (which execute only on node start).
> > > > > > > 
> > > > > > > 3. Code generation
> > > > > > > 
> > > > > > > > Code generation for walkers is also redundant. The number of system views
> > > > > > > > in the system is strongly limited (units, not dozens) so it is easier
> > > > > > > > to change a walker by hand than to navigate to the code generator and
> > > > > > > > run it. Moreover, first you have to add the Order annotation in the proper
> > > > > > > > place, which makes the generator practically useless.
> > > > > > > > 
> > > > > > > > If you still see a benefit in the Order annotation you can use
> > > > > > > > reflection. The motivation is simple: system views are not on the hot path and
> > > > > > > > I expect that the API for system views will not be called frequently.
> > > > > > > 
> > > > > > > 4. Export
> > > > > > > 
> > > > > > > > I really don't understand why we should export system views content
> > > > > > > > (especially periodically). The real-life use case is taking view content on
> > > > > > > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > > > > > > no need for any exporters.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > What do you think about it? Also it would be great to involve more
> > > > > > > > people in this discussion.
> > > > > > > 
> > > > > > > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> > > > > > > 
> > > > > > > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > 
> > > > > > > > Hello, Andrey.
> > > > > > > > 
> > > > > > > > > Thanks for joining the review.
> > > > > > > > 
> > > > > > > > Basic interface for objects list is `MonitoringList`. It provides the following features:
> > > > > > > >         * name.
> > > > > > > >         * description.
> > > > > > > >         * row class.
> > > > > > > >         * size.
> > > > > > > >         * iterator for the list content.
> > > > > > > >         * attribute walker (described below).
> > > > > > > > 
> > > > > > > > `MonitoringRow` is a marker interface for classes that can be used as a monitoring list content.
> > > > > > > > 
> > > > > > > > > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > > > > > > > > It adapts the content of a `ConcurrentMap`, which is widely used in Ignite internals.
> > > > > > > > > I think there will be other implementations in the follow-up PRs.
> > > > > > > > 
> > > > > > > > Public API changes:
> > > > > > > > 
> > > > > > > > > * New registry created: `ReadOnlyMonitoringListRegistry`. It provides:
> > > > > > > > >         * Access to all lists that exist in Ignite.
> > > > > > > > >         * Ability to subscribe to list creation/removal events.
> > > > > > > > 
> > > > > > > > * `MetricExporterSpi` changes:
> > > > > > > >         * `setMonitoringListRegistry` method added
> > > > > > > >         * `setMonitoringListExportFilter` method added.
> > > > > > > > 
> > > > > > > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > > > > > > > Usually, an exporter SPI iterates over `MonitoringRow` attributes.
> > > > > > > > > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > > > > > > > > This can be implemented with the Java reflection API, but I use a quicker approach.
> > > > > > > > > `MonitoringRowAttributeWalker` can visit each attribute of the MonitoringRow implementation.
> > > > > > > > > It also preserves the order provided by the MonitoringRow implementation author.
> > > > > > > > > It provides 2 main methods:
> > > > > > > > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of the attribute to the consumer.
> > > > > > > > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of the attribute to the consumer.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > > > > > > Nikolai,
> > > > > > > > > 
> > > > > > > > > I'm trying to review this PR but it is too large.
> > > > > > > > > 
> > > > > > > > > > Could you please describe the problem and the design of the implemented solution?
> > > > > > > > > > Also the javadocs for the base interfaces aren't clear, are too brief and don't
> > > > > > > > > > give any idea of the whole picture.
> > > > > > > > > > 
> > > > > > > > > > At present it is very hard to understand the purposes of the new
> > > > > > > > > > interfaces and the walker generator, and the design itself.
> > > > > > > > > 
> > > > > > > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > 
> > > > > > > > > > Hello, Igniters.
> > > > > > > > > > 
> > > > > > > > > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > > > > > > > > > Please join the review!
> > > > > > > > > > 
> > > > > > > > > > I've implemented:
> > > > > > > > > > 
> > > > > > > > > > * Monitoring list engine.
> > > > > > > > > > > * The following lists are implemented:
> > > > > > > > > >     * Cache list
> > > > > > > > > >     * Cache group list
> > > > > > > > > >     * Compute task list
> > > > > > > > > >     * Service list.
> > > > > > > > > > 
> > > > > > > > > > Engine details:
> > > > > > > > > > 
> > > > > > > > > > * `MonitoringList` added to store list data.
> > > > > > > > > > * Base interface `MonitoringRow` for list data created.
> > > > > > > > > > * Corresponding method added to `MetricExporterSpi`
> > > > > > > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > > > > > > support list export.
> > > > > > > > > > > * JMX, SQL and other column-oriented SPIs use
> > > > > > > > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > > > > > > > * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > > > > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > > > > > > > > > 
> > > > > > > > > > > I have also prepared a follow-up PR [2].
> > > > > > > > > > > The following lists are implemented:
> > > > > > > > > > 
> > > > > > > > > > * SQL tables
> > > > > > > > > > * SQL indexes
> > > > > > > > > > * SQL schemas
> > > > > > > > > > * SQL queries
> > > > > > > > > > * Continuous queries
> > > > > > > > > > * Text queries
> > > > > > > > > > * Transactions
> > > > > > > > > > * Cluster nodes
> > > > > > > > > > * Client connections(JDBC, ODBC, Thin)
> > > > > > > > > > 
> > > > > > > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > > > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > On Mon, Jun 10, 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > 
> > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > 
> > > > > > > > > > > > Since Phase 1 will be merged into master soon, I've created the ticket [1]
> > > > > > > > > > > > for Phase 2.
> > > > > > > > > > > > 
> > > > > > > > > > > > Scope of Phase 2 (copy-pasted from the ticket):
> > > > > > > > > > > > 
> > > > > > > > > > > > Ability to collect lists of internal objects that Ignite manages.
> > > > > > > > > > > Examples of such objects:
> > > > > > > > > > > 
> > > > > > > > > > >   * Caches
> > > > > > > > > > >   * Queries (including continuous queries)
> > > > > > > > > > >   * Services
> > > > > > > > > > >   * Compute tasks
> > > > > > > > > > >   * Distributed Data Structures
> > > > > > > > > > >   * etc...
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > > > > > > > > discussed in separate tickets.
> > > > > > > > > > > 2. Metric Exporters (optionally) can support list export.
> > > > > > > > > > > 
> > > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > > Ticket for IEP.Phase1 created - https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > > > > > > > 
> > > > > > > > > > > > 
> > > > > > > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > Issues to address:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > 1. Study the experience of the following libs and tools:
> > > > > > > > > > > > > >     * OpenTracing
> > > > > > > > > > > > > >     * OpenCensus
> > > > > > > > > > > > > >     * DropWizard
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 2. Support a histogram sensor: a sensor that collects values that fall into predefined segments.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for local metric storage.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > 6. Some metrics should be part of the public API and others should not (they may be changed/removed in a release without warning).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > My plan for Phase #1 is the following:
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 1. Address the issues.
> > > > > > > > > > > > > > 2. Prepare the public API.
> > > > > > > > > > > > > > 3. Prepare a PR for the monitoring subsystem + existing metrics rewritten with it.
> > > > > > > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > > > > > > 5. Collect feedback for #4.
> > > > > > > > > > > > > > 6. Design a log exposer. Consider the usage of the JFR format or some other widely used, tool-compatible format.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > > > Hello, Maxim.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > How will throughput sensor values, which require an interval for the rate calculation, be recorded?
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I answered this question in the IEP "Design principles" section:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > Sensors should contain only raw values. No aggregation of numeric metrics on Ignite side.
> > > > > > > > > > > > > > > Min, max, avg and other functions are the matter of an external monitoring system.
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > > > > > where S(t) is the sensor value in some point of time t.
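As a worked example of the formula above, an external system holding two raw samples of a counter could derive the rate as follows. `Sample` and `ratePerSecond` are hypothetical names; the only assumption is a monotonically increasing sensor read at two points in time given in milliseconds.

```java
final class ThroughputSketch {
    /** One raw reading of a sensor: value S(t) taken at time t (milliseconds). */
    static final class Sample {
        final long timeMillis;
        final long value;

        Sample(long timeMillis, long value) {
            this.timeMillis = timeMillis;
            this.value = value;
        }
    }

    /** Computes (S(t2) - S(t1)) / (t2 - t1), scaled to events per second. */
    static double ratePerSecond(Sample s1, Sample s2) {
        long dt = s2.timeMillis - s1.timeMillis;

        if (dt <= 0)
            throw new IllegalArgumentException("t2 must be greater than t1");

        return (s2.value - s1.value) * 1000.0 / dt;
    }

    public static void main(String[] args) {
        // 400 new events observed over 2 seconds.
        Sample s1 = new Sample(0, 100);
        Sample s2 = new Sample(2_000, 500);

        System.out.println(ratePerSecond(s1, s2)); // prints 200.0
    }
}
```

The node only has to expose the raw counter; the sampling interval and any smoothing stay a decision of the monitoring system.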
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > It seems that throughput calculation is the responsibility of an external system.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > It seems to me that we can add an additional parameter of `sensitivityLevel` to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > For now, I think that all sensors and lists will be very (very!) lightweight.
> > > > > > > > > > > > > > > So, we should be able to disable/enable them, for sure.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > But we should turn the whole Ignite subsystem off and on
> > > > > > > > > > > > > > > for the case when we have strong performance limitations for a particular workload.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for profiling: IEP-35 - Phase 3).
> > > > > > > > > > > > > > > For example, AFAIK we can't disable the current SQL system views (why should we?)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > > > > > > Hello Nikolay,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I've looked through your PRs changes.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Sensors
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > How will throughput sensor values, which require an
> > > > > > > > > > > > > > > > interval for the rate calculation, be recorded? Do we have such an example? For
> > > > > > > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > > > > > > > > > > out of the scope of the current PoC and IEP as they are not related to the
> > > > > > > > > > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Agreed. The module hierarchy is a matter for the next steps.
> > > > > > > > > > > > > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can discuss.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Agreed in general.
> > > > > > > > > > > > > > > > > It seems that choosing the right data to expose is a matter of separate discussion for each Ignite entity.
> > > > > > > > > > > > > > > > > I plan to file tickets for each entity so anyone interested can share their vision there.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I'm not sure that *every* sensor should have a directly associated timestamp.
> > > > > > > > > > > > > > > > > It seems we should support sensors without a timestamp, at least for current monitoring numbers.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > What use cases do you know for such sensors?
> > > > > > > > > > > > > > > > > We have plans to support fixed-size lists to show "Last N SQL queries" or similar data.
> > > > > > > > > > > > > > > > > Essentially, a sensor is just a single value with a name and a known meaning.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Sorry for that :)
> > > > > > > > > > > > > > > > > When you run 'MonitoringSelfTest' you should open http://localhost:8080/ignite/monitoring to view the exposed info.
> > > > > > > > > > > > > > > > > I provide this info in a gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I will extend this test to print results to the console in the next iterations - stay tuned :)
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > It would be better to implement it as a separate module. I can't say
> > > > > > > > > > > > > > > > > > whether it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > > > > > > > > > > > from the core module.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > > > > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > > > > > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > > > > > > > > > > > the monitored object's information to work with such data.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > > > > > > > > > > > systems aggregate data and build graphs according to the sensor's
> > > > > > > > > > > > > > > > > > timestamp.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size
> > > > > > > > > > > > > > > > > > of last N sensors, not to miss them without pushing to an external
> > > > > > > > > > > > > > > > > > monitoring system.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of
> > > > > > > > > > > > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > > > > > > > > > > > the info manually to see the completeness of sensors; this might be
> > > > > > > > > > > > > > > > > > simplified by a proper test.
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > I've prepared a Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > > > > > > > > PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > I've made the following changes:
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > >         1. `GridMonitoringManager` [2] - simple implementation of a manager to store all monitoring info
> > > > > > > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. JSON content can be viewed in gist [4]
> > > > > > > > > > > > > > > > > > >         3. Compute task start and finish are monitored in the "compute" list [5]
> > > > > > > > > > > > > > > > > > >         4. Service registrations are monitored in the "service" list - [6]
> > > > > > > > > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using `GridMonitoringManager` [7]
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in IEP [1].
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > > > > > > > > > > > > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs (described in IEP).
> > > > > > > > > > > > > > > > > > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > > Please share your thoughts.
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > > > > > > >           {
> > > > > > > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > > > > > > >             "sessionId":
> > > > > > > > > > > 
> > > > > > > > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > > > > > > >               "taskClasName":
> > > > > > > > > > > 
> > > > > > > > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > > > > > > >             },
> > > > > > > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > > > > > > >           }
> > > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > 
> > > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > > > > > > [3]
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > > > > > > [4]
> > > > > > > > > > > 
> > > > > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > > > > [5]
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > > > > > > [6]
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > > > > > > [7]
> > > > > > > > > > > 
> > > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
> I think, we should improve Ignite step by step.

From my point of view, it is still part of the current task (system views),
which is exactly one step.

>> What is part of exported data?

> Some SPI implementation(JMX, SQL view) exports both metrics and views.
> Some exports only metrics.

> It's fine to me.

> What do you think?

As I already wrote earlier, metrics and views are different things and
they should not be exported by the same SPIs. This approach allows us to
strictly specify SPI responsibility and avoids problems in reasoning
about it. One SPI exports metrics and views while another exports only
metrics. Why should I care about that? All I want is to get an SPI that exports
only the data that I actually need.

> I don't propose that.
> What I propose is to have one manager for the same entities.

But views and metrics are not the same entities.

> Please, don't exaggerate my words.

Just some hyperbole, no more.

On Mon, Sep 16, 2019 at 4:46 PM Nikolay Izhikov <ni...@apache.org> wrote:
>
> > Why? It is still part of the same task. Master branch should not see
> > intermediate changes.
>
> I don't propose any "intermediate" feature.
> We would have full feature of "system views" after merge.
>
> I think, we should improve Ignite step by step.
> Why should we postpone the merge of "system views" because you prefer the auto-compile approach over the auto-generate one?
> The product will stay the same.
> Nothing will change from the public API point of view or ongoing development complexity.
>
> It's a "nice to have" feature.
> And we will have it, shortly.
> I'll take care of it.
>
> > What is part of exported data?
>
> Some SPI implementation(JMX, SQL view) exports both metrics and views.
> Some exports only metrics.
>
> It's fine to me.
>
> What do you think?
>
> > First, views and metrics are entities from different worlds/domains.
>
> For me, they are entities of one domain.
> They reflect the current state of the node.
>
> > We can live with one manager for absolutely all entities in the system
> > but we don't do it, right? :)
>
> I don't propose that.
> What I propose is to have one manager for the same entities.
>
> Please, don't exaggerate my words.
>
> On Mon, 16/09/2019 at 16:24 +0300, Andrey Gura wrote:
> > > > Views have wider meaning than metrics.
> > > Yes! I agree, that's why I wrote 'extension' :)
> >
> > No, no, no. Wider meaning isn't equal to extension :)
> >
> > > > IMO using the same code at
> > > > runtime for view generation is better approach.
> > > OK for me.
> > > Let's do it in another ticket?
> > > I will create one.
> >
> > Why? It is still part of the same task. Master branch should not see
> > intermediate changes.
> >
> > > Seems, it's OK if some SPI implementation supports only part of exported data.
> >
> > What is part of exported data? I understand why we have to export
> > metrics but definitely have no idea why views should be exported
> > through any special SPI.
> >
> > > Are we use "lists" or "view" term? :)
> >
> > Views for our task. I mean lists in general sense.
> >
> > > > We can have single manager for metrics and views.
> > > > Why do we need one more manager in the system?
> > > > We can live without it.
> >
> > First, views and metrics are entities from different worlds/domains.
> > Second, we will have less conflicts on GridMetricManager because we
> > are still working on metrics and views concurrently.
> > We can live with one manager for absolutely all entities in the system
> > but we don't do it, right? :)
> >
> > On Mon, Sep 16, 2019 at 2:52 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > >
> > > > Views have wider meaning than metrics.
> > >
> > > Yes! I agree, that's why I wrote 'extension' :)
> > >
> > > > IMO using the same code at
> > > > runtime for view generation is better approach.
> > >
> > > OK for me.
> > > Let's do it in another ticket?
> > > I will create one.
> > >
> > > > What are the real-life use cases for exporting views?
> > > > Is there any database which exports some lists to somewhere?
> > > > Especially on push based model, not on demand.
> > >
> > > I don't know of such a DBMS.
> > > It seems OK if some SPI implementation supports only part of the exported data.
> > >
> > > Do we use the term "lists" or "views"? :)
> > >
> > > My point is:
> > >
> > > We can have single manager for metrics and views.
> > > Why do we need one more manager in the system?
> > > We can live without it.
> > >
> > > > On Mon, 16/09/2019 at 13:53 +0300, Andrey Gura wrote:
> > > > Hi,
> > > >
> > > > > > > I think also that GridMetricManager is a bad candidate for lists (system views) management.
> > > > > >
> > > > > > For me, it seems that views and metrics are extensions of one another.
> > > > > > If a user wants to know some instant values (cache put count, cache get latency), they use metrics,
> > > > > > and if they want to know the list of running SQL queries, they take a look at views.
> > > >
> > > > > Views are about system state and they answer the questions "what
> > > > > entities exist in the system (caches)?" or "what processes are being
> > > > > executed by the system (tx, queries)?"
> > > > > Metrics are about system behavior in some retrospective. They answer
> > > > > the question "how does the system behave?"
> > > >
> > > > Views have wider meaning than metrics.
> > > >
> > > > > > Code generation for walkers is also redundant.
> > > > >
> > > > > > If you don't want to, you don't have to use it.
> > > > > > I find it pretty useful during development.
> > > > >
> > > > > I'm not talking about somebody's wishes ) Moreover, if it depends on
> > > > > wishes, it can potentially lead to misuse. IMO using the same code at
> > > > > runtime for view generation is a better approach.
> > > >
> > > > > > I really don't understand why we should export system views content
> > > > > > (especially periodically). Real life use case is take view content on
> > > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > > no need any exporters.
> > > > >
> > > > > What if we want to export lists to log or via http, etc?
> > > >
> > > > > If we have a public API for views then we can use REST for access
> > > > > to this API. You can also use the public API directly. What are the real-life
> > > > > use cases for exporting views? Is there any database which
> > > > > exports some lists to somewhere? Especially on a push-based model, not
> > > > > on demand.
> > > >
> > > > On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > >
> > > > > Hello, Andrey.
> > > > >
> > > > > > I really don't like name MonitoringList. First of all because it isn't
> > > > > > about monitoring at all while can be useful for monitoring purposes.
> > > > > > We already have SQL system views and I think that system view is good
> > > > > > candidate for naming of new entity.
> > > > >
> > > > > > SystemView is OK for me.
> > > > > > I will rename the entity in the PR.
> > > > > >
> > > > > > > I think also that GridMetricManager is a bad candidate for lists (system views) management.
> > > > > >
> > > > > > For me, it seems that views and metrics are extensions of one another.
> > > > > > If a user wants to know some instant values (cache put count, cache get latency), they use metrics,
> > > > > > and if they want to know the list of running SQL queries, they take a look at views.
> > > > >
> > > > > > There is no any interaction with lists on hot path of code flow
> > > > > > and there is no any performance impact.
> > > > >
> > > > > OK, let's remove it.
> > > > >
> > > > > > Code generation for walkers is also redundant.
> > > > >
> > > > > > If you don't want it, you don't have to use it.
> > > > > > I find it pretty useful during development.
> > > > >
> > > > > > I really don't understand why we should export system views content
> > > > > > (especially periodically). Real life use case is take view content on
> > > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > > no need any exporters.
> > > > >
> > > > > What if we want to export lists to log or via http, etc?
> > > > >
> > > > > > Also it would be great to involve more people to this discussion.
> > > > >
> > > > > > Any feedback is welcome!
> > > > >
> > > > >
> > > > > > On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > > > > > Nikolay,
> > > > > >
> > > > > > thanks a lot for clarification! I added some comments to Upsource review [1].
> > > > > >
> > > > > > Here I want to discuss some high-level issues.
> > > > > >
> > > > > > 1. Naming
> > > > > >
> > > > > > "There are only two hard things in Computer Science: cache
> > > > > > invalidation and naming things."
> > > > > > -- Phil Karlton
> > > > > >
> > > > > > I really don't like name MonitoringList. First of all because it isn't
> > > > > > about monitoring at all while can be useful for monitoring purposes.
> > > > > >
> > > > > > We already have SQL system views and I think that system view is good
> > > > > > candidate for naming of new entity. As result we will have consistent
> > > > > > naming which better describes domain.
> > > > > >
> > > > > > > I also think that GridMetricManager is a bad candidate for lists (system
> > > > > > > views) management, because it isn't about metrics. Maybe a new
> > > > > > > SystemViewManager will better fit this purpose.
> > > > > >
> > > > > > 2. Management
> > > > > >
> > > > > > Lists (aka system views) have life cycle now. I believe that it is
> > > > > > redundant functionality. There is no any reason for enabling/disabling
> > > > > > lists. There is no any interaction with lists on hot path of code flow
> > > > > > and there is no any performance impact.
> > > > > >
> > > > > > So lists management can be reduced to lists creation and registration
> > > > > > operations (which executes only on node start).
> > > > > >
> > > > > > 3. Code generation
> > > > > >
> > > > > > > Code generation for walkers is also redundant. The number of system views
> > > > > > > in the system is strictly limited (units, not dozens), so it is easier
> > > > > > > to change a walker by hand than to navigate to the code generator and
> > > > > > > run it. Moreover, you first have to add the Order annotation in the proper
> > > > > > > place, and that makes the generator practically useless.
> > > > > > >
> > > > > > > If you still see a benefit in the Order annotation, you can use
> > > > > > > reflection. The motivation is simple: system views are not on the hot path,
> > > > > > > and I expect that the API for system views will not be called frequently.
> > > > > >
> > > > > > 4. Export
> > > > > >
> > > > > > I really don't understand why we should export system views content
> > > > > > (especially periodically). Real life use case is take view content on
> > > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > > no need any exporters.
> > > > > >
> > > > > >
> > > > > > What do you think about it? Also it would be great to involve more
> > > > > > people to this discussion.
> > > > > >
> > > > > > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> > > > > >
> > > > > > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > >
> > > > > > > Hello, Andrey.
> > > > > > >
> > > > > > > Thanks, for joining the review.
> > > > > > >
> > > > > > > Basic interface for objects list is `MonitoringList`. It provides the following features:
> > > > > > >         * name.
> > > > > > >         * description.
> > > > > > >         * row class.
> > > > > > >         * size.
> > > > > > >         * iterator for the list content.
> > > > > > >         * attribute walker (described below).
> > > > > > >
> > > > > > > `MonitoringRow` is a marker interface for classes that can be used as a monitoring list content.
> > > > > > >
> > > > > > > > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > > > > > > > It adapts the content of some `ConcurrentMap`, which is widely used in Ignite internals.
> > > > > > > > I think there will be other implementations in the follow-up PRs.
> > > > > > >
> > > > > > > Public API changes:
> > > > > > >
> > > > > > > > * New registry created: `ReadOnlyMonitoringListRegistry`. It provides:
> > > > > > > >         * Access to all lists that exist in Ignite.
> > > > > > > >         * The ability to subscribe to list creation/removal events.
> > > > > > >
> > > > > > > * `MetricExporterSpi` changes:
> > > > > > >         * `setMonitoringListRegistry` method added
> > > > > > >         * `setMonitoringListExportFilter` method added.
> > > > > > >
> > > > > > > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > > > > > > Usually, an exporter SPI iterates over `MonitoringRow` attributes.
> > > > > > > > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > > > > > > > This could be implemented with the Java reflection API, but I use a quicker approach.
> > > > > > > > `MonitoringRowAttributeWalker` can visit each attribute of a `MonitoringRow` implementation.
> > > > > > > > It also preserves the order provided by the `MonitoringRow` implementation author.
> > > > > > > > It provides 2 main methods:
> > > > > > > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
> > > > > > > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
> > > > > > >
> > > > > > >
> > > > > > > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > > > > > Nikolai,
> > > > > > > >
> > > > > > > > I'm trying to review this PR but it is too large.
> > > > > > > >
> > > > > > > > Could you please describe problem and design of implemented solution?
> > > > > > > > > Also, javadocs for the base interfaces aren't clear, are too brief and don't
> > > > > > > > > give any idea of the whole picture.
> > > > > > > >
> > > > > > > > At present it is very hard to understand the purposes of new
> > > > > > > > interfaces and walker generator, and design itself.
> > > > > > > >
> > > > > > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > Hello, Igniters.
> > > > > > > > >
> > > > > > > > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > > > > > > > Please, join to the review!
> > > > > > > > >
> > > > > > > > > I've implemented:
> > > > > > > > >
> > > > > > > > > * Monitoring list engine.
> > > > > > > > > * Following list implemented:
> > > > > > > > >     * Cache list
> > > > > > > > >     * Cache group list
> > > > > > > > >     * Compute task list
> > > > > > > > >     * Service list.
> > > > > > > > >
> > > > > > > > > Engine details:
> > > > > > > > >
> > > > > > > > > * `MonitoringList` added to store list data.
> > > > > > > > > * Base interface `MonitoringRow` for list data created.
> > > > > > > > > * Corresponding method added to `MetricExporterSpi`
> > > > > > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > > > > > support list export.
> > > > > > > > > > * JMX, SQL and other column-oriented SPIs use
> > > > > > > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > > > > > > * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > > > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > > > > > > > > >
> > > > > > > > > > I have also prepared a follow-up PR [2].
> > > > > > > > > Following lists implemented:
> > > > > > > > >
> > > > > > > > > * SQL tables
> > > > > > > > > * SQL indexes
> > > > > > > > > * SQL schemas
> > > > > > > > > * SQL queries
> > > > > > > > > * Continuous queries
> > > > > > > > > * Text queries
> > > > > > > > > * Transactions
> > > > > > > > > * Cluster nodes
> > > > > > > > > * Client connections(JDBC, ODBC, Thin)
> > > > > > > > >
> > > > > > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Mon, Jun 10, 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > > Hello, Igniters.
> > > > > > > > > >
> > > > > > > > > > Since Phase 1 will be merged in master soon I've created the ticket [1]
> > > > > > > > > > for Phase 2.
> > > > > > > > > >
> > > > > > > > > > Scope of Phase 2(copy-paste from the ticket)
> > > > > > > > > >
> > > > > > > > > > > Ability to collect lists of some internal objects Ignite manages.
> > > > > > > > > > Examples of such objects:
> > > > > > > > > >
> > > > > > > > > >   * Caches
> > > > > > > > > >   * Queries (including continuous queries)
> > > > > > > > > >   * Services
> > > > > > > > > >   * Compute tasks
> > > > > > > > > >   * Distributed Data Structures
> > > > > > > > > >   * etc...
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > > > > > > > discussed in separate tickets
> > > > > > > > > > 2. Metric Exporters (optionally) can support list export.
> > > > > > > > > >
> > > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > Ticket for IEP.Phase1 created - https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > >
> > > > > > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > > > > > >
> > > > > > > > > > > > Issues to address:
> > > > > > > > > > > >
> > > > > > > > > > > > > 1. Study experience of following libs, tools:
> > > > > > > > > > > > >     * OpenTracing
> > > > > > > > > > > > >     * OpenCensus
> > > > > > > > > > > > >     * DropWizard
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. Support histogram sensor: a sensor that collects values that fall into predefined segments.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > > > > > >
> > > > > > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for local metric storage.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 5. To measure the performance penalty for metrics for 5_000 caches.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 6. Some metrics should be part of public API and others are not (may be changed/removed in release without warnings).
> > > > > > > > > > > >
> > > > > > > > > > > > My plan for Phase #1 is the following:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Address the issues.
> > > > > > > > > > > > 2. Prepare public API
> > > > > > > > > > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten with it.
> > > > > > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > > > > > 5. Collect feedback for a #4.
> > > > > > > > > > > > > 6. Design a log exposer. Consider the usage of JFR format or some other widely used, tool compatible format.
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > > > Hello, Maxim.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > How will throughput sensor values, which require an interval for the rate calculations, be recorded?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I answered this question in the IEP "Design principles":
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > Sensors should contain only raw values. No aggregation of numeric metrics on Ignite side.
> > > > > > > > > > > > > > Min, max, avg and other functions are the matter of an external monitoring system.
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > >
> > > > > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > > > > where S(t) is the sensor value in some point of time t.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Seems, throughput calculation is a responsibility of an external system.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What do you think?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It seems to me that we can add an additional parameter of `sensitivityLevel` to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > For now, I think that all sensors and lists will be very(very!) lightweight.
> > > > > > > > > > > > > > So, we should be able to disable/enable them, for sure.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But, we should turn off and turn on the whole Ignite subsystem
> > > > > > > > > > > > > > for the case we have strong performance limitations for a particular workload.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for profiling: IEP-35 - Phase 3).
> > > > > > > > > > > > > > For example, AFAIK we can't disable current SQL system views (Why should we?)
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > > > > > Hello Nikolay,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I've looked through your PRs changes.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Sensors
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > How will throughput sensor values, which require an
> > > > > > > > > > > > > > > interval for the rate calculations, be recorded? Do we have such an example? For
> > > > > > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > > > > > > > > > out of the scope of current PoC and IEP as they are not related to the
> > > > > > > > > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Agreed. Module hierarchy is the essence of the next steps.
> > > > > > > > > > > > > > > > For now it is just a proof of my ideas for Ignite monitoring we can discuss.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Agreed in general.
> > > > > > > > > > > > > > > > Seems, choosing the right data to expose is a matter of separate discussion for each Ignite entity.
> > > > > > > > > > > > > > > > I've planned to file tickets for each entity so anyone interested can share his vision in it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm not sure that *every* sensor should have a directly associated timestamp.
> > > > > > > > > > > > > > > > Seems, we should support sensors without a timestamp for current monitoring numbers at least.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > What use-cases do you know for such sensors?
> > > > > > > > > > > > > > > > We have plans to support fixed size lists to show "Last N SQL queries" or similar data.
> > > > > > > > > > > > > > > > Essentially, a sensor is just a single value with the name and known meaning.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Sorry for that :)
> > > > > > > > > > > > > > > > When you run 'MonitoringSelfTest' you should open http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > > > > > > > > I provide this info in gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I will extend this test to print results to console in the next iterations - stay tuned :)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It would be better to implement it as a separate module, I can't say
> > > > > > > > > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > > > > > > > > > > from the core module.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > > > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > > > > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > > > > > > > > > > systems aggregate data and build graphs according to sensors
> > > > > > > > > > > > > > > > > timestamp.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size
> > > > > > > > > > > > > > > > > of last N sensors, not to miss them without pushing to an external
> > > > > > > > > > > > > > > > > monitoring system.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of
> > > > > > > > > > > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > > > > > > > > > > the info manually to see the completeness of sensors, this might be
> > > > > > > > > > > > > > > > > simplified by a proper test.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > > > > > > > PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I've done following changes:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >         1. `GridMonitoringManager` [2] - simple implementation of manager to store all monitoring info
> > > > > > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. JSON content can be viewed in gist [4]
> > > > > > > > > > > > > > > > > >         3. Compute task start and finish monitoring in "compute" list [5]
> > > > > > > > > > > > > > > > > >         4. Service registrations are monitored in "service" list - [6]
> > > > > > > > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using `GridMonitoringManager` [7]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in IEP [1].
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups" as a SQL System view.
> > > > > > > > > > > > > > > > > >         3. Add monitoring for all unmonitored Ignite API. (described in IEP).
> > > > > > > > > > > > > > > > > >         4. Rewrite existing jmx metrics using GridMonitoringManager.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Please, share your thoughts.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > > > > > >           {
> > > > > > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > > > > > > >             "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > > > > > > >               "taskClasName": "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > > > > > >             },
> > > > > > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > > > > > >           }
> > > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > > > > > [2]
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > > > > > [3]
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > > > > > [4]
> > > > > > > > > >
> > > > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > > > [5]
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > > > > > [6]
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > > > > > [7]
> > > > > > > > > >
> > > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
> Why? It is still part of the same task. Master branch should not see
> intermediate changes.

I'm not proposing any "intermediate" feature.
We would have the full "system views" feature after the merge.

I think we should improve Ignite step by step.
Why should we postpone the merge of "system views" just because you prefer the auto-compile approach to the auto-generate one?
The product will stay the same.
Nothing will change from the point of view of the public API or ongoing development complexity.

It's a "nice to have" feature.
And we will have it, shortly.
I'll take care of it.

> What is part of exported data?

Some SPI implementations (JMX, SQL view) export both metrics and views.
Some export only metrics.

That's fine with me.

What do you think?

> First, views and metrics are entities from different worlds/domains.

For me, they are entities of one domain.
They reflect the current state of the node.

> We can live with one manager for absolutely all entities in the system
> but we don't do it, right? :)

I'm not proposing that.
What I propose is to have one manager for entities of the same kind.

Please don't exaggerate my words.

On Mon, 16/09/2019 at 16:24 +0300, Andrey Gura wrote:
> > > Views have wider meaning than metrics.
> > Yes! I agree, that's why I wrote 'extension' :)
> 
> No, no, no. Wider meaning isn't equal to extension :)
> 
> > > IMO using the same code at
> > > runtime for view generation is better approach.
> > OK for me.
> > Let's do it in another ticket?
> > I will create one.
> 
> Why? It is still part of the same task. Master branch should not see
> intermediate changes.
> 
> > Seems, it's OK if some SPI implementation supports only part of exported data.
> 
> What is part of exported data? I understand why we have to export
> metrics but definitely have no idea why views should be exported
> through any special SPI.
> 
> > Do we use the "lists" or the "view" term? :)
> 
> Views for our task. I mean lists in general sense.
> 
> > > We can have single manager for metrics and views.
> > > Why do we need one more manager in the system?
> > > We can live without it.
> 
> First, views and metrics are entities from different worlds/domains.
> Second, we will have fewer conflicts on GridMetricManager because we
> are still working on metrics and views concurrently.
> We can live with one manager for absolutely all entities in the system
> but we don't do it, right? :)
> 
> On Mon, Sep 16, 2019 at 2:52 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > 
> > > Views have wider meaning than metrics.
> > 
> > Yes! I agree, that's why I wrote 'extension' :)
> > 
> > > IMO using the same code at
> > > runtime for view generation is better approach.
> > 
> > OK for me.
> > Let's do it in another ticket?
> > I will create one.
> > 
> > > What are the real-life use cases for exporting views?
> > > Is there any database which exports some lists somewhere?
> > > Especially on a push-based model, not on demand.
> > 
> > I don't know of any such DBMS.
> > Seems, it's OK if some SPI implementation supports only part of exported data.
> > 
> > Do we use the "lists" or the "view" term? :)
> > 
> > My point is:
> > 
> > We can have single manager for metrics and views.
> > Why do we need one more manager in the system?
> > We can live without it.
> > 
> > On Mon, 16/09/2019 at 13:53 +0300, Andrey Gura wrote:
> > > Hi,
> > > 
> > > > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> > > >
> > > > For me, it seems that views and metrics are extensions of one another.
> > > > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > > > and if one wants to know the list of running SQL queries, one takes a look at views.
> > > 
> > > Views are about system state, and they answer the questions "what
> > > entities exist in the system (caches)?" or "what processes are
> > > being executed by the system (tx, queries)?"
> > > Metrics are about system behavior in some retrospective. They answer
> > > the question "how does the system behave?"
> > > 
> > > Views have wider meaning than metrics.
> > > 
> > > > > Code generation for walkers is also redundant.
> > > > 
> > > > If you don't want it, you don't have to use it.
> > > > I find it pretty useful during development.
> > > 
> > > I'm not talking about anybody's wishes ) Moreover, if it depends on
> > > wishes, it can potentially lead to misuse. IMO using the same code at
> > > runtime for view generation is a better approach.
> > > 
> > > > > I really don't understand why we should export system views content
> > > > > (especially periodically). Real life use case is take view content on
> > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > no need any exporters.
> > > > 
> > > > What if we want to export lists to log or via http, etc?
> > > 
> > > If we have a public API for views then we can use REST for access
> > > to this API. Also you can use the public API directly. What are the
> > > real-life use cases for exporting views? Is there any database which
> > > exports some lists somewhere? Especially on a push-based model, not
> > > on demand.
> > > 
> > > On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > 
> > > > Hello, Andrey.
> > > > 
> > > > > I really don't like name MonitoringList. First of all because it isn't
> > > > > about monitoring at all while can be useful for monitoring purposes.
> > > > > We already have SQL system views and I think that system view is good
> > > > > candidate for naming of new entity.
> > > > 
> > > > SystemView is OK for me.
> > > > I will rename the entity in the PR.
> > > > 
> > > > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> > > >
> > > > For me, it seems that views and metrics are extensions of one another.
> > > > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > > > and if one wants to know the list of running SQL queries, one takes a look into views.
> > > > 
> > > > > There is no any interaction with lists on hot path of code flow
> > > > > and there is no any performance impact.
> > > > 
> > > > OK, let's remove it.
> > > > 
> > > > > Code generation for walkers is also redundant.
> > > > 
> > > > If you don't want to, you don't have to use it.
> > > > I find it pretty useful during development.
> > > > 
> > > > > I really don't understand why we should export system views content
> > > > > (especially periodically). Real life use case is take view content on
> > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > no need any exporters.
> > > > 
> > > > What if we want to export lists to log or via http, etc?
> > > > 
> > > > > Also it would be great to involve more people to this discussion.
> > > > 
> > > > Any feedback is welcome!
> > > > 
> > > > 
> > > > On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > > > > Nikolay,
> > > > > 
> > > > > thanks a lot for clarification! I added some comments to Upsource review [1].
> > > > > 
> > > > > Here I want to discuss some high-level issues.
> > > > > 
> > > > > 1. Naming
> > > > > 
> > > > > "There are only two hard things in Computer Science: cache
> > > > > invalidation and naming things."
> > > > > -- Phil Karlton
> > > > > 
> > > > > I really don't like name MonitoringList. First of all because it isn't
> > > > > about monitoring at all while can be useful for monitoring purposes.
> > > > > 
> > > > > We already have SQL system views and I think that system view is good
> > > > > candidate for naming of new entity. As result we will have consistent
> > > > > naming which better describes domain.
> > > > > 
> > > > > I also think that GridMetricManager is a bad candidate for lists (system
> > > > > views) management, because it isn't about metrics. Maybe a new
> > > > > SystemViewManager will better fit this purpose.
> > > > > 
> > > > > 2. Management
> > > > > 
> > > > > Lists (aka system views) have life cycle now. I believe that it is
> > > > > redundant functionality. There is no any reason for enabling/disabling
> > > > > lists. There is no any interaction with lists on hot path of code flow
> > > > > and there is no any performance impact.
> > > > > 
> > > > > So lists management can be reduced to lists creation and registration
> > > > > operations (which executes only on node start).
> > > > > 
> > > > > 3. Code generation
> > > > > 
> > > > > Code generation for walkers is also redundant. Amount of system views
> > > > > in the system is strongly limited (units not dozens) so it is easier
> > > > > to change walker by hand literally than navigate to code generator and
> > > > > run it. Moreover, first you should add Order annotation in the proper
> > > > > place and it make generator practically useless.
> > > > > 
> > > > > If you still see benefit that can bring Order annotation you can use
> > > > > reflection. Motivation is simple, system views are on not hot path and
> > > > > I expected that API for system views will not called frequently.
> > > > > 
> > > > > 4. Export
> > > > > 
> > > > > I really don't understand why we should export system views content
> > > > > (especially periodically). Real life use case is take view content on
> > > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > > no need any exporters.
> > > > > 
> > > > > 
> > > > > What do you think about it? Also it would be great to involve more
> > > > > people to this discussion.
> > > > > 
> > > > > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> > > > > 
> > > > > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > 
> > > > > > Hello, Andrey.
> > > > > > 
> > > > > > Thanks, for joining the review.
> > > > > > 
> > > > > > Basic interface for objects list is `MonitoringList`. It provides the following features:
> > > > > >         * name.
> > > > > >         * description.
> > > > > >         * row class.
> > > > > >         * size.
> > > > > >         * iterator for the list content.
> > > > > >         * attribute walker (described below).
> > > > > > 
> > > > > > `MonitoringRow` is a marker interface for classes that can be used as a monitoring list content.
> > > > > > 
> > > > > > Internally, there is only one implementation of `MonitoringList`, for now, `MonitoringListAdapter`.
> > > > > > It adapts the content of some `ConcurrentMap`, which is widely used in Ignite internals.
> > > > > > I think there will be other implementations in the follow-up PRs.
> > > > > > 
> > > > > > Public API changes:
> > > > > > 
> > > > > > * New registry created `ReadOnlyMonitoringListRegistry` It provides access:
> > > > > >         * To all lists that exist in the Ignite.
> > > > > >         * Ability to subscribe to the list creation/removal events.
> > > > > > 
> > > > > > * `MetricExporterSpi` changes:
> > > > > >         * `setMonitoringListRegistry` method added
> > > > > >         * `setMonitoringListExportFilter` method added.
> > > > > > 
> > > > > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > > > > Usually, exporter SPI iterates on `MonitoringRow` attributes.
> > > > > > `SqlViewExporterSpi`, `JmxMetricExporterSpi` can be taken as an example.
> > > > > > It can be implemented with the Java reflection API, but I use a faster approach.
> > > > > > `MonitoringRowAttributeWalker` can visit each attribute of the MonitoringRow implementation.
> > > > > > It also preserves the order provided by the MonitoringRow implementation author.
> > > > > > It provides 2 main methods:
> > > > > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of the some monitoring row class. Provides index, name and class of attribute to the consumer.
> > > > > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of some monitoring row instance. Provides index, name, class, value of attribute to the consumer.
> > > > > > 
> > > > > > 
> > > > > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > > > > Nikolai,
> > > > > > > 
> > > > > > > I'm trying to review this PR but it is too large.
> > > > > > > 
> > > > > > > Could you please describe problem and design of implemented solution?
> > > > > > > Also javadocs for base interfaces aren't clear, are too brief and don't
> > > > > > > give any idea of the whole picture.
> > > > > > > 
> > > > > > > At present it is very hard to understand the purposes of new
> > > > > > > interfaces and walker generator, and design itself.
> > > > > > > 
> > > > > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > 
> > > > > > > > Hello, Igniters.
> > > > > > > > 
> > > > > > > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > > > > > > Please, join to the review!
> > > > > > > > 
> > > > > > > > I've implemented:
> > > > > > > > 
> > > > > > > > * Monitoring list engine.
> > > > > > > > * Following list implemented:
> > > > > > > >     * Cache list
> > > > > > > >     * Cache group list
> > > > > > > >     * Compute task list
> > > > > > > >     * Service list.
> > > > > > > > 
> > > > > > > > Engine details:
> > > > > > > > 
> > > > > > > > * `MonitoringList` added to store list data.
> > > > > > > > * Base interface `MonitoringRow` for list data created.
> > > > > > > > * Corresponding method added to `MetricExporterSpi`
> > > > > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > > > > support list export.
> > > > > > > > * JMX, SQL and other column-oriented SPI uses
> > > > > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > > > > * Implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > > > > > > > 
> > > > > > > > I prepare follow-up PR [2], also.
> > > > > > > > Following lists implemented:
> > > > > > > > 
> > > > > > > > * SQL tables
> > > > > > > > * SQL indexes
> > > > > > > > * SQL schemas
> > > > > > > > * SQL queries
> > > > > > > > * Continuous queries
> > > > > > > > * Text queries
> > > > > > > > * Transactions
> > > > > > > > * Cluster nodes
> > > > > > > > * Client connections(JDBC, ODBC, Thin)
> > > > > > > > 
> > > > > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org>:
> > > > > > > > 
> > > > > > > > > Hello, Igniters.
> > > > > > > > > 
> > > > > > > > > Since Phase 1 will be merged in master soon I've created the ticket [1]
> > > > > > > > > for Phase 2.
> > > > > > > > > 
> > > > > > > > > Scope of Phase 2(copy-paste from the ticket)
> > > > > > > > > 
> > > > > > > > > Ability to collect lists of some internal object Ignite manage.
> > > > > > > > > Examples of such objects:
> > > > > > > > > 
> > > > > > > > >   * Caches
> > > > > > > > >   * Queries (including continuous queries)
> > > > > > > > >   * Services
> > > > > > > > >   * Compute tasks
> > > > > > > > >   * Distributed Data Structures
> > > > > > > > >   * etc...
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 1. Fields for each list(that doesn't currently exists in Ignite) will be
> > > > > > > > > discussed in separate tickets
> > > > > > > > > 2. Metric Exporters (optionally) can support list export.
> > > > > > > > > 
> > > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > Ticket for IEP.Phase1 created - https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > 
> > > > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > > > > > 
> > > > > > > > > > > Issues to address:
> > > > > > > > > > > 
> > > > > > > > > > > 1. Study experience of following libs, tools:
> > > > > > > > > > >     * OpenTracing
> > > > > > > > > > >     * OpenCensus
> > > > > > > > > > >     * DropWizard
> > > > > > > > > > >
> > > > > > > > > > > 2. Support histogram sensor: a sensor that collects values that fall into predefined segments
> > > > > > > > > > >
> > > > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > > > >
> > > > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for local metric storage.
> > > > > > > > > > >
> > > > > > > > > > > 5. To measure the performance penalty for metrics for 5_000 caches.
> > > > > > > > > > >
> > > > > > > > > > > 6. Some metrics should be part of public API and others are not (may be changed/removed in release without warnings).
> > > > > > > > > > > 
> > > > > > > > > > > My plan for Phase #1 is the following:
> > > > > > > > > > > 
> > > > > > > > > > > 1. Address the issues.
> > > > > > > > > > > 2. Prepare public API
> > > > > > > > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> > > > > > > > > 
> > > > > > > > > with it.
> > > > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > > > 5. Collect feedback for a #4.
> > > > > > > > > > > 6. Design a log exposer. Consider the usage of JFR format or some
> > > > > > > > > 
> > > > > > > > > other widely used, tool compatible format.
> > > > > > > > > > > 
> > > > > > > > > > > [1]
> > > > > > > > > 
> > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > 
> > > > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > Hello, Maxim.
> > > > > > > > > > > > 
> > > > > > > > > > > > > How will be recorded throughput sensor values which will require an interval for the rate calculations?
> > > > > > > > > > > > 
> > > > > > > > > > > > I answered to this question in IEP "Design principles":
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > Sensors should contain only raw values. No aggregation of numeric
> > > > > > > > > 
> > > > > > > > > metrics on Ignite side.
> > > > > > > > > > > > Min, max, avg and other functions are the matter of an external
> > > > > > > > > 
> > > > > > > > > monitoring system.
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > > > where S(t) is the sensor value in some point of time t.
> > > > > > > > > > > > 
> > > > > > > > > > > > Seems, throughput calculation is a responsibility of an external
> > > > > > > > > 
> > > > > > > > > system.
> > > > > > > > > > > > 
> > > > > > > > > > > > What do you think?
> > > > > > > > > > > > 
> > > > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > 
> > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control (e.g.,
> > > > > > > > > INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > 
> > > > > > > > > > > > For now, I think that all sensors and lists will be very(very!)
> > > > > > > > > 
> > > > > > > > > lightweight.
> > > > > > > > > > > > So, we should be able to disable/enable it's, for sure.
> > > > > > > > > > > > 
> > > > > > > > > > > > But, we should turn off and turn on the whole Ignite subsystem
> > > > > > > > > > > > for the case we have strong performance limitations for a particular
> > > > > > > > > 
> > > > > > > > > workload.
> > > > > > > > > > > > 
> > > > > > > > > > > > So, we have two "level" of monitoring - INFO and DEBUG(for
> > > > > > > > > 
> > > > > > > > > profiling: IEP-35 - Phase 3).
> > > > > > > > > > > > For example, AFAIK we can't disable current SQL system views(Why
> > > > > > > > > 
> > > > > > > > > should we?)
> > > > > > > > > > > > 
> > > > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > > > > Hello Nikolay,
> > > > > > > > > > > > > 
> > > > > > > > > > > > > I've looked through your PRs changes.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > > Sensors
> > > > > > > > > > > > > 
> > > > > > > > > > > > > How will be recorded throughput sensor values which will require an
> > > > > > > > > > > > > interval for the rate calculations? Do we have such an example? For
> > > > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics
> > > > > > > > > 
> > > > > > > > > are
> > > > > > > > > > > > > out of the scope of current PoC and IEP as they are not related to
> > > > > > > > > 
> > > > > > > > > the
> > > > > > > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > > > > > > > 
> > > > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor
> > > > > > > > > 
> > > > > > > > > control
> > > > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > > 
> > > > > > > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > > > > > 
> > > > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Agreed. module hierarchy is the essence of the next steps.
> > > > > > > > > > > > > > For now it just a proof of my ideas for Ignite monitoring we can
> > > > > > > > > 
> > > > > > > > > discuss.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Agreed in general.
> > > > > > > > > > > > > > Seems, choosing the right data to expose is the matter of
> > > > > > > > > 
> > > > > > > > > separate discussion for each Ignite entities.
> > > > > > > > > > > > > > I've planned to file tickets for each entity so anyone
> > > > > > > > > 
> > > > > > > > > interested can share his vision in it.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I'm not sure that *every* sensor should have directly associated
> > > > > > > > > 
> > > > > > > > > timestamp.
> > > > > > > > > > > > > > Seems, we should support sensors without timestamp for a current
> > > > > > > > > 
> > > > > > > > > monitoring numbers at least.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > What use-cases do you know for such sensors?
> > > > > > > > > > > > > > We have plans to support fixed size lists to show "Last N SQL
> > > > > > > > > 
> > > > > > > > > queries" or similar data.
> > > > > > > > > > > > > > Essentially, a sensor is just a single value with the name and
> > > > > > > > > 
> > > > > > > > > known meaning.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Sorry, for that :)
> > > > > > > > > > > > > > When you run 'MonitoringSelfTest' you should open
> > > > > > > > > 
> > > > > > > > > http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > > > > > > I provide this info in gist -
> > > > > > > > > 
> > > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > I will extend this test to print results to console in the next
> > > > > > > > > 
> > > > > > > > > iterations - stay tuned :)
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > It would be better to implement it as a separate module, I
> > > > > > > > > 
> > > > > > > > > can't say
> > > > > > > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be
> > > > > > > > > 
> > > > > > > > > detached
> > > > > > > > > > > > > > > from the core module.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using
> > > > > > > > > 
> > > > > > > > > 'ServiceConfiguration'
> > > > > > > > > > > > > > > directly as a monitored object for services. I believe we
> > > > > > > > > 
> > > > > > > > > shouldn't
> > > > > > > > > > > > > > > mix approaches. It'd be better always use some kind of
> > > > > > > > > 
> > > > > > > > > container with
> > > > > > > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually
> > > > > > > > > 
> > > > > > > > > monitoring
> > > > > > > > > > > > > > > systems aggregate data and build graphics according to sensors
> > > > > > > > > > > > > > > timestamp.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > > > > > 
> > > > > > > > > fixed size
> > > > > > > > > > > > > > > of last N sensors, not to miss them without pushing to an
> > > > > > > > > 
> > > > > > > > > external
> > > > > > > > > > > > > > > monitoring system.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > > > > > 
> > > > > > > > > work of
> > > > > > > > > > > > > > > the system. Everybody who looks to PR needs to run the test
> > > > > > > > > 
> > > > > > > > > and get
> > > > > > > > > > > > > > > the info manually to see the completeness of sensors, this
> > > > > > > > > 
> > > > > > > > > might be
> > > > > > > > > > > > > > > simplified by proper test.
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <
> > > > > > > > > 
> > > > > > > > > nizhikov@apache.org> wrote:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > > > > > PR can be found here -
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > I've done following changes:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > >         1. `GridMonitoringManager`  [2] - simple
> > > > > > > > > 
> > > > > > > > > implementation of manager to store all monitoring info
> > > > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. JSON content can be viewed in gist [4]
> > > > > > > > > > > > > > > >         3. Compute task start and finish monitoring in
> > > > > > > > > 
> > > > > > > > > "compute" list [5]
> > > > > > > > > > > > > > > >         4. Service registration are monitored in "service"
> > > > > > > > > 
> > > > > > > > > list - [6]
> > > > > > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using
> > > > > > > > > 
> > > > > > > > > `GridMonitoringManager` [7]
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Design principles, monitoring subsystem details and new
> > > > > > > > > 
> > > > > > > > > Ignite entities can be found in IEP [1].
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups"
> > > > > > > > > 
> > > > > > > > > as a SQL System view.
> > > > > > > > > > > > > > > >         3. Add monitoring for all unmonitoring Ignite API.
> > > > > > > > > 
> > > > > > > > > (described in IEP).
> > > > > > > > > > > > > > > >         4. Rewrite existing jmx metrics using
> > > > > > > > > 
> > > > > > > > > GridMonitoringManager.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Please, share you thoughts.
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > > > > >           {
> > > > > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > > > > >             "sessionId":
> > > > > > > > > 
> > > > > > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > > > > >               "taskClasName":
> > > > > > > > > 
> > > > > > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > > > > >             },
> > > > > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > > > > >           }
> > > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > 
> > > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > > > > [2]
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > > > > [3]
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > > > > [4]
> > > > > > > > > 
> > > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > > [5]
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > > > > [6]
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > > > > [7]
> > > > > > > > > 
> > > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
>> Views have wider meaning than metrics.

> Yes! I agree, that's why I wrote 'extension' :)

No, no, no. Wider meaning isn't equal to extension :)

>> IMO using the same code at
>> runtime for view generation is better approach.

> OK for me.
> Let's do it in another ticket?
> I will create one.

Why? It is still part of the same task. Master branch should not see
intermediate changes.
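
To illustrate what using the same code at runtime could mean, here is a
minimal sketch of a reflection-based walker. All names in it (`Order`,
`CacheRow`, `ReflectionWalkerSketch`, `describe`) are hypothetical and are
not taken from the PR; it only assumes that row attributes are exposed as
annotated no-arg getters.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical sketch: walks row attributes via reflection instead of generated code. */
class ReflectionWalkerSketch {
    /** Hypothetical annotation that fixes attribute order; not the PR's actual annotation. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface Order {
        int value();
    }

    /** Hypothetical row class for a cache view. */
    static class CacheRow {
        private final String name;
        private final int partitions;

        CacheRow(String name, int partitions) {
            this.name = name;
            this.partitions = partitions;
        }

        @Order(0)
        public String name() {
            return name;
        }

        @Order(1)
        public int partitions() {
            return partitions;
        }
    }

    /** Returns "index:name=value" for each annotated attribute, sorted by the Order value. */
    static List<String> describe(Object row) {
        List<Method> attrs = new ArrayList<>();

        // Collect only annotated no-arg methods declared on the row class.
        for (Method m : row.getClass().getDeclaredMethods()) {
            if (m.isAnnotationPresent(Order.class) && m.getParameterCount() == 0)
                attrs.add(m);
        }

        attrs.sort(Comparator.comparingInt(m -> m.getAnnotation(Order.class).value()));

        List<String> res = new ArrayList<>();

        try {
            for (int i = 0; i < attrs.size(); i++) {
                Method m = attrs.get(i);

                res.add(i + ":" + m.getName() + "=" + m.invoke(row));
            }
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Failed to read row attribute", e);
        }

        return res;
    }

    public static void main(String[] args) {
        for (String line : describe(new CacheRow("orders", 3)))
            System.out.println(line);
    }
}
```

Since views are not on the hot path, the reflection cost should be
acceptable here; if needed, the sorted method list could be cached per row
class.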

> Seems, it's OK if some SPI implementation supports only part of exported data.

What is "part of exported data"? I understand why we have to export
metrics, but I definitely have no idea why views should be exported
through any special SPI.

> Are we use "lists" or "view" term? :)

Views for our task. I meant lists in the general sense.

>> We can have single manager for metrics and views.
>> Why do we need one more manager in the system?
>> We can live without it.

First, views and metrics are entities from different worlds/domains.
Second, we will have fewer conflicts on GridMetricManager because we
are still working on metrics and views concurrently.
We could live with one manager for absolutely all entities in the system,
but we don't do it, right? :)
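
A separate manager reduced to view registration and on-demand reads could
look roughly like the sketch below. `SystemViewManagerSketch`,
`registerView` and `view` are hypothetical names, not existing Ignite API,
and rows are rendered as strings only to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of a manager that only registers views and serves them on demand. */
class SystemViewManagerSketch {
    /** View name -> supplier that builds the current rows when asked. */
    private final Map<String, Supplier<List<String>>> views = new ConcurrentHashMap<>();

    /** Registers a view over a live map; assumed to be called once, on node start. */
    <K, V> void registerView(String name, Map<K, V> data, Function<V, String> rowFunc) {
        views.put(name, () -> {
            List<String> rows = new ArrayList<>();

            for (V val : data.values())
                rows.add(rowFunc.apply(val));

            return rows;
        });
    }

    /** Returns the current view content; nothing is exported or pushed periodically. */
    List<String> view(String name) {
        Supplier<List<String>> view = views.get(name);

        if (view == null)
            throw new IllegalArgumentException("Unknown view: " + name);

        return view.get();
    }

    public static void main(String[] args) {
        Map<String, Integer> caches = new ConcurrentHashMap<>();
        SystemViewManagerSketch mgr = new SystemViewManagerSketch();

        mgr.registerView("caches", caches, parts -> "partitions=" + parts);

        // The view reads the live map, so later changes are visible on the next read.
        caches.put("orders", 3);

        System.out.println(mgr.view("caches"));
    }
}
```

With this shape there is no enable/disable life cycle and no exporter: SQL,
JMX or REST would simply call `view(...)` when a user asks for the content.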

On Mon, Sep 16, 2019 at 2:52 PM Nikolay Izhikov <ni...@apache.org> wrote:
>
> > Views have wider meaning than metrics.
>
> Yes! I agree, that's why I wrote 'extension' :)
>
> > IMO using the same code at
> > runtime for view generation is better approach.
>
> OK for me.
> Let's do it in another ticket?
> I will create one.
>
> > What are the real-life use cases for exporting views?
> > Is there any database which exports some lists somewhere?
> > Especially on a push-based model, not on demand.
>
> I don't know of such a DBMS.
> It seems OK if some SPI implementation supports only part of the exported data.
>
> Do we use the "lists" or the "views" term? :)
>
> My point is:
>
> We can have single manager for metrics and views.
> Why do we need one more manager in the system?
> We can live without it.
>
> On Mon, 16/09/2019 at 13:53 +0300, Andrey Gura wrote:
> > Hi,
> >
> > > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> > > For me, it seems that views and metrics are extensions of one another.
> > > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > > and if one wants to know the list of running SQL queries, one takes a look into views.
> >
> > Views are about system state, and they answer the questions "what
> > entities exist in the system (caches)?" or "what processes are
> > being executed by the system (tx, queries)?"
> > Metrics are about system behavior in some retrospective. They answer
> > the question "how does the system behave?"
> >
> > Views have wider meaning than metrics.
> >
> > > > Code generation for walkers is also redundant.
> > > If you don't want to, you don't have to use it.
> > > I find it pretty useful during development.
> >
> > I'm not talking about anybody's wishes ) Moreover, if it depends on
> > wishes, it can potentially lead to misuse. IMO using the same code at
> > runtime for view generation is a better approach.
> >
> > > > I really don't understand why we should export system views content
> > > > (especially periodically). Real life use case is take view content on
> > > > demand. So we should have public API for it, SQL API and JMX. There is
> > > > no need any exporters.
> > > What if we want to export lists to log or via http, etc?
> >
> > If we have a public API for views then we can use REST for access
> > to this API. Also you can use the public API directly. What are the
> > real-life use cases for exporting views? Is there any database which
> > exports some lists somewhere? Especially on a push-based model, not
> > on demand.
> >
> > On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > >
> > > Hello, Andrey.
> > >
> > > > I really don't like the name MonitoringList, first of all because it isn't
> > > > about monitoring at all, though it can be useful for monitoring purposes.
> > > > We already have SQL system views, and I think that "system view" is a good
> > > > candidate name for the new entity.
> > >
> > > SystemView is OK for me.
> > > I will rename the entity in the PR.
> > >
> > > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> > >
> > > For me, it seems that views and metrics are extensions of one another.
> > > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > > and if he wants to know the list of running SQL queries, he looks into views.
> > >
> > > > There is no interaction with lists on the hot path of the code flow,
> > > > and there is no performance impact.
> > >
> > > OK, let's remove it.
> > >
> > > > Code generation for walkers is also redundant.
> > >
> > > If you don't want to, you don't have to use it.
> > > I find it pretty useful during development.
> > >
> > > > I really don't understand why we should export system views content
> > > > (especially periodically). The real-life use case is to take view content on
> > > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > > no need for any exporters.
> > >
> > > What if we want to export lists to a log or via HTTP, etc.?
> > >
> > > > Also, it would be great to involve more people in this discussion.
> > >
> > > Any feedback is welcome!
> > >
> > >
> > > > On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > > > Nikolay,
> > > >
> > > > thanks a lot for clarification! I added some comments to Upsource review [1].
> > > >
> > > > Here I want to discuss some high-level issues.
> > > >
> > > > 1. Naming
> > > >
> > > > "There are only two hard things in Computer Science: cache
> > > > invalidation and naming things."
> > > > -- Phil Karlton
> > > >
> > > > > I really don't like the name MonitoringList, first of all because it isn't
> > > > > about monitoring at all, though it can be useful for monitoring purposes.
> > > > >
> > > > > We already have SQL system views, and I think that "system view" is a good
> > > > > candidate name for the new entity. As a result, we will have consistent
> > > > > naming which better describes the domain.
> > > > >
> > > > > I also think that GridMetricManager is a bad candidate for lists (system
> > > > > views) management, because it isn't about metrics. Maybe a new
> > > > > SystemViewManager will better fit this purpose.
> > > >
> > > > 2. Management
> > > >
> > > > > Lists (aka system views) have a life cycle now. I believe that this is
> > > > > redundant functionality. There is no reason for enabling/disabling
> > > > > lists. There is no interaction with lists on the hot path of the code flow,
> > > > > and there is no performance impact.
> > > > >
> > > > > So list management can be reduced to list creation and registration
> > > > > operations (which are executed only on node start).
> > > >
> > > > 3. Code generation
> > > >
> > > > > Code generation for walkers is also redundant. The number of system views
> > > > > in the system is strongly limited (units, not dozens), so it is easier
> > > > > to change a walker by hand than to navigate to the code generator and
> > > > > run it. Moreover, you first have to add the Order annotation in the proper
> > > > > place, which makes the generator practically useless.
> > > > >
> > > > > If you still see a benefit in the Order annotation, you can use
> > > > > reflection. The motivation is simple: system views are not on the hot path, and
> > > > > I expect that the API for system views will not be called frequently.
> > > >
> > > > 4. Export
> > > >
> > > > > I really don't understand why we should export system views content
> > > > > (especially periodically). The real-life use case is to take view content on
> > > > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > > > no need for any exporters.
> > > > >
> > > > >
> > > > > What do you think about it? Also, it would be great to involve more
> > > > > people in this discussion.
> > > >
> > > > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> > > >
> > > > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > >
> > > > > Hello, Andrey.
> > > > >
> > > > > > Thanks for joining the review.
> > > > > >
> > > > > > The basic interface for an object list is `MonitoringList`. It provides the following features:
> > > > >         * name.
> > > > >         * description.
> > > > >         * row class.
> > > > >         * size.
> > > > >         * iterator for the list content.
> > > > >         * attribute walker (described below).
> > > > >
> > > > > > `MonitoringRow` is a marker interface for classes that can be used as monitoring list content.
> > > > > >
> > > > > > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > > > > > It adapts the content of a `ConcurrentMap`, which is widely used in Ignite internals.
> > > > > > I think there will be other implementations in the follow-up PRs.
> > > > >
> > > > > Public API changes:
> > > > >
> > > > > > * New registry `ReadOnlyMonitoringListRegistry` created. It provides:
> > > > > >         * Access to all lists that exist in Ignite.
> > > > > >         * The ability to subscribe to list creation/removal events.
> > > > >
> > > > > * `MetricExporterSpi` changes:
> > > > >         * `setMonitoringListRegistry` method added
> > > > >         * `setMonitoringListExportFilter` method added.
> > > > >
> > > > > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > > > > Usually, an exporter SPI iterates over `MonitoringRow` attributes.
> > > > > > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > > > > > It can be implemented with the Java reflection API, but I use a quicker approach.
> > > > > > `MonitoringRowAttributeWalker` can visit each attribute of the `MonitoringRow` implementation.
> > > > > > It also preserves the order provided by the `MonitoringRow` implementation author.
> > > > > > It provides 2 main methods:
> > > > > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
> > > > > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
> > > > >
> > > > >
> > > > > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > > > > Nikolai,
> > > > > > >
> > > > > > > I'm trying to review this PR but it is too large.
> > > > > > >
> > > > > > > Could you please describe the problem and the design of the implemented solution?
> > > > > > > Also, the javadocs for the base interfaces aren't clear; they are too brief and don't
> > > > > > > give any idea of the whole picture.
> > > > > > >
> > > > > > > At present it is very hard to understand the purposes of the new
> > > > > > > interfaces and walker generator, and the design itself.
> > > > > >
> > > > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > >
> > > > > > > Hello, Igniters.
> > > > > > >
> > > > > > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > > > > > > Please join the review!
> > > > > > >
> > > > > > > I've implemented:
> > > > > > >
> > > > > > > * Monitoring list engine.
> > > > > > > > * The following lists are implemented:
> > > > > > >     * Cache list
> > > > > > >     * Cache group list
> > > > > > >     * Compute task list
> > > > > > >     * Service list.
> > > > > > >
> > > > > > > Engine details:
> > > > > > >
> > > > > > > * `MonitoringList` added to store list data.
> > > > > > > * Base interface `MonitoringRow` for list data created.
> > > > > > > * Corresponding method added to `MetricExporterSpi`
> > > > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > > > support list export.
> > > > > > > > * JMX, SQL and other column-oriented SPIs use
> > > > > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > > > > * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`.
> > > > > > > >
> > > > > > > > I have also prepared a follow-up PR [2].
> > > > > > > > The following lists are implemented:
> > > > > > >
> > > > > > > * SQL tables
> > > > > > > * SQL indexes
> > > > > > > * SQL schemas
> > > > > > > * SQL queries
> > > > > > > * Continuous queries
> > > > > > > * Text queries
> > > > > > > * Transactions
> > > > > > > * Cluster nodes
> > > > > > > * Client connections(JDBC, ODBC, Thin)
> > > > > > >
> > > > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > > Mon, Jun 10, 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > >
> > > > > > > > Hello, Igniters.
> > > > > > > >
> > > > > > > > > Since Phase 1 will be merged into master soon, I've created the ticket [1]
> > > > > > > > > for Phase 2.
> > > > > > > > >
> > > > > > > > > Scope of Phase 2 (copy-paste from the ticket):
> > > > > > > > >
> > > > > > > > > Ability to collect lists of some internal objects Ignite manages.
> > > > > > > > Examples of such objects:
> > > > > > > >
> > > > > > > >   * Caches
> > > > > > > >   * Queries (including continuous queries)
> > > > > > > >   * Services
> > > > > > > >   * Compute tasks
> > > > > > > >   * Distributed Data Structures
> > > > > > > >   * etc...
> > > > > > > >
> > > > > > > >
> > > > > > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > > > > > discussed in separate tickets.
> > > > > > > > 2. Metric Exporters (optionally) can support list export.
> > > > > > > >
> > > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > > > >
> > > > > > > >
> > > > > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > Ticket for IEP.Phase1 created - https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > >
> > > > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > > > >
> > > > > > > > > > Issues to address:
> > > > > > > > > >
> > > > > > > > > > > 1. Study the experience of the following libs and tools:
> > > > > > > > > > >     * OpenTracing
> > > > > > > > > > >     * OpenCensus
> > > > > > > > > > >     * DropWizard
> > > > > > > > > > >
> > > > > > > > > > > 2. Support a histogram sensor: a sensor that collects values that fall into predefined segments.
> > > > > > > > > > >
> > > > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > > > >
> > > > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for local metric storage.
> > > > > > > > > > >
> > > > > > > > > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > > > > > > > > >
> > > > > > > > > > > 6. Some metrics should be part of the public API and others should not (they may be changed/removed in a release without warning).
> > > > > > > > > >
> > > > > > > > > > My plan for Phase #1 is the following:
> > > > > > > > > >
> > > > > > > > > > 1. Address the issues.
> > > > > > > > > > 2. Prepare public API
> > > > > > > > > > > 3. Prepare a PR for the monitoring subsystem + existing metrics rewritten with it.
> > > > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > > > 5. Collect feedback for #4.
> > > > > > > > > > > 6. Design a log exposer. Consider the usage of the JFR format or some other widely used, tool-compatible format.
> > > > > > > > > > >
> > > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > >
> > > > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > > Hello, Maxim.
> > > > > > > > > > > >
> > > > > > > > > > > > > How will throughput sensor values, which require an interval for the rate calculations, be recorded?
> > > > > > > > > > > >
> > > > > > > > > > > > I answered this question in the IEP "Design principles" section:
> > > > > > > > > > > >
> > > > > > > > > > > > ```
> > > > > > > > > > > > Sensors should contain only raw values. No aggregation of numeric metrics on Ignite side.
> > > > > > > > > > > > Min, max, avg and other functions are the matter of an external monitoring system.
> > > > > > > > > > > > ```
> > > > > > > > > > > >
> > > > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > > > where S(t) is the sensor value at some point in time t.
> > > > > > > > > > > >
> > > > > > > > > > > > It seems that throughput calculation is the responsibility of an external system.
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > >
> > > > > > > > > > > > > It seems to me that we can add an additional parameter of `sensitivityLevel` to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > >
> > > > > > > > > > > > For now, I think that all sensors and lists will be very (very!) lightweight.
> > > > > > > > > > > > So, we should be able to disable/enable them, for sure.
> > > > > > > > > > > >
> > > > > > > > > > > > But we should turn the whole Ignite subsystem off and on
> > > > > > > > > > > > for the case where we have strong performance limitations for a particular workload.
> > > > > > > > > > > >
> > > > > > > > > > > > So, we have two "levels" of monitoring: INFO and DEBUG (for profiling: IEP-35 - Phase 3).
> > > > > > > > > > > > For example, AFAIK we can't disable the current SQL system views (why should we?)
> > > > > > > > > > >
> > > > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > > > > Hello Nikolay,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've looked through your PRs' changes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Sensors
> > > > > > > > > > > > >
> > > > > > > > > > > > > How will throughput sensor values, which require an
> > > > > > > > > > > > > interval for the rate calculations, be recorded? Do we have such an example? For
> > > > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > > > > > > > out of the scope of the current PoC and IEP as they are not related to the
> > > > > > > > > > > > > user metrics, but they are a good example of a particular metric type.
> > > > > > > > > > > > >
> > > > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > > >
> > > > > > > > > > > > > It also seems that for the sensors' getValue() the completely
> > > > > > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Agreed. Module hierarchy is the essence of the next steps.
> > > > > > > > > > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can discuss.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Agreed in general.
> > > > > > > > > > > > > > It seems that choosing the right data to expose is a matter of separate discussion for each Ignite entity.
> > > > > > > > > > > > > > I've planned to file tickets for each entity so anyone interested can share his vision in it.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm not sure that *every* sensor should have a directly associated timestamp.
> > > > > > > > > > > > > > It seems we should support sensors without a timestamp, at least for current monitoring numbers.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > What use cases do you know for such sensors?
> > > > > > > > > > > > > > We have plans to support fixed-size lists to show "Last N SQL queries" or similar data.
> > > > > > > > > > > > > > Essentially, a sensor is just a single value with a name and a known meaning.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Sorry for that :)
> > > > > > > > > > > > > > When you run 'MonitoringSelfTest' you should open http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > > > > > > I provide this info in a gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I will extend this test to print results to the console in the next iterations - stay tuned :)
> > > > > > > > > > > > >
> > > > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It would be better to implement it as a separate module. I can't say
> > > > > > > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > > > > > > > > from the core module.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > > > > > > > > systems aggregate data and build graphs according to the sensors'
> > > > > > > > > > > > > > > timestamps.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size
> > > > > > > > > > > > > > > of last N sensors, not to miss them without pushing to an external
> > > > > > > > > > > > > > > monitoring system.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the work of
> > > > > > > > > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > > > > > > > > the info manually to see the completeness of sensors; this might be
> > > > > > > > > > > > > > > simplified by a proper test.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've prepared a Proof of Concept for IEP-35 [1].
> > > > > > > > > > > > > > > > The PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I've made the following changes:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >         1. `GridMonitoringManager` [2] - simple implementation of manager to store all monitoring info
> > > > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. JSON content can be viewed in gist [4]
> > > > > > > > > > > > > > > >         3. Compute task start and finish monitoring in "compute" list [5]
> > > > > > > > > > > > > > > >         4. Service registrations are monitored in "service" list [6]
> > > > > > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using `GridMonitoringManager` [7]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in IEP [1].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > > > > > > > > > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs (described in IEP).
> > > > > > > > > > > > > > > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Please share your thoughts.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > > > >           {
> > > > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > > > >             "sessionId":
> > > > > > > >
> > > > > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > > > >               "taskClasName":
> > > > > > > >
> > > > > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > > > >             },
> > > > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > > > >           }
> > > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1]
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > > > [2]
> > > > > > > >
> > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > > > [3]
> > > > > > > >
> > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > > > [4]
> > > > > > > >
> > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > > [5]
> > > > > > > >
> > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > > > [6]
> > > > > > > >
> > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > > > [7]
> > > > > > > >
> > > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
> Views have wider meaning than metrics.

Yes! I agree, that's why I wrote 'extension' :)

> IMO using the same code at
> runtime for view generation is a better approach.

OK for me.
Let's do it in another ticket? 
I will create one.
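
A minimal sketch of what the runtime (reflection-based) variant could look like, only to make the idea concrete. It assumes a hypothetical `@Order` annotation placed on the getter methods of a row class and a hypothetical `AttributeWithValueVisitor` callback; the class names and signatures here are illustrative and are not taken from the PR.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustration only: a reflection-based walker over row attributes. */
final class ReflectionWalkerSketch {
    /** Hypothetical annotation that fixes the attribute order of a row class. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    public @interface Order {
        int value();
    }

    /** Hypothetical visitor: receives index, name, type and value of each attribute. */
    public interface AttributeWithValueVisitor {
        void accept(int idx, String name, Class<?> type, Object val);
    }

    /** Example row class, standing in for a real view row. */
    public static class CacheRow {
        private final String name;
        private final int grpId;

        public CacheRow(String name, int grpId) {
            this.name = name;
            this.grpId = grpId;
        }

        @Order(0) public String cacheName() { return name; }
        @Order(1) public int cacheGroupId() { return grpId; }
    }

    /** Visits every no-arg method annotated with @Order, sorted by the annotation value. */
    public static void visitAll(Object row, AttributeWithValueVisitor visitor) {
        List<Method> attrs = new ArrayList<>();

        for (Method m : row.getClass().getMethods()) {
            if (m.isAnnotationPresent(Order.class) && m.getParameterCount() == 0)
                attrs.add(m);
        }

        attrs.sort(Comparator.comparingInt((Method m) -> m.getAnnotation(Order.class).value()));

        for (int i = 0; i < attrs.size(); i++) {
            Method m = attrs.get(i);

            try {
                visitor.accept(i, m.getName(), m.getReturnType(), m.invoke(row));
            }
            catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Failed to read attribute " + m.getName(), e);
            }
        }
    }

    /** Small helper for the usage example: renders a row as "index:name=value" strings. */
    public static List<String> describe(Object row) {
        List<String> out = new ArrayList<>();

        visitAll(row, (idx, name, type, val) -> out.add(idx + ":" + name + "=" + val));

        return out;
    }
}
```

With this shape, `ReflectionWalkerSketch.describe(new ReflectionWalkerSketch.CacheRow("default", 7))` yields the attributes in the annotated order. The cost is a reflective call per attribute, which should be acceptable if views are read on demand and not on the hot path.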

> What are the real-life use cases for exporting views?
> Is there any database which exports some lists to somewhere?
> Especially with a push-based model, not on demand.

I don't know of any such DBMS.
It seems OK if some SPI implementation supports only part of the exported data.

Should we use the term "lists" or "views"? :)

My point is: 

We can have a single manager for metrics and views.
Why do we need one more manager in the system?
We can live without it.
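
To illustrate the point, here is a rough sketch of one manager holding both kinds of data. All names (`MonitoringManagerSketch`, `registerMetric`, `registerView`) are hypothetical and do not mirror `GridMetricManager`; metrics are modelled as `LongSupplier`s and views as suppliers of row collections, which is an assumption made only for this example.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustration only: a single registry for both metrics and views. */
final class MonitoringManagerSketch {
    /** Metrics: name -> supplier of the current raw value. */
    private final Map<String, LongSupplier> metrics = new ConcurrentHashMap<>();

    /** Views: name -> supplier of the current rows. */
    private final Map<String, Supplier<? extends Collection<?>>> views = new ConcurrentHashMap<>();

    public void registerMetric(String name, LongSupplier val) {
        metrics.put(name, val);
    }

    public void registerView(String name, Supplier<? extends Collection<?>> rows) {
        views.put(name, rows);
    }

    /** Returns the instant value of a metric. */
    public long metric(String name) {
        LongSupplier s = metrics.get(name);

        if (s == null)
            throw new IllegalArgumentException("Unknown metric: " + name);

        return s.getAsLong();
    }

    /** Returns a read-only view over the current rows, taken on demand. */
    public Collection<?> view(String name) {
        Supplier<? extends Collection<?>> s = views.get(name);

        if (s == null)
            throw new IllegalArgumentException("Unknown view: " + name);

        return Collections.unmodifiableCollection(s.get());
    }
}
```

A caller would register, say, `mgr.registerMetric("cache.puts", counter::get)` and `mgr.registerView("caches", cacheMap::values)`, then read both through the same object. Whether the two registries belong in one class is exactly the open question in this thread; the sketch only shows that it is mechanically simple.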

On Mon, 16/09/2019 at 13:53 +0300, Andrey Gura wrote:
> Hi,
> 
> > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> > For me, it seems that views and metrics are extensions of one another.
> > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > and if he wants to know the list of running SQL queries, he looks into views.
> 
> Views are about system state, and they answer the questions "what
> entities exist in the system (caches)?" or "what processes are
> being executed by the system (tx, queries)?"
> Metrics are about system behavior in some retrospective. They answer
> the question "how does the system behave?"
> 
> Views have wider meaning than metrics.
> 
> > > Code generation for walkers is also redundant.
> > If you don't want to, you don't have to use it.
> > I find it pretty useful during development.
> 
> I'm not talking about anybody's wishes ) Moreover, if it depends on
> wishes, it can potentially lead to misuse. IMO using the same code at
> runtime for view generation is a better approach.
> 
> > > I really don't understand why we should export system views content
> > > (especially periodically). The real-life use case is to take view content on
> > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > no need for any exporters.
> > What if we want to export lists to a log or via HTTP, etc.?
> 
> If we have a public API for views, then we can use REST to access
> this API. You can also use the public API directly. What are the real-life
> use cases for exporting views? Is there any database which
> exports some lists to somewhere? Especially with a push-based model, not
> on demand.
> 
> On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > 
> > Hello, Andrey.
> > 
> > > I really don't like the name MonitoringList, first of all because it isn't
> > > about monitoring at all, though it can be useful for monitoring purposes.
> > > We already have SQL system views, and I think that "system view" is a good
> > > candidate name for the new entity.
> >
> > SystemView is OK for me.
> > I will rename the entity in the PR.
> > 
> > > I also think that GridMetricManager is a bad candidate for lists (system views) management.
> >
> > For me, it seems that views and metrics are extensions of one another.
> > If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> > and if he wants to know the list of running SQL queries, he looks into views.
> > 
> > > There is no interaction with lists on the hot path of the code flow,
> > > and there is no performance impact.
> > 
> > OK, let's remove it.
> > 
> > > Code generation for walkers is also redundant.
> > 
> > If you don't want to, you don't have to use it.
> > I find it pretty useful during development.
> > 
> > > I really don't understand why we should export system views content
> > > (especially periodically). The real-life use case is to take view content on
> > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > no need for any exporters.
> >
> > What if we want to export lists to a log or via HTTP, etc.?
> >
> > > Also, it would be great to involve more people in this discussion.
> >
> > Any feedback is welcome!
> > 
> > 
> > On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > > Nikolay,
> > > 
> > > thanks a lot for clarification! I added some comments to Upsource review [1].
> > > 
> > > Here I want to discuss some high-level issues.
> > > 
> > > 1. Naming
> > > 
> > > "There are only two hard things in Computer Science: cache
> > > invalidation and naming things."
> > > -- Phil Karlton
> > > 
> > > I really don't like the name MonitoringList, first of all because it isn't
> > > about monitoring at all, though it can be useful for monitoring purposes.
> > >
> > > We already have SQL system views, and I think that "system view" is a good
> > > candidate name for the new entity. As a result, we will have consistent
> > > naming which better describes the domain.
> > >
> > > I also think that GridMetricManager is a bad candidate for lists (system
> > > views) management, because it isn't about metrics. Maybe a new
> > > SystemViewManager will better fit this purpose.
> > > 
> > > 2. Management
> > > 
> > > Lists (aka system views) have a life cycle now. I believe that this is
> > > redundant functionality. There is no reason for enabling/disabling
> > > lists. There is no interaction with lists on the hot path of the code flow,
> > > and there is no performance impact.
> > >
> > > So list management can be reduced to list creation and registration
> > > operations (which are executed only on node start).
> > > 
> > > 3. Code generation
> > > 
> > > Code generation for walkers is also redundant. The number of system views
> > > in the system is strongly limited (units, not dozens), so it is easier
> > > to change a walker by hand than to navigate to the code generator and
> > > run it. Moreover, you first have to add the Order annotation in the proper
> > > place, which makes the generator practically useless.
> > >
> > > If you still see a benefit in the Order annotation, you can use
> > > reflection. The motivation is simple: system views are not on the hot path, and
> > > I expect that the API for system views will not be called frequently.
> > > 
> > > 4. Export
> > > 
> > > I really don't understand why we should export system views content
> > > (especially periodically). The real-life use case is to take view content on
> > > demand. So we should have a public API for it, an SQL API and JMX. There is
> > > no need for any exporters.
> > >
> > >
> > > What do you think about it? Also, it would be great to involve more
> > > people in this discussion.
> > > 
> > > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> > > 
> > > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > 
> > > > Hello, Andrey.
> > > > 
> > > > > Thanks for joining the review.
> > > > >
> > > > > The basic interface for an object list is `MonitoringList`. It provides the following:
> > > >         * name.
> > > >         * description.
> > > >         * row class.
> > > >         * size.
> > > >         * iterator for the list content.
> > > >         * attribute walker (described below).
> > > > 
> > > > > `MonitoringRow` is a marker interface for classes that can be used as monitoring list content.
> > > > >
> > > > > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > > > > It adapts the content of some `ConcurrentMap`, which is widely used in Ignite internals.
> > > > > I think there will be another implementation in the follow-up PRs.
> > > > 
> > > > Public API changes:
> > > > 
> > > > > * New registry created: `ReadOnlyMonitoringListRegistry`. It provides:
> > > > >         * Access to all lists that exist in Ignite.
> > > > >         * The ability to subscribe to list creation/removal events.
> > > > 
> > > > * `MetricExporterSpi` changes:
> > > >         * `setMonitoringListRegistry` method added
> > > >         * `setMonitoringListExportFilter` method added.
> > > > 
> > > > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > > > Usually, an exporter SPI iterates over `MonitoringRow` attributes.
> > > > > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > > > > This can be implemented with the Java reflection API, but I use a faster approach.
> > > > > `MonitoringRowAttributeWalker` can visit each attribute of a `MonitoringRow` implementation.
> > > > > It also preserves the order provided by the author of the `MonitoringRow` implementation.
> > > > > It provides 2 main methods:
> > > > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
> > > > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
> > > > 
> > > > 
> > > > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > > Nikolai,
> > > > > 
> > > > > I'm trying to review this PR but it is too large.
> > > > > 
> > > > > > Could you please describe the problem and the design of the implemented solution?
> > > > > > Also, the javadocs for the base interfaces aren't clear: they are too brief and
> > > > > > don't give any idea of the whole picture.
> > > > > >
> > > > > > At present it is very hard to understand the purpose of the new
> > > > > > interfaces and the walker generator, and the design itself.
> > > > > 
> > > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > 
> > > > > > Hello, Igniters.
> > > > > > 
> > > > > > > IEP-35. Monitoring&Profiling. Phase 2 is ready [1]
> > > > > > > Please join the review!
> > > > > > >
> > > > > > > I've implemented:
> > > > > > >
> > > > > > > * Monitoring list engine.
> > > > > > > * The following lists:
> > > > > >     * Cache list
> > > > > >     * Cache group list
> > > > > >     * Compute task list
> > > > > >     * Service list.
> > > > > > 
> > > > > > Engine details:
> > > > > > 
> > > > > > > * `MonitoringList` added to store list data.
> > > > > > > * Base interface `MonitoringRow` for list data created.
> > > > > > > * Corresponding method added to `MetricExporterSpi`.
> > > > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > > > support list export.
> > > > > > > * JMX, SQL and other column-oriented SPIs use
> > > > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > > > * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`.
> > > > > > >
> > > > > > > I have also prepared a follow-up PR [2].
> > > > > > > The following lists are implemented there:
> > > > > > 
> > > > > > * SQL tables
> > > > > > * SQL indexes
> > > > > > * SQL schemas
> > > > > > * SQL queries
> > > > > > * Continuous queries
> > > > > > * Text queries
> > > > > > * Transactions
> > > > > > * Cluster nodes
> > > > > > * Client connections(JDBC, ODBC, Thin)
> > > > > > 
> > > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > > On Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > 
> > > > > > > Hello, Igniters.
> > > > > > > 
> > > > > > > > Since Phase 1 will be merged into master soon, I've created the ticket [1]
> > > > > > > > for Phase 2.
> > > > > > > >
> > > > > > > > Scope of Phase 2 (copy-paste from the ticket):
> > > > > > > >
> > > > > > > > Ability to collect lists of some internal objects that Ignite manages.
> > > > > > > > Examples of such objects:
> > > > > > > 
> > > > > > >   * Caches
> > > > > > >   * Queries (including continuous queries)
> > > > > > >   * Services
> > > > > > >   * Compute tasks
> > > > > > >   * Distributed Data Structures
> > > > > > >   * etc...
> > > > > > > 
> > > > > > > 
> > > > > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > > > > discussed in separate tickets.
> > > > > > > > 2. Metric exporters can (optionally) support list export.
> > > > > > > 
> > > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > > > 
> > > > > > > 
> > > > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > > > Ticket for IEP.Phase1 created -
> > > > > > > > > https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > Hello, Igniters.
> > > > > > > > > >
> > > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> > > > > > > > > > Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > > > >
> > > > > > > > > > Issues to address:
> > > > > > > > > >
> > > > > > > > > > 1. Study the experience of the following libs and tools:
> > > > > > > > > >     * OpenTracing
> > > > > > > > > >     * OpenCensus
> > > > > > > > > >     * DropWizard
> > > > > > > > > >
> > > > > > > > > > 2. Support a histogram sensor: a sensor that collects values that fall
> > > > > > > > > > into predefined segments.
> > > > > > > > > >
> > > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > > >
> > > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for
> > > > > > > > > > local metric storage.
> > > > > > > > > >
> > > > > > > > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > > > > > > > >
> > > > > > > > > > 6. Some metrics should be part of the public API and others should not
> > > > > > > > > > (they may be changed/removed in a release without warning).
> > > > > > > > > >
> > > > > > > > > > My plan for Phase #1 is the following:
> > > > > > > > > >
> > > > > > > > > > 1. Address the issues.
> > > > > > > > > > 2. Prepare the public API.
> > > > > > > > > > 3. Prepare a PR for the monitoring subsystem + existing metrics rewritten
> > > > > > > > > > with it.
> > > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > > 5. Collect feedback for #4.
> > > > > > > > > > 6. Design a log exposer. Consider the usage of the JFR format or some
> > > > > > > > > > other widely used, tool-compatible format.
> > > > > > > > > >
> > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > 
> > > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > > > Hello, Maxim.
> > > > > > > > > > >
> > > > > > > > > > > > How will throughput sensor values, which require an interval
> > > > > > > > > > > > for the rate calculation, be recorded?
> > > > > > > > > > >
> > > > > > > > > > > I answered this question in the IEP "Design principles":
> > > > > > > > > > >
> > > > > > > > > > > ```
> > > > > > > > > > > Sensors should contain only raw values. No aggregation of numeric
> > > > > > > > > > > metrics on Ignite side.
> > > > > > > > > > > Min, max, avg and other functions are the matter of an external
> > > > > > > > > > > monitoring system.
> > > > > > > > > > > ```
> > > > > > > > > > >
> > > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > > where S(t) is the sensor value at some point in time t.
> > > > > > > > > > >
> > > > > > > > > > > It seems that throughput calculation is the responsibility of an
> > > > > > > > > > > external system.
> > > > > > > > > > >
> > > > > > > > > > > What do you think?
> > > > > > > > > > >
> > > > > > > > > > > > It seems to me that we can add an additional `sensitivityLevel`
> > > > > > > > > > > > parameter to give the user flexible sensor control (e.g.,
> > > > > > > > > > > > INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > >
> > > > > > > > > > > For now, I think that all sensors and lists will be very (very!)
> > > > > > > > > > > lightweight.
> > > > > > > > > > > So, we should be able to disable/enable them, for sure.
> > > > > > > > > > >
> > > > > > > > > > > But we should be able to turn the whole Ignite subsystem off and on
> > > > > > > > > > > for the case where we have strong performance limitations for a
> > > > > > > > > > > particular workload.
> > > > > > > > > > >
> > > > > > > > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for
> > > > > > > > > > > profiling: IEP-35 - Phase 3).
> > > > > > > > > > > For example, AFAIK we can't disable the current SQL system views (why
> > > > > > > > > > > should we?)
> > > > > > > > > > 
> > > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > > > Hello Nikolay,
> > > > > > > > > > > >
> > > > > > > > > > > > I've looked through the changes in your PR.
> > > > > > > > > > > >
> > > > > > > > > > > > > Sensors
> > > > > > > > > > > >
> > > > > > > > > > > > How will throughput sensor values, which require an interval for the
> > > > > > > > > > > > rate calculation, be recorded? Do we have such an example? For
> > > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > > > > > > out of the scope of the current PoC and IEP as they are not related to
> > > > > > > > > > > > the user metrics, but they are a good example of a particular metric type.
> > > > > > > > > > > >
> > > > > > > > > > > > It seems to me that we can add an additional `sensitivityLevel`
> > > > > > > > > > > > parameter to give the user flexible sensor control
> > > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > > > >
> > > > > > > > > > > > It also seems that a completely functional Java approach can be
> > > > > > > > > > > > used for the sensors' getValue(). Am I right?
> > > > > > > > > > > 
> > > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > > > >
> > > > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from
> > > > > > > > > > > > > > the core module.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Agreed. The module hierarchy is a subject for the next steps.
> > > > > > > > > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can
> > > > > > > > > > > > > discuss.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > > > > > > directly as a monitored object for services
> > > > > > > > > > > > >
> > > > > > > > > > > > > Agreed in general.
> > > > > > > > > > > > > It seems that choosing the right data to expose is a matter of
> > > > > > > > > > > > > separate discussion for each Ignite entity.
> > > > > > > > > > > > > I've planned to file tickets for each entity so anyone
> > > > > > > > > > > > > interested can share their vision there.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm not sure that *every* sensor should have a directly associated
> > > > > > > > > > > > > timestamp.
> > > > > > > > > > > > > It seems we should support sensors without a timestamp, at least for
> > > > > > > > > > > > > current monitoring numbers.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > > > > > > > > > > fixed size of last N sensors
> > > > > > > > > > > > >
> > > > > > > > > > > > > What use cases do you know for such sensors?
> > > > > > > > > > > > > We have plans to support fixed-size lists to show "Last N SQL
> > > > > > > > > > > > > queries" or similar data.
> > > > > > > > > > > > > Essentially, a sensor is just a single value with a name and
> > > > > > > > > > > > > a known meaning.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > > > > > > > > > > work of the system.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Sorry for that :)
> > > > > > > > > > > > > When you run 'MonitoringSelfTest' you should open
> > > > > > > > > > > > > http://localhost:8080/ignite/monitoring to view the exposed info.
> > > > > > > > > > > > > I provide this info in a gist -
> > > > > > > > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > >
> > > > > > > > > > > > > I will extend this test to print results to the console in the next
> > > > > > > > > > > > > iterations - stay tuned :)
> > > > > > > > > > > > 
> > > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I looked through the PR and the IEP, and I have some comments:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It would be better to implement it as a separate module. I can't say
> > > > > > > > > > > > > > whether that is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > > > > > > > from the core module.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I like your approach with a 'wrapper' for monitored objects, like
> > > > > > > > > > > > > > 'ComputeTaskInfo' in the PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > > > > > > > the monitored object's information to work with such data.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > > > > > > > systems aggregate data and build graphs according to the sensors'
> > > > > > > > > > > > > > timestamps.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Also, it'd be great to have the ability to store a fixed-size list
> > > > > > > > > > > > > > of the last N sensors, so as not to miss them without pushing to an
> > > > > > > > > > > > > > external monitoring system.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It'd be great if you provided a more extended test to show the work of
> > > > > > > > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > > > > > > > the info manually to see the completeness of the sensors; this might be
> > > > > > > > > > > > > > simplified by a proper test.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I've prepared a Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > > > > The PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I've made the following changes:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >         1. `GridMonitoringManager` [2] - a simple implementation of a manager to store all monitoring info
> > > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - a pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. The JSON content can be viewed in gist [4]
> > > > > > > > > > > > > > >         3. Compute task start and finish are monitored in the "compute" list [5]
> > > > > > > > > > > > > > >         4. Service registrations are monitored in the "service" list [6]
> > > > > > > > > > > > > > >         5. The current `IgniteSpiMBeanAdapter` rewritten using `GridMonitoringManager` [7]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in the IEP [1].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >         1. Implementation of a JMX exposer.
> > > > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > > > > > > > > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs (described in the IEP).
> > > > > > > > > > > > > > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Please share your thoughts.
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > > >           {
> > > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > > >             "sessionId":
> > > > > > > 
> > > > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > > >               "taskClasName":
> > > > > > > 
> > > > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > > >             },
> > > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > > >           }
> > > > > > > > > > > > > > ```
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > [1]
> > > > > > > 
> > > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > > [2]
> > > > > > > 
> > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > > [3]
> > > > > > > 
> > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > > [4]
> > > > > > > 
> > > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > > [5]
> > > > > > > 
> > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > > [6]
> > > > > > > 
> > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > > [7]
> > > > > > > 
> > > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 
> > > > > > > > > > > > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
Hi,

>> I also think that GridMetricManager is a bad candidate for managing lists (system views).

>For me, it seems that views and metrics are extensions of one another.
>If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
>and if one wants to know the list of running SQL queries, one takes a look into views.

Views are about system state, and they answer the questions "what
entities exist in the system (caches)?" or "what processes are being
executed by the system (tx, queries)?"
Metrics are about system behavior in some retrospective. They answer
the question "how does the system behave?"

Views have a wider meaning than metrics.

>> Code generation for walkers is also redundant.

>If you don't want to, you don't have to use it.
>I find it pretty useful during development.

I'm not talking about anybody's wishes :) Moreover, if it depends on
wishes, it can potentially lead to misuse. IMO, using the same code at
runtime for view generation is a better approach.
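
For illustration, the reflection-based alternative could look roughly like the following. This is a minimal sketch, not code from the PR: the shape of the `Order` annotation, the `AttributeWithValueVisitor` signature, `ReflectionWalker` and the `CacheRow` example are all assumptions made up for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical ordering annotation; the real one in the PR may differ. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Order {
    int value();
}

/** Hypothetical visitor: receives index, name, type and value of an attribute. */
interface AttributeWithValueVisitor {
    void accept(int idx, String name, Class<?> type, Object val);
}

/** Walks @Order-annotated public no-arg getters of a row via reflection. */
final class ReflectionWalker {
    static void visitAll(Object row, AttributeWithValueVisitor visitor) {
        List<Method> attrs = new ArrayList<>();

        for (Method m : row.getClass().getMethods()) {
            if (m.isAnnotationPresent(Order.class) && m.getParameterCount() == 0)
                attrs.add(m);
        }

        // Preserve the order declared by the row author.
        attrs.sort(Comparator.comparingInt(m -> m.getAnnotation(Order.class).value()));

        for (int i = 0; i < attrs.size(); i++) {
            Method m = attrs.get(i);

            try {
                m.setAccessible(true); // The row class itself may be non-public.

                visitor.accept(i, m.getName(), m.getReturnType(), m.invoke(row));
            }
            catch (ReflectiveOperationException e) {
                throw new IllegalStateException("Failed to read attribute " + m.getName(), e);
            }
        }
    }
}

/** Example row, purely illustrative. */
final class CacheRow {
    private final String name;
    private final int partitions;

    CacheRow(String name, int partitions) {
        this.name = name;
        this.partitions = partitions;
    }

    @Order(0) public String name() { return name; }

    @Order(1) public int partitions() { return partitions; }
}
```

With something like this, adding an attribute to a view would only require a new annotated getter on the row class, and no generated walker would have to be kept in sync.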

>> I really don't understand why we should export system view content
>> (especially periodically). The real-life use case is to take view content on
>> demand. So we should have a public API for it, an SQL API and JMX. There is
>> no need for any exporters.

> What if we want to export lists to a log or via HTTP, etc.?

If we have a public API for views, then we can use REST to access
this API. Also, you can use the public API directly. What are the real-life
use cases for exporting views? Is there any database which
exports some lists somewhere? Especially on a push-based model, not
on demand.
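
The on-demand (pull) model argued for here can be sketched as follows. This is only an illustration under assumptions: `SystemView`, `MapBackedView` and `SystemViewRegistry` are hypothetical names, not the API from the PR. The only detail taken from this thread is that a list adapts a `ConcurrentMap` already used by Ignite internals.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical read-only view over rows of type R. */
interface SystemView<R> extends Iterable<R> {
    String name();

    int size();
}

/** Adapts the values of a live ConcurrentMap; nothing is copied or pushed. */
final class MapBackedView<R> implements SystemView<R> {
    private final String name;
    private final ConcurrentMap<?, R> data;

    MapBackedView(String name, ConcurrentMap<?, R> data) {
        this.name = name;
        this.data = data;
    }

    @Override public String name() { return name; }

    @Override public int size() { return data.size(); }

    @Override public Iterator<R> iterator() { return data.values().iterator(); }
}

/** Hypothetical registry: views are registered once and read on demand. */
final class SystemViewRegistry {
    private final Map<String, SystemView<?>> views = new ConcurrentHashMap<>();

    void register(SystemView<?> view) { views.put(view.name(), view); }

    SystemView<?> view(String name) { return views.get(name); }
}
```

In such a design, SQL, JMX or REST front ends would call `view(name)` and iterate at the moment of the request, so the view needs no life cycle beyond registration.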

On Fri, Sep 13, 2019 at 4:36 PM Nikolay Izhikov <ni...@apache.org> wrote:
>
> Hello, Andrey.
>
> > I really don't like the name MonitoringList, first of all because it
> > isn't about monitoring at all, although it can be useful for monitoring
> > purposes. We already have SQL system views, and I think that "system view"
> > is a good candidate for the name of the new entity.
>
> SystemView is OK for me.
> I will rename the entity in the PR.
>
> > I also think that GridMetricManager is a bad candidate for managing lists (system views).
>
> For me, it seems that views and metrics are extensions of one another.
> If the user wants to know some instant values (cache put count, cache get latency), he uses metrics,
> and if one wants to know the list of running SQL queries, one takes a look into views.
>
> > There is no interaction with lists on the hot path of the code flow,
> > so there is no performance impact.
>
> OK, let's remove it.
>
> > Code generation for walkers is also redundant.
>
> If you don't want to, you don't have to use it.
> I find it pretty useful during development.
>
> > I really don't understand why we should export system view content
> > (especially periodically). The real-life use case is to take view content on
> > demand. So we should have a public API for it, an SQL API and JMX. There is
> > no need for any exporters.
>
> What if we want to export lists to a log or via HTTP, etc.?
>
> > Also, it would be great to involve more people in this discussion.
>
> Any feedback is welcome!
>
>
> On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> > Nikolay,
> >
> > thanks a lot for the clarification! I added some comments to the Upsource review [1].
> >
> > Here I want to discuss some high-level issues.
> >
> > 1. Naming
> >
> > "There are only two hard things in Computer Science: cache
> > invalidation and naming things."
> > -- Phil Karlton
> >
> > I really don't like the name MonitoringList, first of all because it
> > isn't about monitoring at all, although it can be useful for monitoring
> > purposes.
> >
> > We already have SQL system views, and I think that "system view" is a good
> > candidate for the name of the new entity. As a result, we will have
> > consistent naming which better describes the domain.
> >
> > I also think that GridMetricManager is a bad candidate for managing lists
> > (system views), because it isn't about metrics. Maybe a new
> > SystemViewManager will fit this purpose better.
> >
> > 2. Management
> >
> > Lists (aka system views) have a life cycle now. I believe that it is
> > redundant functionality. There is no reason to enable/disable lists.
> > There is no interaction with lists on the hot path of the code flow,
> > so there is no performance impact.
> >
> > So list management can be reduced to list creation and registration
> > operations (which execute only on node start).
> >
> > 3. Code generation
> >
> > Code generation for walkers is also redundant. The number of system views
> > in the system is strictly limited (units, not dozens), so it is easier
> > to change a walker by hand than to navigate to the code generator and
> > run it. Moreover, you first have to add the Order annotation in the proper
> > place, which makes the generator practically useless.
> >
> > If you still see a benefit in the Order annotation, you can use
> > reflection. The motivation is simple: system views are not on the hot path,
> > and I expect that the API for system views will not be called frequently.
> >
> > 4. Export
> >
> > I really don't understand why we should export system view content
> > (especially periodically). The real-life use case is to take view content on
> > demand. So we should have a public API for it, an SQL API and JMX. There is
> > no need for any exporters.
> >
> >
> > What do you think about it? Also, it would be great to involve more
> > people in this discussion.
> >
> > [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> >
> > On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > >
> > > Hello, Andrey.
> > >
> > > Thanks for joining the review.
> > >
> > > The basic interface for an object list is `MonitoringList`. It provides the following:
> > >         * name.
> > >         * description.
> > >         * row class.
> > >         * size.
> > >         * iterator for the list content.
> > >         * attribute walker (described below).
> > >
> > > `MonitoringRow` is a marker interface for classes that can be used as monitoring list content.
> > >
> > > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > > It adapts the content of some `ConcurrentMap`, which is widely used in Ignite internals.
> > > I think there will be another implementation in the follow-up PRs.
> > >
> > > Public API changes:
> > >
> > > * New registry created: `ReadOnlyMonitoringListRegistry`. It provides:
> > >         * Access to all lists that exist in Ignite.
> > >         * The ability to subscribe to list creation/removal events.
> > >
> > > * `MetricExporterSpi` changes:
> > >         * `setMonitoringListRegistry` method added
> > >         * `setMonitoringListExportFilter` method added.
> > >
> > > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > > Usually, an exporter SPI iterates over `MonitoringRow` attributes.
> > > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > > This can be implemented with the Java reflection API, but I use a faster approach.
> > > `MonitoringRowAttributeWalker` can visit each attribute of a `MonitoringRow` implementation.
> > > It also preserves the order provided by the author of the `MonitoringRow` implementation.
> > > It provides 2 main methods:
> > >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
> > >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
> > >
> > >
> > > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > > Nikolai,
> > > >
> > > > I'm trying to review this PR but it is too large.
> > > >
> > > > Could you please describe the problem and the design of the implemented solution?
> > > > Also, the javadocs for the base interfaces aren't clear: they are too brief and
> > > > don't give any idea of the whole picture.
> > > >
> > > > At present it is very hard to understand the purpose of the new
> > > > interfaces and the walker generator, and the design itself.
> > > >
> > > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > >
> > > > > Hello, Igniters.
> > > > >
> > > > > IEP-35. Monitoring&Profiling. Phase 2 is ready [1]
> > > > > Please join the review!
> > > > >
> > > > > I've implemented:
> > > > >
> > > > > * Monitoring list engine.
> > > > > * The following lists:
> > > > >     * Cache list
> > > > >     * Cache group list
> > > > >     * Compute task list
> > > > >     * Service list.
> > > > >
> > > > > Engine details:
> > > > >
> > > > > * `MonitoringList` added to store list data.
> > > > > * Base interface `MonitoringRow` for list data created.
> > > > > * Corresponding method added to `MetricExporterSpi`.
> > > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > > support list export.
> > > > > * JMX, SQL and other column-oriented SPIs use
> > > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > > * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > > can be generated with `MonitoringRowAttributeWalkerGenerator`.
> > > > >
> > > > > I have also prepared a follow-up PR [2].
> > > > > The following lists are implemented there:
> > > > >
> > > > > * SQL tables
> > > > > * SQL indexes
> > > > > * SQL schemas
> > > > > * SQL queries
> > > > > * Continuous queries
> > > > > * Text queries
> > > > > * Transactions
> > > > > * Cluster nodes
> > > > > * Client connections(JDBC, ODBC, Thin)
> > > > >
> > > > > [1] https://github.com/apache/ignite/pull/6845
> > > > > [2] https://github.com/apache/ignite/pull/6790
> > > > >
> > > > >
> > > > >
> > > > > On Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > >
> > > > > > Hello, Igniters.
> > > > > >
> > > > > > Since Phase 1 will be merged into master soon, I've created the ticket [1]
> > > > > > for Phase 2.
> > > > > >
> > > > > > Scope of Phase 2 (copy-paste from the ticket):
> > > > > >
> > > > > > Ability to collect lists of some internal objects that Ignite manages.
> > > > > > Examples of such objects:
> > > > > >
> > > > > >   * Caches
> > > > > >   * Queries (including continuous queries)
> > > > > >   * Services
> > > > > >   * Compute tasks
> > > > > >   * Distributed Data Structures
> > > > > >   * etc...
> > > > > >
> > > > > >
> > > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > > discussed in separate tickets.
> > > > > > 2. Metric exporters can (optionally) support list export.
> > > > > >
> > > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > >
> > > > > >
> > > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > > Ticket for IEP.Phase1 created -
> > > > > > > https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > >
> > > > > > >
> > > > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > > Hello, Igniters.
> > > > > > > >
> > > > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> > > > > > > > > Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > >
> > > > > > > > Issues to address:
> > > > > > > >
> > > > > > > > > 1. Study the experience of the following libs and tools:
> > > > > > > > >     * OpenTracing
> > > > > > > > >     * OpenCensus
> > > > > > > > >     * DropWizard
> > > > > > > > >
> > > > > > > > > 2. Support a histogram sensor: a sensor that collects values that fall
> > > > > > > > > into predefined segments
> > > > > > > > >
> > > > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > > > >
> > > > > > > > > 4. Consider the usage of OpenCensus as a default implementation for
> > > > > > > > > local metric storage.
> > > > > > > >
> > > > > > > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > > > > > > >
> > > > > > > > > 6. Some metrics should be part of the public API and others should not
> > > > > > > > > (they may be changed/removed in a release without warning).
> > > > > > > >
> > > > > > > > My plan for Phase #1 is the following:
> > > > > > > >
> > > > > > > > 1. Address the issues.
> > > > > > > > 2. Prepare public API
> > > > > > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> > > > > > > > > with it.
> > > > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > > > 5. Collect feedback for #4.
> > > > > > > > > 6. Design a log exposer. Consider the usage of JFR format or some
> > > > > > > > > other widely used, tool compatible format.
> > > > > > > >
> > > > > > > > [1]
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > >
> > > > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > > Hello, Maxim.
> > > > > > > > >
> > > > > > > > > > > How will throughput sensor values be recorded, given that they
> > > > > > > > > > > require an interval for the rate calculations?
> > > > > > > > > >
> > > > > > > > > > I answered this question in the IEP "Design principles":
> > > > > > > > >
> > > > > > > > > ```
> > > > > > > > > Sensors should contain only raw values. No aggregation of numeric
> > > > > >
> > > > > > metrics on Ignite side.
> > > > > > > > > Min, max, avg and other functions are the matter of an external
> > > > > >
> > > > > > monitoring system.
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > > > where S(t) is the sensor value at some point in time t.
> > > > > > > > > >
> > > > > > > > > > So it seems that throughput calculation is the responsibility of an
> > > > > > > > > > external system.
> > > > > > > > >
> > > > > > > > > What do you think?
> > > > > > > > >
> > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control (e.g.,
> > > > > > > > > > > INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > >
> > > > > > > > > > For now, I think that all sensors and lists will be very (very!)
> > > > > > > > > > lightweight.
> > > > > > > > > > So, we should be able to disable/enable them, for sure.
> > > > > > > > > >
> > > > > > > > > > But we should be able to turn the whole Ignite subsystem off and on
> > > > > > > > > > in case we have strong performance limitations for a particular
> > > > > > > > > > workload.
> > > > > > > > > >
> > > > > > > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for
> > > > > > > > > > profiling: IEP-35 - Phase 3).
> > > > > > > > > > For example, AFAIK we can't disable the current SQL system views (why
> > > > > > > > > > should we?)
> > > > > > > > >
> > > > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > > Hello Nikolay,
> > > > > > > > > >
> > > > > > > > > > I've looked through your PRs changes.
> > > > > > > > > >
> > > > > > > > > > > Sensors
> > > > > > > > > >
> > > > > > > > > > > How will be recorded throughput sensor values which will require an
> > > > > > > > > > > interval for the rate calculations? Do we have such an example? For
> > > > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > > > > > out of the scope of current PoC and IEP as they are not related to the
> > > > > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > > > > >
> > > > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > >
> > > > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > >
> > > > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > >
> > > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached> from
> > > > > >
> > > > > > the core module.
> > > > > > > > > > >
> > > > > > > > > > > Agreed. module hierarchy is the essence of the next steps.
> > > > > > > > > > > For now it just a proof of my ideas for Ignite monitoring we can
> > > > > >
> > > > > > discuss.
> > > > > > > > > > >
> > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects,
> > > > > >
> > > > > > like don't like using 'ServiceConfiguration' directly as a monitored object
> > > > > > for services
> > > > > > > > > > >
> > > > > > > > > > > Agreed in general.
> > > > > > > > > > > Seems, choosing the right data to expose is the matter of
> > > > > >
> > > > > > separate discussion for each Ignite entities.
> > > > > > > > > > > I've planned to file tickets for each entity so anyone
> > > > > >
> > > > > > interested can share his vision in it.
> > > > > > > > > > >
> > > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > >
> > > > > > > > > > > I'm not sure that *every* sensor should have directly associated
> > > > > >
> > > > > > timestamp.
> > > > > > > > > > > Seems, we should support sensors without timestamp for a current
> > > > > >
> > > > > > monitoring numbers at least.
> > > > > > > > > > >
> > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > >
> > > > > > fixed size> of last N sensors
> > > > > > > > > > >
> > > > > > > > > > > What use-cases do you know for such sensors?
> > > > > > > > > > > We have plans to support fixed size lists to show "Last N SQL
> > > > > >
> > > > > > queries" or similar data.
> > > > > > > > > > > Essentially, a sensor is just a single value with the name and
> > > > > >
> > > > > > known meaning.
> > > > > > > > > > >
> > > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > >
> > > > > > work of> the system.
> > > > > > > > > > >
> > > > > > > > > > > Sorry, for that :)
> > > > > > > > > > > When you run 'MonitoringSelfTest' you should open
> > > > > >
> > > > > > http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > > > I provide this info in gist -
> > > > > >
> > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > >
> > > > > > > > > > > I will extend this test to print results to console in the next
> > > > > >
> > > > > > iterations - stay tuned :)
> > > > > > > > > > >
> > > > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > >
> > > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > >
> > > > > > > > > > > > It would be better to implement it as a separate module, I
> > > > > >
> > > > > > can't say
> > > > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be
> > > > > >
> > > > > > detached
> > > > > > > > > > > > from the core module.
> > > > > > > > > > > >
> > > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using
> > > > > >
> > > > > > 'ServiceConfiguration'
> > > > > > > > > > > > directly as a monitored object for services. I believe we
> > > > > >
> > > > > > shouldn't
> > > > > > > > > > > > mix approaches. It'd be better always use some kind of
> > > > > >
> > > > > > container with
> > > > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > > > >
> > > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually
> > > > > >
> > > > > > monitoring
> > > > > > > > > > > > systems aggregate data and build graphics according to sensors
> > > > > > > > > > > > timestamp.
> > > > > > > > > > > >
> > > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > >
> > > > > > fixed size
> > > > > > > > > > > > of last N sensors, not to miss them without pushing to an
> > > > > >
> > > > > > external
> > > > > > > > > > > > monitoring system.
> > > > > > > > > > > >
> > > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > >
> > > > > > work of
> > > > > > > > > > > > the system. Everybody who looks to PR needs to run the test
> > > > > >
> > > > > > and get
> > > > > > > > > > > > the info manually to see the completeness of sensors, this
> > > > > >
> > > > > > might be
> > > > > > > > > > > > simplified by proper test.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you!
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <
> > > > > >
> > > > > > nizhikov@apache.org> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > > PR can be found here -
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > >
> > > > > > > > > > > > > I've done following changes:
> > > > > > > > > > > > >
> > > > > > > > > > > > >         1. `GridMonitoringManager`  [2] - simple
> > > > > >
> > > > > > implementation of manager to store all monitoring info
> > > > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer
> > > > > > > > > > > > > > implementation that can respond with JSON from
> > > > > > > > > > > > > > http://localhost:8080/ignite/monitoring. JSON content can be viewed in
> > > > > > > > > > > > > > gist [4]
> > > > > > > > > > > > >         3. Compute task start and finish monitoring in
> > > > > >
> > > > > > "compute" list [5]
> > > > > > > > > > > > >         4. Service registration are monitored in "service"
> > > > > >
> > > > > > list - [6]
> > > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using
> > > > > >
> > > > > > `GridMonitoringManager` [7]
> > > > > > > > > > > > >
> > > > > > > > > > > > > Design principles, monitoring subsystem details and new
> > > > > >
> > > > > > Ignite entities can be found in IEP [1].
> > > > > > > > > > > > >
> > > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > >
> > > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups"
> > > > > >
> > > > > > as a SQL System view.
> > > > > > > > > > > > >         3. Add monitoring for all unmonitoring Ignite API.
> > > > > >
> > > > > > (described in IEP).
> > > > > > > > > > > > >         4. Rewrite existing jmx metrics using
> > > > > >
> > > > > > GridMonitoringManager.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Please share your thoughts.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > > ```
> > > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > > >         "rows": [
> > > > > > > > > > > > >           {
> > > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > > >             "sessionId":
> > > > > >
> > > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > > >             "data": {
> > > > > > > > > > > > >               "taskClasName":
> > > > > >
> > > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > > >               "execName": null
> > > > > > > > > > > > >             },
> > > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > > >           }
> > > > > > > > > > > > > ```
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1]
> > > > > >
> > > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > > [2]
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > > [3]
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > > [4]
> > > > > >
> > > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > > [5]
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > > [6]
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > > [7]
> > > > > >
> > > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
Hello, Andrey.

> I really don't like the name MonitoringList. First of all, because it isn't
> about monitoring at all, although it can be useful for monitoring purposes.
> We already have SQL system views, and I think that system view is a good
> candidate for naming the new entity.

SystemView is OK for me.
I will rename the entity in the PR.

> I also think that GridMetricManager is a bad candidate for managing lists (system views).

To me, views and metrics are extensions of one another.
If a user wants to know some instant values (cache put count, cache get latency), they use metrics,
and if they want to know the list of running SQL queries, they look into views.

> There is no interaction with lists on the hot path of the code flow
> and there is no performance impact.

OK, let's remove the enable/disable life cycle for lists.

> Code generation for walkers is also redundant.

If you don't want to use it, you don't have to.
I find it pretty useful during development.
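
For illustration only, a hand-written walker could look roughly like the
sketch below. All names here (`CacheRowSketch`, `AttributeWithValueVisitorSketch`,
`CacheRowWalkerSketch`) are hypothetical and are not the classes from the PR.
The point is only that a walker visits the attributes of a row in a fixed
order without reflection, which is presumably the kind of boilerplate a
generator would save us from typing.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for one row of a list (system view); not the PR's class. */
final class CacheRowSketch {
    final String cacheName;
    final int cacheId;

    CacheRowSketch(String cacheName, int cacheId) {
        this.cacheName = cacheName;
        this.cacheId = cacheId;
    }
}

/** Hypothetical visitor: receives index, name, type and value of each attribute. */
interface AttributeWithValueVisitorSketch {
    void accept(int idx, String name, Class<?> type, Object val);
}

/** Hand-written walker: fixed attribute order, no reflection involved. */
final class CacheRowWalkerSketch {
    static void visitAll(CacheRowSketch row, AttributeWithValueVisitorSketch v) {
        v.accept(0, "cacheName", String.class, row.cacheName);
        v.accept(1, "cacheId", int.class, row.cacheId);
    }

    /** Renders a row as "index:name=value" strings, as a column-oriented exporter might. */
    static List<String> describe(CacheRowSketch row) {
        List<String> out = new ArrayList<>();
        visitAll(row, (idx, name, type, val) -> out.add(idx + ":" + name + "=" + val));
        return out;
    }
}
```

Adding an attribute to the row means adding one more `accept` call by hand;
with a handful of views that is cheap, with many it gets tedious.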

> I really don't understand why we should export system view content
> (especially periodically). The real-life use case is to take view content on
> demand. So we should have a public API for it, SQL API and JMX. There is
> no need for any exporters.

What if we want to export lists to the log, over HTTP, etc.?
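
As a rough sketch of the log case: an exporter only needs to iterate over
every registered list and print its rows. The code below is an illustration
under assumptions, not the `LogExporterSpi` from the PR: lists are modeled as
plain maps of rows, and `ViewLogExportSketch` is a made-up name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that turns lists (system views) into log lines. */
final class ViewLogExportSketch {
    /**
     * Formats each view as a header line followed by one indented line per row.
     * Keys of the outer map are view names; each row is a name-to-value map.
     */
    static List<String> export(Map<String, List<Map<String, Object>>> views) {
        List<String> lines = new ArrayList<>();

        for (Map.Entry<String, List<Map<String, Object>>> view : views.entrySet()) {
            lines.add("view=" + view.getKey() + ", size=" + view.getValue().size());

            for (Map<String, Object> row : view.getValue())
                lines.add("  " + row);
        }

        return lines;
    }
}
```

The same loop could write to an HTTP response instead of a list of strings;
only the sink changes, not the traversal.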

> Also, it would be great to involve more people in this discussion.

Any feedback is welcome!


On Fri, 13/09/2019 at 15:13 +0300, Andrey Gura wrote:
> Nikolay,
> 
> thanks a lot for clarification! I added some comments to Upsource review [1].
> 
> Here I want to discuss some high-level issues.
> 
> 1. Naming
> 
> "There are only two hard things in Computer Science: cache
> invalidation and naming things."
> -- Phil Karlton
> 
> I really don't like the name MonitoringList. First of all, because it isn't
> about monitoring at all, although it can be useful for monitoring purposes.
>
> We already have SQL system views, and I think that system view is a good
> candidate for naming the new entity. As a result, we will have consistent
> naming which better describes the domain.
>
> I also think that GridMetricManager is a bad candidate for managing lists
> (system views), because it isn't about metrics. Maybe a new
> SystemViewManager would fit this purpose better.
> 
> 2. Management
> 
> Lists (aka system views) have a life cycle now. I believe that this is
> redundant functionality. There is no reason for enabling/disabling
> lists. There is no interaction with lists on the hot path of the code flow
> and there is no performance impact.
>
> So list management can be reduced to list creation and registration
> operations (which execute only on node start).
> 
> 3. Code generation
> 
> Code generation for walkers is also redundant. The number of system views
> in the system is strongly limited (units, not dozens), so it is easier
> to change a walker by hand than to navigate to the code generator and
> run it. Moreover, you first have to add the Order annotation in the proper
> place, which makes the generator practically useless.
>
> If you still see a benefit in the Order annotation, you can use
> reflection. The motivation is simple: system views are not on the hot path, and
> I expect that the API for system views will not be called frequently.
> 
> 4. Export
> 
> I really don't understand why we should export system view content
> (especially periodically). The real-life use case is to take view content on
> demand. So we should have a public API for it, SQL API and JMX. There is
> no need for any exporters.
>
>
> What do you think about it? Also, it would be great to involve more
> people in this discussion.
> 
> [1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065
> 
> On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > 
> > Hello, Andrey.
> > 
> > Thanks for joining the review.
> > 
> > The basic interface for an object list is `MonitoringList`. It provides the following:
> >         * name.
> >         * description.
> >         * row class.
> >         * size.
> >         * iterator for the list content.
> >         * attribute walker (described below).
> > 
> > `MonitoringRow` is a marker interface for classes that can be used as monitoring list content.
> >
> > Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
> > It adapts the content of a `ConcurrentMap`, which is widely used in Ignite internals.
> > I think there will be other implementations in the follow-up PRs.
> > 
> > Public API changes:
> > 
> > * New registry `ReadOnlyMonitoringListRegistry` created. It provides:
> >         * Access to all lists that exist in Ignite.
> >         * The ability to subscribe to list creation/removal events.
> > 
> > * `MetricExporterSpi` changes:
> >         * `setMonitoringListRegistry` method added
> >         * `setMonitoringListExportFilter` method added.
> > 
> > `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> > Usually, an exporter SPI iterates over `MonitoringRow` attributes;
> > `SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
> > This could be implemented with the Java reflection API, but I use a faster approach.
> > `MonitoringRowAttributeWalker` can visit each attribute of a MonitoringRow implementation.
> > It also preserves the order provided by the MonitoringRow implementation author.
> > It provides 2 main methods:
> >         * `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
> >         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
> > 
> > 
> > On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > > Nikolai,
> > > 
> > > I'm trying to review this PR but it is too large.
> > > 
> > > Could you please describe the problem and the design of the implemented solution?
> > > Also, the javadocs for the base interfaces aren't clear; they are too brief and don't
> > > give any idea of the whole picture.
> > >
> > > At present it is very hard to understand the purposes of the new
> > > interfaces and the walker generator, and the design itself.
> > > 
> > > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > > > 
> > > > Hello, Igniters.
> > > > 
> > > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > > Please join the review!
> > > > 
> > > > I've implemented:
> > > > 
> > > > * Monitoring list engine.
> > > > * The following lists are implemented:
> > > >     * Cache list
> > > >     * Cache group list
> > > >     * Compute task list
> > > >     * Service list.
> > > > 
> > > > Engine details:
> > > > 
> > > > * `MonitoringList` added to store list data.
> > > > * Base interface `MonitoringRow` for list data created.
> > > > * Corresponding method added to `MetricExporterSpi`
> > > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > > support list export.
> > > > * JMX, SQL and other column-oriented SPI uses
> > > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > * Implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > > > 
> > > > I have also prepared a follow-up PR [2].
> > > > It implements the following lists:
> > > > 
> > > > * SQL tables
> > > > * SQL indexes
> > > > * SQL schemas
> > > > * SQL queries
> > > > * Continuous queries
> > > > * Text queries
> > > > * Transactions
> > > > * Cluster nodes
> > > > * Client connections(JDBC, ODBC, Thin)
> > > > 
> > > > [1] https://github.com/apache/ignite/pull/6845
> > > > [2] https://github.com/apache/ignite/pull/6790
> > > > 
> > > > 
> > > > 
> > > > Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org>:
> > > > 
> > > > > Hello, Igniters.
> > > > > 
> > > > > Since Phase 1 will be merged in master soon I've created the ticket [1]
> > > > > for Phase 2.
> > > > > 
> > > > > Scope of Phase 2(copy-paste from the ticket)
> > > > > 
> > > > > Ability to collect lists of internal objects that Ignite manages.
> > > > > Examples of such objects:
> > > > > 
> > > > >   * Caches
> > > > >   * Queries (including continuous queries)
> > > > >   * Services
> > > > >   * Compute tasks
> > > > >   * Distributed Data Structures
> > > > >   * etc...
> > > > > 
> > > > > 
> > > > > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > > > > discussed in separate tickets
> > > > > 2. Metric Exporters (optionally) can support list export.
> > > > > 
> > > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > > > 
> > > > > 
> > > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > > Ticket for IEP.Phase1 created -
> > > > > > https://issues.apache.org/jira/browse/IGNITE-11848
> > > > > > 
> > > > > > 
> > > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > > Hello, Igniters.
> > > > > > > 
> > > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> > > > > > > Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > > > 
> > > > > > > Issues to address:
> > > > > > > 
> > > > > > > 1. Study the experience of the following libs and tools:
> > > > > > >     * OpenTracing
> > > > > > >     * OpenCensus
> > > > > > >     * DropWizard
> > > > > > >
> > > > > > > 2. Support a histogram sensor: a sensor that collects values that fall
> > > > > > > into predefined segments
> > > > > > >
> > > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > > >
> > > > > > > 4. Consider the usage of OpenCensus as a default implementation for
> > > > > > > local metric storage.
> > > > > > > 
> > > > > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > > > > >
> > > > > > > 6. Some metrics should be part of the public API and others should not
> > > > > > > (they may be changed/removed in a release without warning).
> > > > > > > 
> > > > > > > My plan for Phase #1 is the following:
> > > > > > > 
> > > > > > > 1. Address the issues.
> > > > > > > 2. Prepare public API
> > > > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> > > > > 
> > > > > with it.
> > > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > > 5. Collect feedback for a #4.
> > > > > > > 6. Design a log exposer. Consider the usage of JFR format or some
> > > > > 
> > > > > other widely used, tool compatible format.
> > > > > > > 
> > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > 
> > > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > > Hello, Maxim.
> > > > > > > > 
> > > > > > > > > How will be recorded throughput sensor values which will require
> > > > > 
> > > > > an interval for the rate calculations?
> > > > > > > > 
> > > > > > > > I answered to this question in IEP "Design principles":
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > Sensors should contain only raw values. No aggregation of numeric
> > > > > > > > metrics on Ignite side.
> > > > > > > > Min, max, avg and other functions are the matter of an external
> > > > > > > > monitoring system.
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > > where S(t) is the sensor value at some point in time t.
> > > > > > > >
> > > > > > > > So it seems that throughput calculation is the responsibility of an
> > > > > > > > external system.
> > > > > > > > 
> > > > > > > > What do you think?
> > > > > > > > 
> > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > 
> > > > > `sensitivityLevel` to provide for the user a flexible sensor control (e.g.,
> > > > > INFO, WARN, NOTICE, DEBUG).
> > > > > > > > 
> > > > > > > > For now, I think that all sensors and lists will be very(very!)
> > > > > 
> > > > > lightweight.
> > > > > > > > So, we should be able to disable/enable it's, for sure.
> > > > > > > > 
> > > > > > > > But, we should turn off and turn on the whole Ignite subsystem
> > > > > > > > for the case we have strong performance limitations for a particular
> > > > > 
> > > > > workload.
> > > > > > > > 
> > > > > > > > So, we have two "level" of monitoring - INFO and DEBUG(for
> > > > > 
> > > > > profiling: IEP-35 - Phase 3).
> > > > > > > > For example, AFAIK we can't disable current SQL system views(Why
> > > > > 
> > > > > should we?)
> > > > > > > > 
> > > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > > Hello Nikolay,
> > > > > > > > > 
> > > > > > > > > I've looked through your PRs changes.
> > > > > > > > > 
> > > > > > > > > > Sensors
> > > > > > > > > 
> > > > > > > > > How will be recorded throughput sensor values which will require an
> > > > > > > > > interval for the rate calculations? Do we have such an example? For
> > > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics
> > > > > 
> > > > > are
> > > > > > > > > out of the scope of current PoC and IEP as they are not related to
> > > > > 
> > > > > the
> > > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > > > 
> > > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor
> > > > > 
> > > > > control
> > > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > > > 
> > > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > > functional java approach can be used. Am I right?
> > > > > > > > > 
> > > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > > > 
> > > > > > > > > > Hello, Vyacheslav.
> > > > > > > > > > 
> > > > > > > > > > Thanks for the feedback!
> > > > > > > > > > 
> > > > > > > > > > > HttpExposer with Jetty's dependencies should be detached> from
> > > > > 
> > > > > the core module.
> > > > > > > > > > 
> > > > > > > > > > Agreed. module hierarchy is the essence of the next steps.
> > > > > > > > > > For now it just a proof of my ideas for Ignite monitoring we can
> > > > > 
> > > > > discuss.
> > > > > > > > > > 
> > > > > > > > > > > I like your approach with 'wrapper' for monitored objects,
> > > > > 
> > > > > like don't like using 'ServiceConfiguration' directly as a monitored object
> > > > > for services
> > > > > > > > > > 
> > > > > > > > > > Agreed in general.
> > > > > > > > > > Seems, choosing the right data to expose is the matter of
> > > > > 
> > > > > separate discussion for each Ignite entities.
> > > > > > > > > > I've planned to file tickets for each entity so anyone
> > > > > 
> > > > > interested can share his vision in it.
> > > > > > > > > > 
> > > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > > > 
> > > > > > > > > > I'm not sure that *every* sensor should have directly associated
> > > > > 
> > > > > timestamp.
> > > > > > > > > > Seems, we should support sensors without timestamp for a current
> > > > > 
> > > > > monitoring numbers at least.
> > > > > > > > > > 
> > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > 
> > > > > fixed size> of last N sensors
> > > > > > > > > > 
> > > > > > > > > > What use-cases do you know for such sensors?
> > > > > > > > > > We have plans to support fixed size lists to show "Last N SQL
> > > > > 
> > > > > queries" or similar data.
> > > > > > > > > > Essentially, a sensor is just a single value with the name and
> > > > > 
> > > > > known meaning.
> > > > > > > > > > 
> > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > 
> > > > > work of> the system.
> > > > > > > > > > 
> > > > > > > > > > Sorry, for that :)
> > > > > > > > > > When you run 'MonitoringSelfTest' you should open
> > > > > 
> > > > > http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > > I provide this info in gist -
> > > > > 
> > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > 
> > > > > > > > > > I will extend this test to print results to console in the next
> > > > > 
> > > > > iterations - stay tuned :)
> > > > > > > > > > 
> > > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > > Hi, Nikolay,
> > > > > > > > > > > 
> > > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > > > 
> > > > > > > > > > > It would be better to implement it as a separate module, I
> > > > > 
> > > > > can't say
> > > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be
> > > > > 
> > > > > detached
> > > > > > > > > > > from the core module.
> > > > > > > > > > > 
> > > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using
> > > > > 
> > > > > 'ServiceConfiguration'
> > > > > > > > > > > directly as a monitored object for services. I believe we
> > > > > 
> > > > > shouldn't
> > > > > > > > > > > mix approaches. It'd be better always use some kind of
> > > > > 
> > > > > container with
> > > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > > > 
> > > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually
> > > > > 
> > > > > monitoring
> > > > > > > > > > > systems aggregate data and build graphics according to sensors
> > > > > > > > > > > timestamp.
> > > > > > > > > > > 
> > > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > > > 
> > > > > fixed size
> > > > > > > > > > > of last N sensors, not to miss them without pushing to an
> > > > > 
> > > > > external
> > > > > > > > > > > monitoring system.
> > > > > > > > > > > 
> > > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > > > 
> > > > > work of
> > > > > > > > > > > the system. Everybody who looks to PR needs to run the test
> > > > > 
> > > > > and get
> > > > > > > > > > > the info manually to see the completeness of sensors, this
> > > > > 
> > > > > might be
> > > > > > > > > > > simplified by proper test.
> > > > > > > > > > > 
> > > > > > > > > > > Thank you!
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <
> > > > > 
> > > > > nizhikov@apache.org> wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > > > 
> > > > > > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > > > > > PR can be found here -
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510
> > > > > > > > > > > > 
> > > > > > > > > > > > I've done following changes:
> > > > > > > > > > > > 
> > > > > > > > > > > >         1. `GridMonitoringManager`  [2] - simple
> > > > > 
> > > > > implementation of manager to store all monitoring info
> > > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer
> > > > > 
> > > > > implementation that can respond with JSON from
> > > > > http://localhost:8080/ignite/monitoring. JSON content can be viewed in
> > > > > gist [4]
> > > > > > > > > > > >         3. Compute task start and finish monitoring in
> > > > > 
> > > > > "compute" list [5]
> > > > > > > > > > > >         4. Service registration are monitored in "service"
> > > > > 
> > > > > list - [6]
> > > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using
> > > > > 
> > > > > `GridMonitoringManager` [7]
> > > > > > > > > > > > 
> > > > > > > > > > > > Design principles, monitoring subsystem details and new
> > > > > 
> > > > > Ignite entities can be found in IEP [1].
> > > > > > > > > > > > 
> > > > > > > > > > > > My next steps will be:
> > > > > > > > > > > > 
> > > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > > >         2. Registration of all "lists" and "sensor groups"
> > > > > 
> > > > > as a SQL System view.
> > > > > > > > > > > >         3. Add monitoring for all unmonitoring Ignite API.
> > > > > 
> > > > > (described in IEP).
> > > > > > > > > > > >         4. Rewrite existing jmx metrics using
> > > > > 
> > > > > GridMonitoringManager.
> > > > > > > > > > > > 
> > > > > > > > > > > > Please, share you thoughts.
> > > > > > > > > > > > 
> > > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > > ```
> > > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > > >       "tasks": {
> > > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > > >         "rows": [
> > > > > > > > > > > >           {
> > > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > > >             "sessionId":
> > > > > 
> > > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > > >             "data": {
> > > > > > > > > > > >               "taskClasName":
> > > > > 
> > > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > > >               "execName": null
> > > > > > > > > > > >             },
> > > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > > >           }
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > [1]
> > > > > 
> > > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > > [2]
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > > [3]
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > > [4]
> > > > > 
> > > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > > [5]
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > > [6]
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > > [7]
> > > > > 
> > > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > > > 
> > > > > > > > > > > 
> > > > > > > > > > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
Nikolay,

thanks a lot for the clarification! I added some comments to the Upsource review [1].

Here I want to discuss some high-level issues.

1. Naming

"There are only two hard things in Computer Science: cache
invalidation and naming things."
-- Phil Karlton

I really don't like the name MonitoringList, first of all because it isn't
about monitoring at all, although it can be useful for monitoring purposes.

We already have SQL system views, and I think "system view" is a good
candidate for the name of the new entity. As a result we will have consistent
naming that better describes the domain.

I also think that GridMetricManager is a bad candidate for managing lists
(system views), because it isn't about metrics. Maybe a new
SystemViewManager would better fit this purpose.

2. Management

Lists (aka system views) have a life cycle now. I believe that this is
redundant functionality. There is no reason to enable or disable
lists: there is no interaction with lists on the hot path of the code flow,
so there is no performance impact.

So list management can be reduced to list creation and registration
operations (which are executed only on node start).

3. Code generation

Code generation for walkers is also redundant. The number of system views
in the system is strongly limited (units, not dozens), so it is easier
to change a walker by hand than to navigate to the code generator and
run it. Moreover, you first have to add the Order annotation in the proper
place, which makes the generator practically useless.

If you still see a benefit in the Order annotation, you can use
reflection. The motivation is simple: system views are not on the hot path, and
I expect that the API for system views will not be called frequently.
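
To make the reflection option concrete, here is a minimal sketch of how a walker
could read ordered attributes without any generated code. Everything in it is
hypothetical and only for illustration: the `Order` annotation below is a
stand-in declared in the example itself (not the one from the PR), and
`ServiceViewRow` and `ReflectiveWalker` are invented names.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical stand-in for the Order annotation; not the one from the PR. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Order {
    int value();
}

/** Hypothetical system view row; declaration order deliberately differs from @Order. */
final class ServiceViewRow {
    @Order(1) public int totalCount() { return 3; }
    @Order(0) public String name() { return "svc"; }
}

final class ReflectiveWalker {
    /** Annotated getters sorted by @Order; getDeclaredMethods() guarantees no particular order. */
    static List<Method> orderedGetters(Class<?> rowCls) {
        List<Method> res = new ArrayList<>();
        for (Method m : rowCls.getDeclaredMethods()) {
            if (m.isAnnotationPresent(Order.class))
                res.add(m);
        }
        res.sort(Comparator.comparingInt(m -> m.getAnnotation(Order.class).value()));
        return res;
    }

    /** Attribute names in the order requested by the row author. */
    static List<String> names(Class<?> rowCls) {
        List<String> res = new ArrayList<>();
        for (Method m : orderedGetters(rowCls))
            res.add(m.getName());
        return res;
    }

    /** Attribute values of one row instance, in the same order as names(). */
    static List<Object> values(Object row) {
        List<Object> res = new ArrayList<>();
        try {
            for (Method m : orderedGetters(row.getClass()))
                res.add(m.invoke(row));
        }
        catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Failed to read attribute", e);
        }
        return res;
    }

    public static void main(String[] args) {
        System.out.println(names(ServiceViewRow.class));
        System.out.println(values(new ServiceViewRow()));
    }
}
```

The sorted getter list could be cached per row class, so the reflective lookup
would happen once and only the `invoke` calls would remain per row.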

4. Export

I really don't understand why we should export system view content
(especially periodically). The real-life use case is to take view content on
demand. So we should have a public API for it, an SQL API and JMX. There is
no need for any exporters.


What do you think about it? It would also be great to involve more
people in this discussion.

[1] https://reviews.ignite.apache.org/ignite/review/IGNT-CR-1065

On Wed, Sep 11, 2019 at 6:24 PM Nikolay Izhikov <ni...@apache.org> wrote:
>
> Hello, Andrey.
>
> Thanks, for joining the review.
>
> Basic interface for objects list is `MonitoringList`. It provides the following features:
>         * name.
>         * description.
>         * row class.
>         * size.
>         * iterator for the list content.
>         * attribute walker (described below).
>
> `MonitoringRow` is a marker interface for classes that can be used as a monitoring list content.
>
> Internally, there is only one implementation of `MonitoringList`, for now, `MonitoringListAdapter`.
> It adapts the content of some `ConcurrentMap` which uses widely in Ignite internals.
> I think, will be another implementation in the follow-up PRs.
>
> Public API changes:
>
> * New registry created `ReadOnlyMonitoringListRegistry` It provides access:
>         * To all lists that exist in the Ignite.
>         * Ability to subscribe to the list creation/removal events.
>
> * `MetricExporterSpi` changes:
>         * `setMonitoringListRegistry` method added
>         * `setMonitoringListExportFilter` method added.
>
> `MonitoringRowAttributeWalker` is a helper class for exporter implementations.
> Usually, exporter SPI iterates on `MonitoringRow` attributes.
> `SqlViewExporterSpi`, `JmxMetricExporterSpi` can be taken as an example.
> It can be implemented with Java reflection API, but I use more quick approach.
> `MonitoringRowAttributeWalker` can visit each attribute of the MonitoringRow implementation.
> It's also, preserves, the order provided by the MonitoringRow implementation author.
> It provides 2 main methods:
>         * `visitAll(AttributeVisitor visitor);` - visits each attribute of the some monitoring row class. Provides index, name and class of attribute to the consumer.
>         * `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of some monitoring row instance. Provides index, name, class, value of attribute to the consumer.
>
>
> On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> > Nikolai,
> >
> > I'm trying to review this PR but it is too large.
> >
> > Could you please describe problem and design of implemented solution?
> > Also javadocs for base interfaces aren't clear, too brief and doesn't
> > give any imagine about whole picture.
> >
> > At present it is very hard to understand the purposes of new
> > interfaces and walker generator, and design itself.
> >
> > On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > >
> > > Hello, Igniters.
> > >
> > > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > > Please, join to the review!
> > >
> > > I've implemented:
> > >
> > > * Monitoring list engine.
> > > * Following list implemented:
> > >     * Cache list
> > >     * Cache group list
> > >     * Compute task list
> > >     * Service list.
> > >
> > > Engine details:
> > >
> > > * `MonitoringList` added to store list data.
> > > * Base interface `MonitoringRow` for list data created.
> > > * Corresponding method added to `MetricExporterSpi`
> > > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > > support list export.
> > > * JMX, SQL and other column-oriented SPI uses
> > > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > > > * Implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > >
> > > I prepare follow-up PR [2], also.
> > > Following lists implemented:
> > >
> > > * SQL tables
> > > * SQL indexes
> > > * SQL schemas
> > > * SQL queries
> > > * Continuous queries
> > > * Text queries
> > > * Transactions
> > > * Cluster nodes
> > > * Client connections(JDBC, ODBC, Thin)
> > >
> > > [1] https://github.com/apache/ignite/pull/6845
> > > [2] https://github.com/apache/ignite/pull/6790
> > >
> > >
> > >
> > > Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org>:
> > >
> > > > Hello, Igniters.
> > > >
> > > > Since Phase 1 will be merged in master soon I've created the ticket [1]
> > > > for Phase 2.
> > > >
> > > > Scope of Phase 2(copy-paste from the ticket)
> > > >
> > > > Ability to collect lists of some internal object Ignite manage.
> > > > Examples of such objects:
> > > >
> > > >   * Caches
> > > >   * Queries (including continuous queries)
> > > >   * Services
> > > >   * Compute tasks
> > > >   * Distributed Data Structures
> > > >   * etc...
> > > >
> > > >
> > > > 1. Fields for each list(that doesn't currently exists in Ignite) will be
> > > > discussed in separate tickets
> > > > 2. Metric Exporters (optionally) can support list export.
> > > >
> > > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > >
> > > >
> > > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > > Ticket for IEP.Phase1 created -
> > > >
> > > > https://issues.apache.org/jira/browse/IGNITE-11848
> > > > >
> > > > >
> > > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > > Hello, Igniters.
> > > > > >
> > > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> > > >
> > > > Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > >
> > > > > > Issues to address:
> > > > > >
> > > > > > 1. Study experience of following libs, tools:
> > > > > >     * OpenTracing
> > > > > >     * OpenCensus
> > > > > >     * DropWizard
> > > > > >
> > > > > > 2. Support histogram sensor: Sensor that collects values that gets
> > > >
> > > > into predefined segments
> > > > > >
> > > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > >
> > > > > > 4. Consider the usage of OpenCensus as a default implementation for
> > > >
> > > > local metric storage.
> > > > > >
> > > > > > 5. To measure the performance penalty for metrics for 5_000 caches.
> > > > > >
> > > > > > 6. Some metrics should be part of public API and others are not(may be
> > > >
> > > > changed/removed in release without warnings).
> > > > > >
> > > > > > My plan for Phase #1 is the following:
> > > > > >
> > > > > > 1. Address the issues.
> > > > > > 2. Prepare public API
> > > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> > > >
> > > > with it.
> > > > > > 4. Prepare a PR with lists of each user API.
> > > > > > 5. Collect feedback for a #4.
> > > > > > 6. Design a log exposer. Consider the usage of JFR format or some
> > > >
> > > > other widely used, tool compatible format.
> > > > > >
> > > > > > [1]
> > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > >
> > > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > > Hello, Maxim.
> > > > > > >
> > > > > > > > How will be recorded throughput sensor values which will require
> > > >
> > > > an interval for the rate calculations?
> > > > > > >
> > > > > > > I answered to this question in IEP "Design principles":
> > > > > > >
> > > > > > > ```
> > > > > > > Sensors should contain only raw values. No aggregation of numeric
> > > >
> > > > metrics on Ignite side.
> > > > > > > Min, max, avg and other functions are the matter of an external
> > > >
> > > > monitoring system.
> > > > > > > ```
> > > > > > >
> > > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > > where S(t) is the sensor value in some point of time t.
> > > > > > >
> > > > > > > Seems, throughput calculation is a responsibility of an external
> > > >
> > > > system.
> > > > > > >
> > > > > > > What do you think?
> > > > > > >
> > > > > > > > It seems to me that we can add an additional parameter of
> > > >
> > > > `sensitivityLevel` to provide for the user a flexible sensor control (e.g.,
> > > > INFO, WARN, NOTICE, DEBUG).
> > > > > > >
> > > > > > > For now, I think that all sensors and lists will be very(very!)
> > > >
> > > > lightweight.
> > > > > > > So, we should be able to disable/enable it's, for sure.
> > > > > > >
> > > > > > > But, we should turn off and turn on the whole Ignite subsystem
> > > > > > > for the case we have strong performance limitations for a particular
> > > >
> > > > workload.
> > > > > > >
> > > > > > > So, we have two "level" of monitoring - INFO and DEBUG(for
> > > >
> > > > profiling: IEP-35 - Phase 3).
> > > > > > > For example, AFAIK we can't disable current SQL system views(Why
> > > >
> > > > should we?)
> > > > > > >
> > > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > > Hello Nikolay,
> > > > > > > >
> > > > > > > > I've looked through your PRs changes.
> > > > > > > >
> > > > > > > > > Sensors
> > > > > > > >
> > > > > > > > How will be recorded throughput sensor values which will require an
> > > > > > > > interval for the rate calculations? Do we have such an example? For
> > > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics
> > > >
> > > > are
> > > > > > > > out of the scope of current PoC and IEP as they are not related to
> > > >
> > > > the
> > > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > > >
> > > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > > `sensitivityLevel` to provide for the user a flexible sensor
> > > >
> > > > control
> > > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > >
> > > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > > functional java approach can be used. Am I right?
> > > > > > > >
> > > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org>
> > > >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > Hello, Vyacheslav.
> > > > > > > > >
> > > > > > > > > Thanks for the feedback!
> > > > > > > > >
> > > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > > >
> > > > > > > > > Agreed. module hierarchy is the essence of the next steps.
> > > > > > > > > For now it just a proof of my ideas for Ignite monitoring we can
> > > >
> > > > discuss.
> > > > > > > > >
> > > > > > > > > > I like your approach with 'wrapper' for monitored objects,
> > > >
> > > > like don't like using 'ServiceConfiguration' directly as a monitored object
> > > > for services
> > > > > > > > >
> > > > > > > > > Agreed in general.
> > > > > > > > > Seems, choosing the right data to expose is the matter of
> > > >
> > > > separate discussion for each Ignite entities.
> > > > > > > > > I've planned to file tickets for each entity so anyone
> > > >
> > > > interested can share his vision in it.
> > > > > > > > >
> > > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > > >
> > > > > > > > > I'm not sure that *every* sensor should have directly associated
> > > >
> > > > timestamp.
> > > > > > > > > Seems, we should support sensors without timestamp for a current
> > > >
> > > > monitoring numbers at least.
> > > > > > > > >
> > > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > > >
> > > > > > > > > What use-cases do you know for such sensors?
> > > > > > > > > We have plans to support fixed size lists to show "Last N SQL
> > > >
> > > > queries" or similar data.
> > > > > > > > > Essentially, a sensor is just a single value with the name and
> > > >
> > > > known meaning.
> > > > > > > > >
> > > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > > >
> > > > > > > > > Sorry, for that :)
> > > > > > > > > When you run 'MonitoringSelfTest' you should open
> > > >
> > > > http://localhost:8080/ignite/monitoring to view exposed info.
> > > > > > > > > I provide this info in gist -
> > > >
> > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > >
> > > > > > > > > I will extend this test to print results to console in the next
> > > >
> > > > iterations - stay tuned :)
> > > > > > > > >
> > > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > > Hi, Nikolay,
> > > > > > > > > >
> > > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > >
> > > > > > > > > > It would be better to implement it as a separate module, I
> > > >
> > > > can't say
> > > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > > believe that HttpExposer with Jetty's dependencies should be
> > > >
> > > > detached
> > > > > > > > > > from the core module.
> > > > > > > > > >
> > > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using
> > > >
> > > > 'ServiceConfiguration'
> > > > > > > > > > directly as a monitored object for services. I believe we
> > > >
> > > > shouldn't
> > > > > > > > > > mix approaches. It'd be better always use some kind of
> > > >
> > > > container with
> > > > > > > > > > monitored object's information to work with such data.
> > > > > > > > > >
> > > > > > > > > > In my opinion, each sensor should have a timestamp. Usually
> > > >
> > > > monitoring
> > > > > > > > > > systems aggregate data and build graphics according to sensors
> > > > > > > > > > timestamp.
> > > > > > > > > >
> > > > > > > > > > Also, it'd be great to have an ability to store a list of a
> > > >
> > > > fixed size
> > > > > > > > > > of last N sensors, not to miss them without pushing to an
> > > >
> > > > external
> > > > > > > > > > monitoring system.
> > > > > > > > > >
> > > > > > > > > > It'd be great if you provide a more extended test to show the
> > > >
> > > > work of
> > > > > > > > > > the system. Everybody who looks to PR needs to run the test
> > > >
> > > > and get
> > > > > > > > > > the info manually to see the completeness of sensors, this
> > > >
> > > > might be
> > > > > > > > > > simplified by proper test.
> > > > > > > > > >
> > > > > > > > > > Thank you!
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <
> > > >
> > > > nizhikov@apache.org> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hello, Igniters.
> > > > > > > > > > >
> > > > > > > > > > > I've prepared Proof of Concept for IEP-35 [1]
> > > > > > > > > > > PR can be found here -
> > > >
> > > > https://github.com/apache/ignite/pull/6510
> > > > > > > > > > >
> > > > > > > > > > > I've done following changes:
> > > > > > > > > > >
> > > > > > > > > > >         1. `GridMonitoringManager`  [2] - simple
> > > >
> > > > implementation of manager to store all monitoring info
> > > > > > > > > > >         2. `HttpPullExposerSpi` [3] - pull exposer
> > > >
> > > > implementation that can respond with JSON from
> > > > http://localhost:8080/ignite/monitoring. JSON content can be viewed in
> > > > gist [4]
> > > > > > > > > > >         3. Compute task start and finish monitoring in
> > > >
> > > > "compute" list [5]
> > > > > > > > > > >         4. Service registration are monitored in "service"
> > > >
> > > > list - [6]
> > > > > > > > > > >         5. Current `IgniteSpiMBeanAdapter` rewritten using
> > > >
> > > > `GridMonitoringManager` [7]
> > > > > > > > > > >
> > > > > > > > > > > Design principles, monitoring subsystem details and new
> > > >
> > > > Ignite entities can be found in IEP [1].
> > > > > > > > > > >
> > > > > > > > > > > My next steps will be:
> > > > > > > > > > >
> > > > > > > > > > >         1. Implementation of JMX exposer
> > > > > > > > > > >         2. Registration of all "lists" and "sensor groups"
> > > >
> > > > as a SQL System view.
> > > > > > > > > > >         3. Add monitoring for all unmonitoring Ignite API.
> > > >
> > > > (described in IEP).
> > > > > > > > > > >         4. Rewrite existing jmx metrics using
> > > >
> > > > GridMonitoringManager.
> > > > > > > > > > >
> > > > > > > > > > > Please, share you thoughts.
> > > > > > > > > > >
> > > > > > > > > > > Part of JSON file:
> > > > > > > > > > > ```
> > > > > > > > > > >     "COMPUTE": {
> > > > > > > > > > >       "tasks": {
> > > > > > > > > > >         "name": "tasks",
> > > > > > > > > > >         "rows": [
> > > > > > > > > > >           {
> > > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > > >             "sessionId":
> > > >
> > > > "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > > >             "data": {
> > > > > > > > > > >               "taskClasName":
> > > >
> > > > "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > > >               "execName": null
> > > > > > > > > > >             },
> > > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > > >           }
> > > > > > > > > > > ```
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > >
> > > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > > [2]
> > > >
> > > > https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > > [3]
> > > >
> > > > https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > > [4]
> > > >
> > > > https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > > [5]
> > > >
> > > > https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > > [6]
> > > >
> > > > https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > > [7]
> > > >
> > > > https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Nikolay Izhikov <ni...@apache.org>.
Hello, Andrey.

Thanks for joining the review.

The basic interface for an object list is `MonitoringList`. It provides the following features:
	* name.
	* description.
	* row class.
	* size.
	* iterator for the list content.
	* attribute walker (described below).

`MonitoringRow` is a marker interface for classes that can be used as monitoring list content.

Internally, there is only one implementation of `MonitoringList` for now: `MonitoringListAdapter`.
It adapts the content of a `ConcurrentMap`, which is widely used in Ignite internals.
I think there will be other implementations in the follow-up PRs.
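
As a rough illustration of the adapter idea (not the actual `MonitoringListAdapter`
code from the PR), a list can be a thin read-only view over a live `ConcurrentMap`.
The class name `MapBackedList` and its constructor are assumptions made up for
this sketch; only the exposed features (name, description, row class, size,
iterator) follow the list above.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical read-only list view backed by a live ConcurrentMap. */
final class MapBackedList<R> implements Iterable<R> {
    private final String name;
    private final String description;
    private final Class<R> rowClass;
    private final ConcurrentMap<?, R> data;

    MapBackedList(String name, String description, Class<R> rowClass, ConcurrentMap<?, R> data) {
        this.name = name;
        this.description = description;
        this.rowClass = rowClass;
        this.data = data;
    }

    String name() { return name; }

    String description() { return description; }

    Class<R> rowClass() { return rowClass; }

    /** Size of the underlying map at the moment of the call; nothing is copied. */
    int size() { return data.size(); }

    /** Iterates the live values; the wrapper blocks removal through the iterator. */
    @Override public Iterator<R> iterator() {
        return Collections.unmodifiableCollection(data.values()).iterator();
    }
}

final class MapBackedListDemo {
    public static void main(String[] args) {
        ConcurrentMap<Integer, String> caches = new ConcurrentHashMap<>();
        MapBackedList<String> list = new MapBackedList<>("caches", "Cache list", String.class, caches);

        caches.put(1, "default");

        // The view reflects the map without any explicit registration of the new row.
        for (String row : list)
            System.out.println(list.name() + ": " + row);
    }
}
```

With a `ConcurrentHashMap` underneath, iteration is weakly consistent, so an
exporter reading the list concurrently with updates would not fail, but it may
or may not see rows added during the iteration.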

Public API changes:

* New registry `ReadOnlyMonitoringListRegistry` created. It provides:
	* Access to all lists that exist in Ignite.
	* The ability to subscribe to list creation/removal events.

* `MetricExporterSpi` changes:
	* `setMonitoringListRegistry` method added
	* `setMonitoringListExportFilter` method added.

`MonitoringRowAttributeWalker` is a helper class for exporter implementations.
Usually, an exporter SPI iterates over `MonitoringRow` attributes;
`SqlViewExporterSpi` and `JmxMetricExporterSpi` can be taken as examples.
This can be implemented with the Java reflection API, but I use a faster approach.
`MonitoringRowAttributeWalker` can visit each attribute of a `MonitoringRow` implementation.
It also preserves the order provided by the author of the `MonitoringRow` implementation.
It provides 2 main methods:
	* `visitAll(AttributeVisitor visitor);` - visits each attribute of a monitoring row class. Provides the index, name and class of each attribute to the consumer.
	* `visitAll(R row, AttributeWithValueVisitor visitor)` - visits each attribute of a monitoring row instance. Provides the index, name, class and value of each attribute to the consumer.
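
A simplified sketch of what a walker for one row class might look like is given
below. It is not copied from the PR: `CacheRow`, `CacheRowWalker`, `WalkerDemo`
and the exact visitor signatures are assumptions chosen to mirror the two
`visitAll` methods described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical row type; real MonitoringRow implementations in the PR differ. */
final class CacheRow {
    final String cacheName;
    final int cacheId;

    CacheRow(String cacheName, int cacheId) {
        this.cacheName = cacheName;
        this.cacheId = cacheId;
    }
}

/** Receives attribute metadata only (assumed signature). */
interface AttributeVisitor {
    void accept(int idx, String name, Class<?> type);
}

/** Receives attribute metadata together with the value (assumed signature). */
interface AttributeWithValueVisitor {
    void accept(int idx, String name, Class<?> type, Object val);
}

/** Hand-written or generated walker: no reflection, fixed attribute order. */
final class CacheRowWalker {
    void visitAll(AttributeVisitor v) {
        v.accept(0, "cacheName", String.class);
        v.accept(1, "cacheId", int.class);
    }

    void visitAll(CacheRow row, AttributeWithValueVisitor v) {
        v.accept(0, "cacheName", String.class, row.cacheName);
        v.accept(1, "cacheId", int.class, row.cacheId);
    }
}

final class WalkerDemo {
    /** Builds column descriptors the way a column-oriented exporter might. */
    static List<String> columns(CacheRowWalker walker) {
        List<String> cols = new ArrayList<>();
        walker.visitAll((idx, name, type) -> cols.add(name + ":" + type.getSimpleName()));
        return cols;
    }

    /** Collects the attribute values of one row in walker order. */
    static List<Object> values(CacheRowWalker walker, CacheRow row) {
        List<Object> vals = new ArrayList<>();
        walker.visitAll(row, (idx, name, type, val) -> vals.add(val));
        return vals;
    }

    public static void main(String[] args) {
        CacheRowWalker walker = new CacheRowWalker();
        System.out.println(columns(walker));
        System.out.println(values(walker, new CacheRow("default", 42)));
    }
}
```

An exporter would call the first method once to build its column set (SQL
columns, JMX attribute names) and the second one for every row it exports.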


On Wed, 11/09/2019 at 16:30 +0300, Andrey Gura wrote:
> Nikolai,
> 
> I'm trying to review this PR but it is too large.
> 
> Could you please describe problem and design of implemented solution?
> Also javadocs for base interfaces aren't clear, too brief and doesn't
> give any imagine about whole picture.
> 
> At present it is very hard to understand the purposes of new
> interfaces and walker generator, and design itself.
> 
> On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
> > 
> > Hello, Igniters.
> > 
> > IEP-35. Monitoring&Profiling. Phase2 is ready [1]
> > Please, join to the review!
> > 
> > I've implemented:
> > 
> > * Monitoring list engine.
> > * Following list implemented:
> >     * Cache list
> >     * Cache group list
> >     * Compute task list
> >     * Service list.
> > 
> > Engine details:
> > 
> > * `MonitoringList` added to store list data.
> > * Base interface `MonitoringRow` for list data created.
> > * Corresponding method added to `MetricExporterSpi`
> > * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> > support list export.
> > * JMX, SQL and other column-oriented SPI uses
> > `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> > * Implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> > can be generated with `MonitoringRowAttributeWalkerGenerator`
> > 
> > I prepare follow-up PR [2], also.
> > Following lists implemented:
> > 
> > * SQL tables
> > * SQL indexes
> > * SQL schemas
> > * SQL queries
> > * Continuous queries
> > * Text queries
> > * Transactions
> > * Cluster nodes
> > * Client connections(JDBC, ODBC, Thin)
> > 
> > [1] https://github.com/apache/ignite/pull/6845
> > [2] https://github.com/apache/ignite/pull/6790
> > 
> > 
> > 
> > Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org>:
> > 
> > > Hello, Igniters.
> > > 
> > > Since Phase 1 will be merged in master soon I've created the ticket [1]
> > > for Phase 2.
> > > 
> > > Scope of Phase 2(copy-paste from the ticket)
> > > 
> > > Ability to collect lists of some internal object Ignite manage.
> > > Examples of such objects:
> > > 
> > >   * Caches
> > >   * Queries (including continuous queries)
> > >   * Services
> > >   * Compute tasks
> > >   * Distributed Data Structures
> > >   * etc...
> > > 
> > > 
> > > 1. Fields for each list(that doesn't currently exists in Ignite) will be
> > > discussed in separate tickets
> > > 2. Metric Exporters (optionally) can support list export.
> > > 
> > > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> > > 
> > > 
> > > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > > Ticket for IEP.Phase1 created -
> > > 
> > > https://issues.apache.org/jira/browse/IGNITE-11848
> > > > 
> > > > 
> > > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > > Hello, Igniters.
> > > > > 
> > > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton
> > > 
> > > Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > > > 
> > > > > Issues to address:
> > > > > 
> > > > > 1. Study experience of following libs, tools:
> > > > >     * OpenTracing
> > > > >     * OpenCensus
> > > > >     * DropWizard
> > > > > 
> > > > > 2. Support histogram sensor: Sensor that collects values that gets
> > > 
> > > into predefined segments
> > > > > 
> > > > > 3. Use more widely used naming (like in OpenCensus?)
> > > > > 
> > > > > 4. Consider the usage of OpenCensus as a default implementation for
> > > 
> > > local metric storage.
> > > > > 
> > > > > 5. To measure the performance penalty for metrics for 5_000 caches.
> > > > > 
> > > > > 6. Some metrics should be part of public API and others are not(may be
> > > 
> > > changed/removed in release without warnings).
> > > > > 
> > > > > My plan for Phase #1 is the following:
> > > > > 
> > > > > 1. Address the issues.
> > > > > 2. Prepare public API
> > > > > 3. Prepare PR for monitoring subsystem + existing metrics rewritten
> > > 
> > > with it.
> > > > > 4. Prepare a PR with lists of each user API.
> > > > > 5. Collect feedback for a #4.
> > > > > 6. Design a log exposer. Consider the usage of JFR format or some
> > > 
> > > other widely used, tool compatible format.
> > > > > 
> > > > > [1]
> > > 
> > > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > 
> > > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > > Hello, Maxim.
> > > > > > 
> > > > > > > How will throughput sensor values, which require an interval for the rate calculation, be recorded?
> > > > > >
> > > > > > I answered this question in the IEP, under "Design principles":
> > > > > >
> > > > > > ```
> > > > > > Sensors should contain only raw values. No aggregation of numeric metrics on Ignite side.
> > > > > > Min, max, avg and other functions are the matter of an external monitoring system.
> > > > > > ```
> > > > > > 
> > > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > > where S(t) is the sensor value in some point of time t.
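A minimal Java sketch of how an external system might apply the formula above to two raw sensor readings. The class `ThroughputSketch`, its nested `Sample` type and the millisecond time unit are assumptions made for illustration; none of them come from the IEP or from Ignite.

```java
/** Hypothetical external-side helper; not part of Ignite. */
final class ThroughputSketch {
    /** One raw reading of a monotonically growing sensor. */
    static final class Sample {
        /** S(t): raw sensor value. */
        final long value;

        /** t: time of the reading, in milliseconds (assumed unit). */
        final long timeMillis;

        Sample(long value, long timeMillis) {
            this.value = value;
            this.timeMillis = timeMillis;
        }
    }

    /** Computes (S(t2) - S(t1)) / (t2 - t1), scaled to operations per second. */
    static double throughputPerSecond(Sample s1, Sample s2) {
        long dt = s2.timeMillis - s1.timeMillis;

        if (dt <= 0)
            throw new IllegalArgumentException("Samples must be ordered in time");

        return (s2.value - s1.value) * 1000.0 / dt;
    }

    public static void main(String[] args) {
        Sample first = new Sample(1_000, 0);
        Sample second = new Sample(1_500, 2_000);

        System.out.println(throughputPerSecond(first, second)); // 250.0
    }
}
```

The node only has to expose the raw counter; the sampling interval is chosen by whoever polls it.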
> > > > > > 
> > > > > > Seems, throughput calculation is a responsibility of an external system.
> > > > > > 
> > > > > > What do you think?
> > > > > > 
> > > > > > > It seems to me that we can add an additional parameter of `sensitivityLevel` to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > >
> > > > > > For now, I think that all sensors and lists will be very (very!) lightweight.
> > > > > > So, we should be able to disable/enable them, for sure.
> > > > > >
> > > > > > But we should turn off and turn on the whole Ignite subsystem
> > > > > > for the case when we have strong performance limitations for a particular workload.
> > > > > >
> > > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for profiling: IEP-35 - Phase 3).
> > > > > > For example, AFAIK we can't disable the current SQL system views (why should we?).
> > > > > > 
> > > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > > Hello Nikolay,
> > > > > > > 
> > > > > > > I've looked through your PRs changes.
> > > > > > > 
> > > > > > > > Sensors
> > > > > > > 
> > > > > > > How will throughput sensor values, which require an interval for the rate calculation, be recorded?
> > > > > > > Do we have such an example? For
> > > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > > out of the scope of the current PoC and IEP as they are not related to the
> > > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > > >
> > > > > > > It seems to me that we can add an additional parameter of
> > > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > > > 
> > > > > > > It also seems that for the sensors getValue() the completely
> > > > > > > functional java approach can be used. Am I right?
> > > > > > > 
> > > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > > > 
> > > > > > > > Hello, Vyacheslav.
> > > > > > > > 
> > > > > > > > Thanks for the feedback!
> > > > > > > > 
> > > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > > >
> > > > > > > > Agreed. Module hierarchy is the essence of the next steps.
> > > > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can discuss.
> > > > > > > >
> > > > > > > > > I like your approach with 'wrapper' for monitored objects, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > > >
> > > > > > > > Agreed in general.
> > > > > > > > Seems, choosing the right data to expose is a matter of separate discussion for each Ignite entity.
> > > > > > > > I've planned to file tickets for each entity so anyone interested can share their vision there.
> > > > > > > >
> > > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > > >
> > > > > > > > I'm not sure that *every* sensor should have a directly associated timestamp.
> > > > > > > > Seems, we should support sensors without a timestamp, at least for current monitoring numbers.
> > > > > > > > 
> > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > > >
> > > > > > > > What use cases do you know for such sensors?
> > > > > > > > We have plans to support fixed-size lists to show "Last N SQL queries" or similar data.
> > > > > > > > Essentially, a sensor is just a single value with a name and a known meaning.
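A minimal Java sketch of a fixed-size "last N" list like the one mentioned above (for example, "Last N SQL queries"), assuming the oldest entry is simply dropped once the capacity is reached. `LastNListSketch` and its methods are hypothetical names used for illustration only, not Ignite API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: keeps only the most recent N items. */
final class LastNListSketch<T> {
    private final int capacity;

    private final ArrayDeque<T> items = new ArrayDeque<>();

    LastNListSketch(int capacity) {
        if (capacity <= 0)
            throw new IllegalArgumentException("capacity must be positive");

        this.capacity = capacity;
    }

    /** Adds an item, evicting the oldest one when the list is full. */
    synchronized void add(T item) {
        if (items.size() == capacity)
            items.removeFirst();

        items.addLast(item);
    }

    /** Returns a copy ordered from oldest to newest, safe to hand to an exporter. */
    synchronized List<T> snapshot() {
        return new ArrayList<>(items);
    }

    public static void main(String[] args) {
        LastNListSketch<String> last = new LastNListSketch<>(2);

        last.add("SELECT 1");
        last.add("SELECT 2");
        last.add("SELECT 3");

        System.out.println(last.snapshot()); // [SELECT 2, SELECT 3]
    }
}
```

Coarse synchronization is used here for brevity; a real implementation on a hot path would likely need something cheaper.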
> > > > > > > > 
> > > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > > >
> > > > > > > > Sorry for that :)
> > > > > > > > When you run 'MonitoringSelfTest' you should open http://localhost:8080/ignite/monitoring to view the exposed info.
> > > > > > > > I provide this info in a gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > >
> > > > > > > > I will extend this test to print results to the console in the next iterations - stay tuned :)
> > > > > > > > 
> > > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > > Hi, Nikolay,
> > > > > > > > > 
> > > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > > > 
> > > > > > > > > It would be better to implement it as a separate module. I can't say
> > > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > > from the core module.
> > > > > > > > >
> > > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > > the monitored object's information to work with such data.
> > > > > > > > >
> > > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > > systems aggregate data and build graphs according to sensor
> > > > > > > > > timestamps.
> > > > > > > > >
> > > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size
> > > > > > > > > of last N sensors, not to miss them without pushing to an external
> > > > > > > > > monitoring system.
> > > > > > > > >
> > > > > > > > > It'd be great if you provide a more extended test to show the work of
> > > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > > the info manually to see the completeness of sensors; this might be
> > > > > > > > > simplified by a proper test.
> > > > > > > > > 
> > > > > > > > > Thank you!
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > > > 
> > > > > > > > > > Hello, Igniters.
> > > > > > > > > > 
> > > > > > > > > > I've prepared a Proof of Concept for IEP-35 [1].
> > > > > > > > > > The PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > > >
> > > > > > > > > > I've made the following changes:
> > > > > > > > > >
> > > > > > > > > >         1. `GridMonitoringManager` [2] - a simple implementation of a manager to store all monitoring info
> > > > > > > > > >         2. `HttpPullExposerSpi` [3] - a pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. The JSON content can be viewed in the gist [4]
> > > > > > > > > >         3. Compute task start and finish are monitored in the "compute" list [5]
> > > > > > > > > >         4. Service registration is monitored in the "service" list [6]
> > > > > > > > > >         5. The current `IgniteSpiMBeanAdapter` is rewritten using `GridMonitoringManager` [7]
> > > > > > > > > >
> > > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in the IEP [1].
> > > > > > > > > >
> > > > > > > > > > My next steps will be:
> > > > > > > > > >
> > > > > > > > > >         1. Implementation of a JMX exposer.
> > > > > > > > > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > > > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs (described in the IEP).
> > > > > > > > > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > > > > > > > > >
> > > > > > > > > > Please share your thoughts.
> > > > > > > > > > 
> > > > > > > > > > Part of JSON file:
> > > > > > > > > > ```
> > > > > > > > > >     "COMPUTE": {
> > > > > > > > > >       "tasks": {
> > > > > > > > > >         "name": "tasks",
> > > > > > > > > >         "rows": [
> > > > > > > > > >           {
> > > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > > >             "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > > >             "data": {
> > > > > > > > > >               "taskClasName": "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > > >               "execName": null
> > > > > > > > > >             },
> > > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > > >           }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > > [2] https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > > [3] https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > > [4] https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > > [5] https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > > [6] https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > > [7] https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 

Re: [IEP-35] Monitoring & Profiling. Phase 2

Posted by Andrey Gura <ag...@apache.org>.
Nikolai,

I'm trying to review this PR but it is too large.

Could you please describe the problem and the design of the implemented solution?
Also, the javadocs for the base interfaces aren't clear: they are too brief and don't
give any idea of the whole picture.

At present it is very hard to understand the purposes of the new
interfaces and the walker generator, and the design itself.

On Fri, Sep 6, 2019 at 3:16 PM Nikolay Izhikov <ni...@apache.org> wrote:
>
> Hello, Igniters.
>
> IEP-35. Monitoring & Profiling. Phase 2 is ready [1].
> Please join the review!
>
> I've implemented:
>
> * Monitoring list engine.
> * The following lists:
>     * Cache list
>     * Cache group list
>     * Compute task list
>     * Service list.
>
> Engine details:
>
> * `MonitoringList` added to store list data.
> * Base interface `MonitoringRow` for list data created.
> * Corresponding method added to `MetricExporterSpi`.
> * `JmxMetricExporterSpi`, `SqlViewExporterSpi`, `LogExporterSpi` updated to
> support list export.
> * JMX, SQL and other column-oriented SPIs use
> `MonitoringRowAttributeWalker` to quickly traverse all list row attributes.
> * An implementation of `MonitoringRowAttributeWalker` for a specific `MonitoringRow`
> can be generated with `MonitoringRowAttributeWalkerGenerator`.
>
> I have also prepared a follow-up PR [2].
> The following lists are implemented there:
>
> * SQL tables
> * SQL indexes
> * SQL schemas
> * SQL queries
> * Continuous queries
> * Text queries
> * Transactions
> * Cluster nodes
> * Client connections(JDBC, ODBC, Thin)
>
> [1] https://github.com/apache/ignite/pull/6845
> [2] https://github.com/apache/ignite/pull/6790
>
>
>
> On Mon, 10 Jun 2019 at 13:49, Nikolay Izhikov <ni...@apache.org> wrote:
>
> > Hello, Igniters.
> >
> > Since Phase 1 will be merged into master soon, I've created the ticket [1]
> > for Phase 2.
> >
> > Scope of Phase 2 (copy-paste from the ticket):
> >
> > Ability to collect lists of some internal objects that Ignite manages.
> > Examples of such objects:
> >
> >   * Caches
> >   * Queries (including continuous queries)
> >   * Services
> >   * Compute tasks
> >   * Distributed Data Structures
> >   * etc...
> >
> >
> > 1. Fields for each list (that doesn't currently exist in Ignite) will be
> > discussed in separate tickets.
> > 2. Metric Exporters (optionally) can support list export.
> >
> > [1] https://issues.apache.org/jira/browse/IGNITE-11905
> >
> >
> > On Tue, 14/05/2019 at 16:42 +0300, Nikolay Izhikov wrote:
> > > Ticket for IEP.Phase1 created - https://issues.apache.org/jira/browse/IGNITE-11848
> > >
> > >
> > > On Mon, 13/05/2019 at 18:06 +0300, Nikolay Izhikov wrote:
> > > > Hello, Igniters.
> > > >
> > > > We have discussed this IEP [1] with Alexey Goncharyuk, Anton Vinogradov, Andrey Gura, Alexey Scherbakov and Pavel Kovalenko.
> > > >
> > > > Issues to address:
> > > >
> > > > 1. Study the experience of the following libraries and tools:
> > > >     * OpenTracing
> > > >     * OpenCensus
> > > >     * DropWizard
> > > >
> > > > 2. Support a histogram sensor: a sensor that counts values falling into predefined segments.
> > > >
> > > > 3. Use more widely adopted naming (like in OpenCensus?).
> > > >
> > > > 4. Consider using OpenCensus as the default implementation for local metric storage.
> > > >
> > > > 5. Measure the performance penalty of metrics for 5_000 caches.
> > > >
> > > > 6. Some metrics should be part of the public API and others should not (they may be changed or removed in a release without warning).
> > > >
> > > > My plan for Phase #1 is the following:
> > > >
> > > > 1. Address the issues.
> > > > 2. Prepare the public API.
> > > > 3. Prepare a PR for the monitoring subsystem, with existing metrics rewritten on top of it.
> > > > 4. Prepare a PR with lists for each user API.
> > > > 5. Collect feedback on #4.
> > > > 6. Design a log exposer. Consider using the JFR format or some other widely used, tool-compatible format.
> > > >
> > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > >
> > > > On Thu, 02/05/2019 at 14:02 +0300, Nikolay Izhikov wrote:
> > > > > Hello, Maxim.
> > > > >
> > > > > > How will throughput sensor values, which require an interval for the rate calculation, be recorded?
> > > > >
> > > > > I answered this question in the IEP, under "Design principles":
> > > > >
> > > > > ```
> > > > > Sensors should contain only raw values. No aggregation of numeric metrics on Ignite side.
> > > > > Min, max, avg and other functions are the matter of an external monitoring system.
> > > > > ```
> > > > >
> > > > > Throughput is a function `(S(t2) - S(t1))/(t2-t1)`
> > > > > where S(t) is the sensor value in some point of time t.
> > > > >
> > > > > Seems, throughput calculation is a responsibility of an external system.
> > > > >
> > > > > What do you think?
> > > > >
> > > > > > It seems to me that we can add an additional parameter of `sensitivityLevel` to provide for the user a flexible sensor control (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > >
> > > > > For now, I think that all sensors and lists will be very (very!) lightweight.
> > > > > So, we should be able to disable/enable them, for sure.
> > > > >
> > > > > But we should turn off and turn on the whole Ignite subsystem
> > > > > for the case when we have strong performance limitations for a particular workload.
> > > > >
> > > > > So, we have two "levels" of monitoring - INFO and DEBUG (for profiling: IEP-35 - Phase 3).
> > > > > For example, AFAIK we can't disable the current SQL system views (why should we?).
> > > > >
> > > > > On Tue, 30/04/2019 at 14:33 +0300, Maxim Muzafarov wrote:
> > > > > > Hello Nikolay,
> > > > > >
> > > > > > I've looked through your PRs changes.
> > > > > >
> > > > > > > Sensors
> > > > > >
> > > > > > How will throughput sensor values, which require an interval for the rate calculation, be recorded?
> > > > > > Do we have such an example? For
> > > > > > instance, getAllocationRate() or getEvictionRate(). These metrics are
> > > > > > out of the scope of the current PoC and IEP as they are not related to the
> > > > > > user metrics, but it is a good example of a particular metric type.
> > > > > >
> > > > > > It seems to me that we can add an additional parameter of
> > > > > > `sensitivityLevel` to provide for the user a flexible sensor control
> > > > > > (e.g., INFO, WARN, NOTICE, DEBUG).
> > > > > >
> > > > > > It also seems that for the sensors getValue() the completely
> > > > > > functional java approach can be used. Am I right?
> > > > > >
> > > > > > On Mon, 29 Apr 2019 at 11:44, Nikolay Izhikov <ni...@apache.org> wrote:
> > > > > > >
> > > > > > > Hello, Vyacheslav.
> > > > > > >
> > > > > > > Thanks for the feedback!
> > > > > > >
> > > > > > > > HttpExposer with Jetty's dependencies should be detached from the core module.
> > > > > > >
> > > > > > > Agreed. Module hierarchy is the essence of the next steps.
> > > > > > > For now it is just a proof of my ideas for Ignite monitoring that we can discuss.
> > > > > > >
> > > > > > > > I like your approach with 'wrapper' for monitored objects, and don't like using 'ServiceConfiguration' directly as a monitored object for services
> > > > > > >
> > > > > > > Agreed in general.
> > > > > > > Seems, choosing the right data to expose is a matter of separate discussion for each Ignite entity.
> > > > > > > I've planned to file tickets for each entity so anyone interested can share their vision there.
> > > > > > >
> > > > > > > > In my opinion, each sensor should have a timestamp.
> > > > > > >
> > > > > > > I'm not sure that *every* sensor should have a directly associated timestamp.
> > > > > > > Seems, we should support sensors without a timestamp, at least for current monitoring numbers.
> > > > > > >
> > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size of last N sensors
> > > > > > >
> > > > > > > What use cases do you know for such sensors?
> > > > > > > We have plans to support fixed-size lists to show "Last N SQL queries" or similar data.
> > > > > > > Essentially, a sensor is just a single value with a name and a known meaning.
> > > > > > >
> > > > > > > > It'd be great if you provide a more extended test to show the work of the system.
> > > > > > >
> > > > > > > Sorry for that :)
> > > > > > > When you run 'MonitoringSelfTest' you should open http://localhost:8080/ignite/monitoring to view the exposed info.
> > > > > > > I provide this info in a gist - https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > >
> > > > > > > I will extend this test to print results to the console in the next iterations - stay tuned :)
> > > > > > >
> > > > > > > On Sun, 28/04/2019 at 23:35 +0300, Vyacheslav Daradur wrote:
> > > > > > > > Hi, Nikolay,
> > > > > > > >
> > > > > > > > I looked through PR and IEP, and I have some comments:
> > > > > > > >
> > > > > > > > It would be better to implement it as a separate module. I can't say
> > > > > > > > if it is possible for the main part of monitoring or not, but I
> > > > > > > > believe that HttpExposer with Jetty's dependencies should be detached
> > > > > > > > from the core module.
> > > > > > > >
> > > > > > > > I like your approach with 'wrapper' for monitored objects, like
> > > > > > > > 'ComputeTaskInfo' in PR, and don't like using 'ServiceConfiguration'
> > > > > > > > directly as a monitored object for services. I believe we shouldn't
> > > > > > > > mix approaches. It'd be better to always use some kind of container with
> > > > > > > > the monitored object's information to work with such data.
> > > > > > > >
> > > > > > > > In my opinion, each sensor should have a timestamp. Usually monitoring
> > > > > > > > systems aggregate data and build graphs according to sensor
> > > > > > > > timestamps.
> > > > > > > >
> > > > > > > > Also, it'd be great to have an ability to store a list of a fixed size
> > > > > > > > of last N sensors, not to miss them without pushing to an external
> > > > > > > > monitoring system.
> > > > > > > >
> > > > > > > > It'd be great if you provide a more extended test to show the work of
> > > > > > > > the system. Everybody who looks at the PR needs to run the test and get
> > > > > > > > the info manually to see the completeness of sensors; this might be
> > > > > > > > simplified by a proper test.
> > > > > > > >
> > > > > > > > Thank you!
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Apr 26, 2019 at 5:56 PM Nikolay Izhikov <nizhikov@apache.org> wrote:
> > > > > > > > >
> > > > > > > > > Hello, Igniters.
> > > > > > > > >
> > > > > > > > > I've prepared a Proof of Concept for IEP-35 [1].
> > > > > > > > > The PR can be found here - https://github.com/apache/ignite/pull/6510
> > > > > > > > >
> > > > > > > > > I've made the following changes:
> > > > > > > > >
> > > > > > > > >         1. `GridMonitoringManager` [2] - a simple implementation of a manager to store all monitoring info
> > > > > > > > >         2. `HttpPullExposerSpi` [3] - a pull exposer implementation that can respond with JSON from http://localhost:8080/ignite/monitoring. The JSON content can be viewed in the gist [4]
> > > > > > > > >         3. Compute task start and finish are monitored in the "compute" list [5]
> > > > > > > > >         4. Service registration is monitored in the "service" list [6]
> > > > > > > > >         5. The current `IgniteSpiMBeanAdapter` is rewritten using `GridMonitoringManager` [7]
> > > > > > > > >
> > > > > > > > > Design principles, monitoring subsystem details and new Ignite entities can be found in the IEP [1].
> > > > > > > > >
> > > > > > > > > My next steps will be:
> > > > > > > > >
> > > > > > > > >         1. Implementation of a JMX exposer.
> > > > > > > > >         2. Registration of all "lists" and "sensor groups" as SQL system views.
> > > > > > > > >         3. Add monitoring for all unmonitored Ignite APIs (described in the IEP).
> > > > > > > > >         4. Rewrite existing JMX metrics using GridMonitoringManager.
> > > > > > > > >
> > > > > > > > > Please share your thoughts.
> > > > > > > > >
> > > > > > > > > Part of JSON file:
> > > > > > > > > ```
> > > > > > > > >     "COMPUTE": {
> > > > > > > > >       "tasks": {
> > > > > > > > >         "name": "tasks",
> > > > > > > > >         "rows": [
> > > > > > > > >           {
> > > > > > > > >             "id": "0798817a-eeec-4386-9af7-94edb39ffced",
> > > > > > > > >             "sessionId": "a1814f95a61-912451ff-ca7b-4764-a7fd-728f6a900000",
> > > > > > > > >             "data": {
> > > > > > > > >               "taskClasName": "org.apache.ignite.monitoring.MonitoringSelfTest$$Lambda$145/1500885480",
> > > > > > > > >               "startTime": 1556287337944,
> > > > > > > > >               "timeout": 9223372036854776000,
> > > > > > > > >               "execName": null
> > > > > > > > >             },
> > > > > > > > >             "name": "anotherBroadcast"
> > > > > > > > >           }
> > > > > > > > > ```
> > > > > > > > >
> > > > > > > > > [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=112820392
> > > > > > > > > [2] https://github.com/apache/ignite/pull/6510/files#diff-ec7d5cf5e35b99303deb9accee153c50R34
> > > > > > > > > [3] https://github.com/apache/ignite/pull/6510/files#diff-32239c45e0ae3b692af2eae7078e1436R47
> > > > > > > > > [4] https://gist.github.com/nizhikov/aa1e6222e6a3456472b881b8deb0e24d
> > > > > > > > > [5] https://github.com/apache/ignite/pull/6510/files#diff-d651ed29d07bd0c5ce291654a3254cc0R749
> > > > > > > > > [6] https://github.com/apache/ignite/pull/6510/files#diff-0b4e54fbda2b0da1c10eff48416336f6R1606
> > > > > > > > > [7] https://github.com/apache/ignite/pull/6510/files#diff-4398bf118150500e059069b3a1638ec7R61
> > > > > > > >
> > > > > > > >
> > > > > > > >
> >