Posted to dev@accumulo.apache.org by Supun Kamburugamuva <su...@gmail.com> on 2013/04/21 16:37:31 UTC

GSOC: Monitor Improvements

Hi all,

I would like to start writing the proposal for GSoC. I've put together
some initial high-level goals of the project. Please let me know what I can
improve.

Per table plots: Accumulo 594
---------------------

The goal of this is to display plots that explain the various activities
that happen per table. When we go to the tables page of the monitor and
select a specific table, it displays some information in a table format. We
can augment this information by showing graphs for

1. Ingest entries
2. Ingest data size
3. Scan entries
4. Scan data size

Per tablet plots
----------------------

As with the table plots, we can display information regarding tablet
servers on the tablet server page. The plots will display the same
information as the table plots, but with the data considered per tablet server.

Trace Visualization: Accumulo 1198
----------------------------

Since we are displaying graphs about each tablet and each table, we can add
major and minor compaction graphs to each table and each tablet.

Another option is to display this in a single graph on the overview page,
with different graph lines for different tables and tablets.

Server type information : Accumulo 807
---------------------------------

To display this information, we can add a new page and display the
information as a table. The table should specify the network address of the
server, the server type, whether it is active or inactive, etc.
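
As a rough illustration of the table described above, here is a minimal
sketch in Python. The ServerInfo record, the render_table helper, and the
sample addresses and server types are all hypothetical and chosen only for
illustration; they are not taken from the monitor code.

```python
from dataclasses import dataclass


@dataclass
class ServerInfo:
    """One row of the proposed server page (hypothetical field names)."""
    address: str      # network address, e.g. host:port
    server_type: str  # e.g. "tserver" or "master"
    active: bool      # whether the server is currently active


def render_table(servers):
    """Render the servers as a plain-text table with aligned columns."""
    header = ("Address", "Type", "Status")
    rows = [
        (s.address, s.server_type, "active" if s.active else "inactive")
        for s in servers
    ]
    all_rows = [header] + rows
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in all_rows) for i in range(3)]
    lines = [
        "  ".join(cell.ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in all_rows
    ]
    return "\n".join(lines)


# Hypothetical sample data.
servers = [
    ServerInfo("10.0.0.1:9997", "tserver", True),
    ServerInfo("10.0.0.2:9999", "master", False),
]
print(render_table(servers))
```

The real page would presumably be rendered as HTML by the monitor; the
plain-text output here only shows which columns the table would carry.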

Thanks,
Supun...

Re: GSOC: Monitor Improvements

Posted by Miguel Pereira <mi...@gmail.com>.

Mike, this might be what you are referring to, maybe not, for time series
visualization.

http://square.github.io/cubism/

Also, I found jmxtrans to be useful when writing metrics to ganglia /
graphite.

Cheers,
Miguel



Re: GSOC: Monitor Improvements

Posted by Keith Turner <ke...@deenlo.com>.

On Mon, Apr 22, 2013 at 12:42 PM, Supun Kamburugamuva <su...@gmail.com>wrote:

> Great.. we could certainly introduce the graph Mike and Keith have
> mentioned.
>

I mentioned that it would be useful to display info collected from clients.
Tracing already collects this info. The graph Mike mentioned may be
useful for displaying trace info, maybe a plot per trace field.



Re: GSOC: Monitor Improvements

Posted by Supun Kamburugamuva <su...@gmail.com>.

Great.. we could certainly introduce the graph Mike and Keith have
mentioned.

Supun..





-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: GSOC: Monitor Improvements

Posted by Keith Turner <ke...@deenlo.com>.

On Mon, Apr 22, 2013 at 11:42 AM, Mike Drob <md...@mdrob.com> wrote:

> Adding on to the comment about summaries, averages, and outliers. If, for
> some reason, you end up with a two-hump population, then simply showing
> averages will mask the split and lose a lot of valuable information. It is
> often valuable to know that a particular set of users or servers are
> experiencing degraded performance while the rest of the ecosystem is
> healthy.
>
> This isn't something that shows up in a regular time series because the
> secondary population is usually very small compared to the total
> population. There was a graph for request latency of a service that I saw
> once that I really wish I could find again, maybe somebody on the list will
> be able to chime in - It had timestamps on the x-axis, latency on the y,
> and each (x,y) point was colored on a gradient representing how many
> requests were fulfilled at time x with latency y. This chart made it
> immediately easy to see that most data points fit a normal distribution
> with a low mean, but there was also a cluster at the top for some reason.
>


That sounds really cool.  Maybe the y-axis/latency could be log scale.
Inevitably a 3004 second operation will finish and obscure the
smaller latencies.
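
To make the log-scale idea concrete, here is a minimal sketch in Python. The
log_bucket function and its buckets_per_decade parameter are hypothetical
names used only for illustration. It assigns each latency to a bucket by
order of magnitude, so one very long operation takes a few more buckets
instead of stretching the whole axis.

```python
import math


def log_bucket(latency_s, buckets_per_decade=1):
    """Map a positive latency (in seconds) to a log10 bucket index.

    With buckets_per_decade=1, bucket k covers latencies in [10**k, 10**(k+1)).
    """
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return math.floor(math.log10(latency_s) * buckets_per_decade)


# A 5 ms scan, a 500 ms scan, and a 3004 second operation land in
# separate buckets only six decades apart.
for latency in (0.005, 0.5, 3004):
    print(latency, log_bucket(latency))
```

On a linear axis scaled to 3004 seconds, the 5 ms and 500 ms points would be
visually indistinguishable near zero; on the log axis they stay two buckets
apart.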

Sometimes it's more useful to sample this type of info from the clients
rather than tablet servers. A tablet server may report low latencies, but
all clients using it may experience high latencies because of a network issue.
We could certainly consider making the client code report this info.


>
> I'd love to see that type of chart show up for tablet servers (probably not
> as useful for tables).
>
> Mike
>
>
> On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <er...@gmail.com>
> wrote:
>
> > Another thing to consider is scale.  On large clusters (many hundreds of
> > nodes), more data is not helpful for visualization.  Instead, summaries,
> > averages and outliers are important.
> >
> > For example, if one node is consistently slow, it is better to know that
> > than to see one graph with low numbers in a sea of graphs.
> >
> > If the monitor collects information using JMX, collection time for each
> > node would be a good thing to know, too.
> >
> > -Eric
> >
> >
> > On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <jo...@gmail.com>
> wrote:
> >
> > > Supun,
> > >
> > > Yup, very much so. Having a way to consume any and all metrics via JMX
> > > would simplify things for any consumers (internal or external).
> > >
> > >
> > >
> > > On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> > >
> > >> Hi Josh,
> > >>
> > >> Thanks for the suggestions. I'll incorporate these to the proposal.
> > >>
> > >> Another area I would like to work on is JMX. There is a Jira that says to
> > >> replace the Monitor calls from Thrift to JMX (Accumulo 694). Do you think
> > >> this is a good addition to the Monitor?
> > >>
> > >> Thanks,
> > >> Supun..
> > >>
> > >>
> > >> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com>
> > wrote:
> > >>
> > >>  Supun,
> > >>>
> > >>> Looks good! Can I make some suggestions/comments?
> > >>>
> > >>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
> > >>> compactions, major compactions, index cache hit rate, and data cache hit
> > >>> rate per table (same graphs that are displayed system-wide when you visit
> > >>> http://${MONITOR_HOST}:50095/).
> > >>>
> > >>> For "Per tablet [server] plots", it would be neat if you could also
> > >>> extract some general statistics like top N least performing, top N highest
> > >>> performing, etc. tablet servers. Ideally, this could correlate with servers
> > >>> that may be having problems :).
> > >>>
> > >>> Do you see these proposed changes as being sufficient for 3-4 months of
> > >>> 40hrs/week work? If you plan to really dig into these changes (perhaps
> > >>> reworking components of the monitor itself), I could perhaps see this. Do
> > >>> you have any ideas for more lofty goals that you could pursue as well? I
> > >>> don't want you/us to get one month into things and see you complete
> > >>> everything we initially planned to accomplish :)
> > >>>
> > >>> - Josh
> > >>>
> > >>>
> > >>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
> > >>>
> > >>>  Hi all,
> > >>>>
> > >>>> I would like to start writing the proposal for GSoC. I've put together
> > >>>> some initial high-level goals of the project. Please let me know what I
> > >>>> can improve.
> > >>>>
> > >>>> Per table plots: Accumulo 594
> > >>>> ---------------------
> > >>>>
> > >>>> The goal of this is to display plots that explain the various activities
> > >>>> that happen per table. When we go to the tables page of the monitor and
> > >>>> select a specific table, it displays some information in a table format.
> > >>>> We can augment this information by showing graphs for
> > >>>>
> > >>>> 1. Ingest entries
> > >>>> 2. Ingest data size
> > >>>> 3. Scan entries
> > >>>> 4. Scan data size
> > >>>>
> > >>>> Per tablet plots
> > >>>> ----------------------
> > >>>>
> > >>>> As with the table plots, we can display information regarding tablet
> > >>>> servers on the tablet server page. The plots will display the same
> > >>>> information as the table plots, but with the data considered per tablet
> > >>>> server.
> > >>>>
> > >>>> Trace Visualization: Accumulo 1198
> > >>>> ----------------------------
> > >>>>
> > >>>> Since we are displaying graphs about each tablet and each table, we can
> > >>>> add major and minor compaction graphs to each table and each tablet.
> > >>>>
> > >>>> Another option is to display this in a single graph on the overview page,
> > >>>> with different graph lines for different tables and tablets.
> > >>>>
> > >>>> Server type information : Accumulo 807
> > >>>> ---------------------------------
> > >>>>
> > >>>> To display this information, we can add a new page and display the
> > >>>> information as a table. The table should specify the network address of
> > >>>> the server, the server type, whether it is active or inactive, etc.
> > >>>>
> > >>>> Thanks,
> > >>>> Supun...
> > >>>>
> > >>>>
> > >>>>
> > >>
> > >
> >
>

Re: GSOC: Monitor Improvements

Posted by Mike Drob <md...@mdrob.com>.

Adding on to the comment about summaries, averages, and outliers. If, for
some reason, you end up with a two-hump population, then simply showing
averages will mask the split and lose a lot of valuable information. It is
often valuable to know that a particular set of users or servers is
experiencing degraded performance while the rest of the ecosystem is
healthy.
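
A small sketch of the masking effect, using made-up numbers. The latencies, the 95/5 split, and the `split_at_largest_gap` helper are all assumptions for illustration; none of them come from the monitor itself.

```python
from statistics import mean

# Hypothetical scan latencies in milliseconds: 95 healthy servers around
# 10 ms and 5 degraded servers around 400 ms.
healthy = [8.0 + (i % 5) for i in range(95)]
degraded = [380.0 + 10 * i for i in range(5)]
latencies = healthy + degraded


def split_at_largest_gap(values):
    """Split sorted values into two groups at the widest gap between neighbours.

    This is a deliberately crude way to separate two humps; it is only
    meant to show that the groups exist, not to be a robust clustering.
    """
    ordered = sorted(values)
    if len(ordered) < 2:
        return ordered, []
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    return ordered[:cut], ordered[cut:]


low, high = split_at_largest_gap(latencies)

# The overall average (29.5 ms) describes neither group.
print("overall mean:", mean(latencies))
print("main group:  ", len(low), "servers, mean", mean(low))
print("second group:", len(high), "servers, mean", mean(high))
```

The single average sits well above the healthy servers and far below the degraded ones, so a plot of the mean alone would hide the five servers that actually need attention.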

This isn't something that shows up in a regular time series because the
secondary population is usually very small compared to the total
population. I once saw a graph of request latency for a service that I
really wish I could find again; maybe somebody on the list will be able to
chime in. It had timestamps on the x-axis, latency on the y-axis, and each
(x,y) point was colored on a gradient representing how many requests were
fulfilled at time x with latency y. This chart made it immediately easy to
see that most data points fit a normal distribution with a low mean, but
there was also a cluster at the top for some reason.
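
A rough sketch of how the data behind such a chart could be prepared, assuming the input is a list of (timestamp in seconds, latency in milliseconds) pairs. The function names, bucket sizes, and sample data are hypothetical, and the text rendering only stands in for a real color gradient.

```python
from collections import Counter


def bin_requests(samples, time_bucket_s, latency_bucket_ms):
    """Count requests per (time bucket, latency bucket) cell."""
    counts = Counter()
    for ts, latency in samples:
        cell = (int(ts // time_bucket_s), int(latency // latency_bucket_ms))
        counts[cell] += 1
    return counts


def render(counts, shades=" .:*#"):
    """Draw the grid as text: time runs left to right, latency bottom to top."""
    if not counts:
        return ""
    max_count = max(counts.values())
    t_cells = range(min(t for t, _ in counts), max(t for t, _ in counts) + 1)
    l_cells = range(max(l for _, l in counts), min(l for _, l in counts) - 1, -1)
    rows = []
    for l in l_cells:
        row = ""
        for t in t_cells:
            n = counts.get((t, l), 0)
            # Empty cells stay blank; non-empty cells get a darker shade
            # the closer their count is to the busiest cell.
            level = 0 if n == 0 else 1 + (n * (len(shades) - 2)) // max_count
            row += shades[level]
        rows.append(row)
    return "\n".join(rows)


# Made-up traffic: each minute has 20 fast requests (12 ms) and 2 slow ones (95 ms).
samples = []
for minute in range(4):
    base = minute * 60
    samples += [(base + i, 12.0) for i in range(20)]
    samples += [(base + 30 + i, 95.0) for i in range(2)]

counts = bin_requests(samples, time_bucket_s=60, latency_bucket_ms=10)
print(render(counts))
```

The dense band at the bottom is the main population and the faint band at the top is the small slow cluster, which would vanish if only a per-minute average were plotted.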

I'd love to see that type of chart show up for tablet servers (probably not
as useful for tables).

Mike


On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <er...@gmail.com> wrote:

> Another thing to consider is scale.  On large clusters (many hundreds of
> nodes), more data is not helpful for visualization.  Instead, summaries,
> averages and outliers are important.
>
> For example, if one node is consistently slow, it is better to know that
> than to see one graph with low numbers in a sea of graphs.
>
> If the monitor collects information using JMX, collection time for each
> node would be a good thing to know, too.
>
> -Eric
>
>
> On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <jo...@gmail.com> wrote:
>
> > Supun,
> >
> > Yup, very much so. Having a way to consume any and all metrics via JMX
> > would simplify things for any consumers (internal or external).
> >
> >
> >
> > On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> >
> >> Hi Josh,
> >>
> >> Thanks for the suggestions. I'll incorporate these to the proposal.
> >>
> >> Another area I would like to work is on JMX. There is a Jira that says
> to
> >> replace the Monitor calls from Thrift to JMX (Accumulo 694). Do you
> think
> >> this is a good addition to the Monitor?
> >>
> >> Thanks,
> >> Supun..
> >>
> >>
> >> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com>
> wrote:
> >>
> >>  Supun,
> >>>
> >>> Looks good! Can I make some suggestions/comments?
> >>>
> >>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
> >>> compactions, major compactions, index cache hit rate, and data cache
> hit
> >>> rate per table (same graphs that are displayed system-wide when you
> visit
> >>> http://${MONITOR_HOST}:50095/.
> >>>
> >>> For "Per tablet [server] plots", it would be neat if you could also
> >>> extract some general statistics like top N least performing, top N
> >>> highest
> >>> performing, etc. tablet servers. Ideally, this could correlate with
> >>> servers
> >>> that may be having problems :).
> >>>
> >>> Do you see these proposed changes as being sufficient for 3-4 months of
> >>> 40hrs/week work? If you plan to really dig into these changes (perhaps
> >>> reworking components of the monitor itself), I could perhaps see this.
> Do
> >>> you have any ideas for more lofty goals that you could pursue as well?
> I
> >>> don't want you/us to get one month into things and see you complete
> >>> everything we initially planned to accomplish :)
> >>>
> >>> - Josh
> >>>
> >>>
> >>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
> >>>
> >>>  Hi all,
> >>>>
> >>>> I would like to start writing the proposal for the GSoc. I've put
> >>>> together
> >>>> some initial high level goals of the project. Please let me know what
> I
> >>>> can
> >>>> improve.
> >>>>
> >>>> Per table plots: Accumulo 594
> >>>> ---------------------
> >>>>
> >>>> The goal of this is to display plots that explains the various
> >>>> activtities
> >>>> that happens per table. When we go to the tables page of the monitor
> and
> >>>> go
> >>>> to a specific table it displays some information in a table format. We
> >>>> can
> >>>> argument this information by showing graphs for
> >>>>
> >>>> 1. Ingest entries
> >>>> 2. Ingest data size
> >>>> 3. Scan entries
> >>>> 4. Scan data size
> >>>>
> >>>> Per tablet plots
> >>>> ----------------------
> >>>>
> >>>> Same as in the table plots we can display information regarding tablet
> >>>> servers in the tablet server page. The plots will display the same
> >>>> information as table plots considering data per tablet server.
> >>>>
> >>>> Trace Visualization: Accumulo 1198
> >>>> ----------------------------
> >>>>
> >>>> Since we are displaying graphs about each tablet and each table we can
> >>>> add
> >>>> major and minor compaction graph to each table and each tablet.
> >>>>
> >>>> Or other option is to display this in a single graph in overview page
> >>>> with
> >>>> different graph lines for different tables and tablets.
> >>>>
> >>>> Server type information : Accumulo 807
> >>>> ---------------------------------
> >>>>
> >>>> For displaying this informations we can add a new page and display the
> >>>> information as a table. The table should specify the network address
> of
> >>>> the
> >>>> server, server type, weather it is active or in-active etc.
> >>>>
> >>>> Thanks,
> >>>> Supun...
> >>>>
> >>>>
> >>>>
> >>
> >
>

Re: GSOC: Monitor Improvements

Posted by Supun Kamburugamuva <su...@gmail.com>.

Thank you all for the valuable input. I'll start writing the proposal. I
would really like to contribute to Accumulo and would like to take on the
RRDTool proposal by Eric after the summer. Hopefully I'll have time.

Thanks,
Supun..


On Mon, Apr 22, 2013 at 11:17 AM, Eric Newton <er...@gmail.com> wrote:

> I would do something simpler: just have a Mock collector which does no JMX,
> it just makes up numbers, which could be substituted for testing.
>
> -Eric



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: GSOC: Monitor Improvements

Posted by Eric Newton <er...@gmail.com>.

I would do something simpler: just have a mock collector that does no JMX
and simply makes up numbers, which could be substituted for testing.
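
A minimal sketch of that idea, written in Python only for brevity. The `StatsCollector` interface, the metric names, and the value ranges are all assumptions made for illustration; they are not the monitor's actual classes or metrics.

```python
import random
from typing import Dict, Protocol


class StatsCollector(Protocol):
    """Anything the display code can ask for one server's current stats."""

    def collect(self, server: str) -> Dict[str, float]:
        ...


class MockCollector:
    """Makes up plausible numbers; performs no JMX or network calls."""

    def __init__(self, seed: int = 0) -> None:
        # A seeded generator keeps test runs repeatable.
        self._rng = random.Random(seed)

    def collect(self, server: str) -> Dict[str, float]:
        return {
            "ingest_entries_per_s": self._rng.uniform(0, 50_000),
            "scan_entries_per_s": self._rng.uniform(0, 20_000),
            "minor_compactions": float(self._rng.randint(0, 5)),
            "major_compactions": float(self._rng.randint(0, 3)),
        }


def render_row(collector: StatsCollector, server: str) -> str:
    """Display code depends only on the interface, not on how stats are gathered."""
    stats = collector.collect(server)
    return (
        f"{server}: ingest={stats['ingest_entries_per_s']:.0f}/s "
        f"scan={stats['scan_entries_per_s']:.0f}/s"
    )


print(render_row(MockCollector(seed=1), "tserver-0001"))
```

Because `render_row` only sees the interface, a collector that talks to real servers could be swapped in later without touching the display code.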

-Eric



On Mon, Apr 22, 2013 at 11:04 AM, Supun Kamburugamuva <su...@gmail.com> wrote:

> That sounds interesting. To clarify the requirement, we can have a process
> that exposes the same JMX mbeans as the the real server and monitor can
> plug in to this process.
>
> Thanks,
> Supun..

Re: GSOC: Monitor Improvements

Posted by Supun Kamburugamuva <su...@gmail.com>.

That sounds interesting. To clarify the requirement, we can have a process
that exposes the same JMX MBeans as the real server, and the monitor can
plug in to this process.

Thanks,
Supun..


On Mon, Apr 22, 2013 at 10:57 AM, Josh Elser <jo...@gmail.com> wrote:

> That would be pretty sweet, actually. Potentially parallel to what you
> want to do, Supun, but cool nonetheless.
>
> I could see a lot of benefit by having some process that could emulate the
> output from a non-trivially-sized Accumulo cluster on a single box.


-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: GSOC: Monitor Improvements

Posted by Josh Elser <jo...@gmail.com>.

That would be pretty sweet, actually. Potentially parallel to what you 
want to do, Supun, but cool nonetheless.

I could see a lot of benefit by having some process that could emulate 
the output from a non-trivially-sized Accumulo cluster on a single box.
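
As a rough illustration of what such an emulator might produce, here is a sketch that fabricates an ingest rate for each of several hundred simulated tablet servers and then picks out the slowest ones. The server names, the rate distribution, and the `emulate_cluster` and `slowest` helpers are all hypothetical.

```python
import random
from statistics import median


def emulate_cluster(num_servers, slow_servers=(), seed=0):
    """Return a made-up ingest rate (entries/s) for each simulated tablet server.

    Servers whose index is listed in slow_servers run at a tenth of their
    normal rate, standing in for nodes that are having problems.
    """
    rng = random.Random(seed)
    slow = set(slow_servers)
    report = {}
    for i in range(num_servers):
        rate = rng.gauss(20_000, 1_000)
        if i in slow:
            rate *= 0.1
        report[f"tserver-{i:04d}"] = max(rate, 0.0)
    return report


def slowest(report, n):
    """Names of the n servers with the lowest rate, slowest first."""
    return sorted(report, key=report.get)[:n]


report = emulate_cluster(500, slow_servers=(17, 342), seed=42)
print("servers:", len(report))
print("median ingest rate:", round(median(report.values())))
print("slowest:", slowest(report, 2))
```

A short summary like this surfaces the two struggling servers directly, which is easier to act on than scanning 500 individual graphs.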

On 4/22/13 10:43 AM, Eric Newton wrote:
> You could mock the stats collection.
>
> -Eric
>
>
> On Mon, Apr 22, 2013 at 10:41 AM, David Medinets
> <da...@gmail.com> wrote:
>
>> The average developer probably can't access a large cluster with hundred of
>> nodes. Is there a way to simulate this?
>>
>>
>> On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <er...@gmail.com>
>> wrote:
>>
>>> Another thing to consider is scale.  On large clusters (many hundreds of
>>> nodes), more data is not helpful for visualization.  Instead, summaries,
>>> averages and outliers are important.
>>>
>>> For example, if one node is consistently slow, it is better to know that
>>> than to see one graph with low numbers in a sea of graphs.
>>>
>>> If the monitor collects information using JMX, collection time for each
>>> node would be a good thing to know, too.
>>>
>>> -Eric
>>>
>>>
>>> On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <jo...@gmail.com>
>> wrote:
>>>> Supun,
>>>>
>>>> Yup, very much so. Having a way to consume any and all metrics via JMX
>>>> would simplify things for any consumers (internal or external).
>>>>
>>>>
>>>>
>>>> On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
>>>>
>>>>> Hi Josh,
>>>>>
>>>>> Thanks for the suggestions. I'll incorporate these to the proposal.
>>>>>
>>>>> Another area I would like to work is on JMX. There is a Jira that says
>>> to
>>>>> replace the Monitor calls from Thrift to JMX (Accumulo 694). Do you
>>> think
>>>>> this is a good addition to the Monitor?
>>>>>
>>>>> Thanks,
>>>>> Supun..
>>>>>
>>>>>
>>>>> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com>
>>> wrote:
>>>>>   Supun,
>>>>>> Looks good! Can I make some suggestions/comments?
>>>>>>
>>>>>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
>>>>>> compactions, major compactions, index cache hit rate, and data cache
>>> hit
>>>>>> rate per table (same graphs that are displayed system-wide when you
>>> visit
>>>>>> http://${MONITOR_HOST}:50095/.
>>>>>>
>>>>>> For "Per tablet [server] plots", it would be neat if you could also
>>>>>> extract some general statistics like top N least performing, top N
>>>>>> highest
>>>>>> performing, etc. tablet servers. Ideally, this could correlate with
>>>>>> servers
>>>>>> that may be having problems :).
>>>>>>
>>>>>> Do you see these proposed changes as being sufficient for 3-4 months
>> of
>>>>>> 40hrs/week work? If you plan to really dig into these changes
>> (perhaps
>>>>>> reworking components of the monitor itself), I could perhaps see
>> this.
>>> Do
>>>>>> you have any ideas for more lofty goals that you could pursue as
>> well?
>>> I
>>>>>> don't want you/us to get one month into things and see you complete
>>>>>> everything we initially planned to accomplish :)
>>>>>>
>>>>>> - Josh
>>>>>>
>>>>>>
>>>>>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
>>>>>>
>>>>>>   Hi all,
>>>>>>> I would like to start writing the proposal for the GSoc. I've put
>>>>>>> together
>>>>>>> some initial high level goals of the project. Please let me know
>> what
>>> I
>>>>>>> can
>>>>>>> improve.
>>>>>>>
>>>>>>> Per table plots: Accumulo 594
>>>>>>> ---------------------
>>>>>>>
>>>>>>> The goal of this is to display plots that explains the various
>>>>>>> activtities
>>>>>>> that happens per table. When we go to the tables page of the monitor
>>> and
>>>>>>> go
>>>>>>> to a specific table it displays some information in a table format.
>> We
>>>>>>> can
>>>>>>> argument this information by showing graphs for
>>>>>>>
>>>>>>> 1. Ingest entries
>>>>>>> 2. Ingest data size
>>>>>>> 3. Scan entries
>>>>>>> 4. Scan data size
>>>>>>>
>>>>>>> Per tablet plots
>>>>>>> ----------------------
>>>>>>>
>>>>>>> Same as in the table plots we can display information regarding
>> tablet
>>>>>>> servers in the tablet server page. The plots will display the same
>>>>>>> information as table plots considering data per tablet server.
>>>>>>>
>>>>>>> Trace Visualization: Accumulo 1198
>>>>>>> ----------------------------
>>>>>>>
>>>>>>> Since we are displaying graphs about each tablet and each table we
>> can
>>>>>>> add
>>>>>>> major and minor compaction graph to each table and each tablet.
>>>>>>>
>>>>>>> Or other option is to display this in a single graph in overview
>> page
>>>>>>> with
>>>>>>> different graph lines for different tables and tablets.
>>>>>>>
>>>>>>> Server type information : Accumulo 807
>>>>>>> ---------------------------------
>>>>>>>
>>>>>>> For displaying this informations we can add a new page and display
>> the
>>>>>>> information as a table. The table should specify the network address
>>> of
>>>>>>> the
>>>>>>> server, server type, weather it is active or in-active etc.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Supun...
>>>>>>>
>>>>>>>
>>>>>>>

Re: GSOC: Monitor Improvements

Posted by Eric Newton <er...@gmail.com>.

You could mock the stats collection.

-Eric


On Mon, Apr 22, 2013 at 10:41 AM, David Medinets
<da...@gmail.com> wrote:

> The average developer probably can't access a large cluster with hundred of
> nodes. Is there a way to simulate this?

Re: GSOC: Monitor Improvements

Posted by David Medinets <da...@gmail.com>.

The average developer probably can't access a large cluster with hundreds of
nodes. Is there a way to simulate this?


On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <er...@gmail.com> wrote:

> Another thing to consider is scale.  On large clusters (many hundreds of
> nodes), more data is not helpful for visualization.  Instead, summaries,
> averages and outliers are important.
>
> For example, if one node is consistently slow, it is better to know that
> than to see one graph with low numbers in a sea of graphs.
>
> If the monitor collects information using JMX, collection time for each
> node would be a good thing to know, too.
>
> -Eric

Re: GSOC: Monitor Improvements

Posted by Gabe Bell <ch...@gmail.com>.

Re: RRDTool: there is rrd4j, a Java implementation licensed under Apache 2.0
(https://code.google.com/p/rrd4j/).

On Apr 22, 2013, at 11:03 AM, Eric Newton <er...@gmail.com> wrote:

> Presently the information is stored in memory and it certainly could be
> stored in tables.
> 
> This reminds me of an idea that I've been thinking about for a long time.
> It's a little aggressive to do in a single summer.
> 
> ----
> 
> RRDTool stores time series data in fixed-length files.  One important
> feature is the ability to compress time-series data into less-fine-grained
> results over time.
> 
> However, updating many RRD files, with periodic updates, requires making
> lots of small seeks and updates to individual files.  It works well when
> all the files fit in the disk cache.  It falls down hard when it doesn't.
> 
> My idea is to put updates into an Accumulo row for one collected data
> point, along with some recent version in RRD format:
> 
> Key                         Value
> row, cf:cq
> --------------------------------------------------------
> point rrd:                  [RRDTool data]
> point ts:timestamp    value
> point ts:timestamp    value
> point ts:timestamp    value
> point ts:timestamp    value
> point ts:timestamp    value
> 
> When the tablet compacts, you use a Combiner to push the updates into the
> RRD data:
> 
> Key                         Value
> row, cf:cq
> -------------------------------------------------------
> point rrd:                  [Updated RRDTool data]
> point ts:timestamp    value
> 
> Further, when you scan the data, you could use an RRD iterator to perform
> queries on the RRD format, which would extract out only the
> summary/graph/data you want.
> 
> This leverages the Accumulo write-ahead log, and efficiency of
> log-structured merge trees to defer RRD updates to a point where they can
> be done efficiently (with respect to disk seeks), and even the block cache
> to access recently read information quickly.  And, the data won't grow
> indefinitely due to the properties of the RRD storage format.
> 
> Sadly, RRDTool does not have a Java API.  But there appear to be java-based
> substitutes; I have no idea if they are license compatible.
> 
> OpenTSDB does something similar: they compress updates into blocks of
> updates in hourly chunks, converting many small records into one larger
> one.  Their scheme does not lose data, which was important to them.
> 
> 
> -Eric

Re: GSOC: Monitor Improvements

Posted by Eric Newton <er...@gmail.com>.

Presently the information is stored in memory and it certainly could be
stored in tables.

This reminds me of an idea that I've been thinking about for a long time.
It's a little aggressive to do in a single summer.

----

RRDTool stores time series data in fixed-length files.  One important
feature is the ability to compress time-series data into less-fine-grained
results over time.
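
To make the "less-fine-grained results over time" part concrete, here is a small sketch of the consolidation step in plain Java. It does not use RRDTool or any RRD library; the class and method names are made up, and averaging is just one possible consolidation function (min, max or last would follow the same pattern).

```java
final class Consolidator {
    /**
     * Averages each run of `factor` consecutive samples into one coarser sample.
     * A trailing partial run is averaged over the samples it actually has.
     */
    static double[] consolidate(double[] samples, int factor) {
        if (factor <= 0) {
            throw new IllegalArgumentException("factor must be positive");
        }
        int buckets = (samples.length + factor - 1) / factor; // round up
        double[] out = new double[buckets];
        for (int b = 0; b < buckets; b++) {
            int start = b * factor;
            int end = Math.min(start + factor, samples.length);
            double sum = 0.0;
            for (int i = start; i < end; i++) {
                sum += samples[i];
            }
            out[b] = sum / (end - start);
        }
        return out;
    }
}
```

For example, six one-minute samples consolidated with a factor of 3 become two three-minute averages, so older data costs a third of the space.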

However, updating many RRD files, with periodic updates, requires making
lots of small seeks and updates to individual files.  It works well when
all the files fit in the disk cache.  It falls down hard when they don't.

My idea is to write the updates for one collected data point into a single
Accumulo row, along with a recent version of its data in RRD format:

Key                     Value
row    cf:cq
--------------------------------------------------------
point  rrd:             [RRDTool data]
point  ts:timestamp     value
point  ts:timestamp     value
point  ts:timestamp     value
point  ts:timestamp     value
point  ts:timestamp     value

When the tablet compacts, you use a Combiner to push the updates into the
RRD data:

Key                     Value
row    cf:cq
--------------------------------------------------------
point  rrd:             [Updated RRDTool data]
point  ts:timestamp     value
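
Here is a rough sketch of that compaction step, modelled in plain Java rather than with the Accumulo iterator API. One row is a sorted map from column ("rrd:" or "ts:<timestamp>") to value, and the "RRD data" is faked as a comma-separated list of the last N values. All names are hypothetical, and timestamps are assumed to be zero-padded so that string order matches time order.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

final class RrdCompactionSketch {
    static final String RRD_COLUMN = "rrd:";
    static final String TS_PREFIX = "ts:";

    /**
     * Folds every "ts:<timestamp>" entry of one row into its "rrd:" entry and
     * removes the folded entries. At most `slots` values are retained, oldest
     * dropped first, so the row stays bounded in size.
     */
    static void compact(TreeMap<String, String> row, int slots) {
        ArrayDeque<String> ring = new ArrayDeque<>();
        String existing = row.get(RRD_COLUMN);
        if (existing != null && !existing.isEmpty()) {
            for (String v : existing.split(",")) {
                ring.addLast(v);
            }
        }
        // TreeMap iterates in key order, so updates are applied oldest first.
        Iterator<Map.Entry<String, String>> it = row.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, String> entry = it.next();
            if (entry.getKey().startsWith(TS_PREFIX)) {
                ring.addLast(entry.getValue());
                it.remove();
            }
        }
        // Keep only the newest `slots` values.
        while (ring.size() > slots) {
            ring.removeFirst();
        }
        row.put(RRD_COLUMN, String.join(",", ring));
    }
}
```

In this toy version every pending update is folded in, so the row ends up with only the "rrd:" entry; in the table above one "ts:" entry remains because it arrived after the compaction.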

Further, when you scan the data, you could use an RRD iterator to perform
queries on the RRD format, which would extract out only the
summary/graph/data you want.

This leverages the Accumulo write-ahead log, and efficiency of
log-structured merge trees to defer RRD updates to a point where they can
be done efficiently (with respect to disk seeks), and even the block cache
to access recently read information quickly.  And, the data won't grow
indefinitely due to the properties of the RRD storage format.

Sadly, RRDTool does not have a Java API.  But there appear to be Java-based
substitutes; I have no idea if they are license-compatible.

OpenTSDB does something similar: they compress updates into blocks of
updates in hourly chunks, converting many small records into one larger
one.  Their scheme does not lose data, which was important to them.


-Eric



On Mon, Apr 22, 2013 at 10:33 AM, Supun Kamburugamuva <su...@gmail.com> wrote:

> I can see how summaries are very helpful to a user. We can introduce new
> fields to the existing table/tablet summary tables that display problem
> information, etc.
>
> To make the JMX polling time configurable we can introduce configuration
> parameters.
>
> For the JMX statistics we can keep data at the server for a constant time
> to avoid memory growth. I think the stats are stored in memory (please
> correct me if I'm wrong). If that is the case, is it possible to store them
> in accumulo tables?
>
> Thanks,
> Supun...

Re: GSOC: Monitor Improvements

Posted by Supun Kamburugamuva <su...@gmail.com>.

I can see how summaries are very helpful to a user. We can introduce new
fields to the existing table/tablet summary tables that display problem
information, etc.

To make the JMX polling time configurable we can introduce configuration
parameters.

For the JMX statistics we can keep data at the server for a fixed period
to avoid memory growth. I think the stats are stored in memory (please
correct me if I'm wrong). If that is the case, is it possible to store them
in Accumulo tables?
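
As a rough illustration of the "keep data for a fixed period" part, here is a small in-memory buffer in plain Java that drops samples older than a configurable retention window. The class name and the millisecond-based API are assumptions for the sketch, not existing monitor code.

```java
import java.util.ArrayDeque;
import java.util.Deque;

final class RetentionBuffer {
    private static final class Sample {
        final long timeMillis;
        final double value;

        Sample(long timeMillis, double value) {
            this.timeMillis = timeMillis;
            this.value = value;
        }
    }

    private final long retentionMillis;
    private final Deque<Sample> samples = new ArrayDeque<>();

    RetentionBuffer(long retentionMillis) {
        this.retentionMillis = retentionMillis;
    }

    /** Records a sample and evicts everything older than the retention window. */
    void add(long nowMillis, double value) {
        samples.addLast(new Sample(nowMillis, value));
        long cutoff = nowMillis - retentionMillis;
        while (!samples.isEmpty() && samples.peekFirst().timeMillis < cutoff) {
            samples.removeFirst();
        }
    }

    int size() {
        return samples.size();
    }

    /** Value of the oldest sample still retained; the buffer must not be empty. */
    double oldestValue() {
        return samples.getFirst().value;
    }
}
```

With a fixed polling interval this bounds memory at roughly retention / interval samples per metric, whatever the uptime of the monitor.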

Thanks,
Supun...

On Mon, Apr 22, 2013 at 9:05 AM, Eric Newton <er...@gmail.com> wrote:

> Another thing to consider is scale.  On large clusters (many hundreds of
> nodes), more data is not helpful for visualization.  Instead, summaries,
> averages and outliers are important.
>
> For example, if one node is consistently slow, it is better to know that
> than to see one graph with low numbers in a sea of graphs.
>
> If the monitor collects information using JMX, collection time for each
> node would be a good thing to know, too.
>
> -Eric



-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: GSOC: Monitor Improvements

Posted by Eric Newton <er...@gmail.com>.

Another thing to consider is scale.  On large clusters (many hundreds of
nodes), more data is not helpful for visualization.  Instead, summaries,
averages and outliers are important.

For example, if one node is consistently slow, it is better to know that
than to see one graph with low numbers in a sea of graphs.

If the monitor collects information using JMX, collection time for each
node would be a good thing to know, too.
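
A sketch of the kind of summary I mean, in plain Java: compute the mean and standard deviation of one metric across servers, and report only the servers that sit far from the mean. The class name, the map-of-rates input and the "k standard deviations" rule are assumptions for illustration; a real monitor might prefer percentiles or a median-based rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class ClusterSummary {
    /** Arithmetic mean of the per-server values (0 for an empty map). */
    static double mean(Map<String, Double> valuePerServer) {
        if (valuePerServer.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : valuePerServer.values()) {
            sum += v;
        }
        return sum / valuePerServer.size();
    }

    /** Population standard deviation of the per-server values. */
    static double stdDev(Map<String, Double> valuePerServer) {
        if (valuePerServer.isEmpty()) {
            return 0.0;
        }
        double m = mean(valuePerServer);
        double squares = 0.0;
        for (double v : valuePerServer.values()) {
            squares += (v - m) * (v - m);
        }
        return Math.sqrt(squares / valuePerServer.size());
    }

    /** Servers whose value is more than k standard deviations from the mean. */
    static List<String> outliers(Map<String, Double> valuePerServer, double k) {
        double m = mean(valuePerServer);
        double threshold = k * stdDev(valuePerServer);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Double> e : valuePerServer.entrySet()) {
            if (Math.abs(e.getValue() - m) > threshold) {
                result.add(e.getKey());
            }
        }
        return result;
    }
}
```

The overview page would then show three numbers and a short list of suspect hosts instead of one graph per node.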

-Eric


On Sun, Apr 21, 2013 at 10:00 PM, Josh Elser <jo...@gmail.com> wrote:

> Supun,
>
> Yup, very much so. Having a way to consume any and all metrics via JMX
> would simplify things for any consumers (internal or external).
>
>
>
> On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
>
>> Hi Josh,
>>
>> Thanks for the suggestions. I'll incorporate these to the proposal.
>>
>> Another area I would like to work is on JMX. There is a Jira that says to
>> replace the Monitor calls from Thrift to JMX (Accumulo 694). Do you think
>> this is a good addition to the Monitor?
>>
>> Thanks,
>> Supun..
>>
>>
>> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com> wrote:
>>
>>  Supun,
>>>
>>> Looks good! Can I make some suggestions/comments?
>>>
>>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
>>> compactions, major compactions, index cache hit rate, and data cache hit
>>> rate per table (same graphs that are displayed system-wide when you visit
>>> http://${MONITOR_HOST}:50095/.
>>>
>>> For "Per tablet [server] plots", it would be neat if you could also
>>> extract some general statistics like top N least performing, top N
>>> highest
>>> performing, etc. tablet servers. Ideally, this could correlate with
>>> servers
>>> that may be having problems :).
>>>
>>> Do you see these proposed changes as being sufficient for 3-4 months of
>>> 40hrs/week work? If you plan to really dig into these changes (perhaps
>>> reworking components of the monitor itself), I could perhaps see this. Do
>>> you have any ideas for more lofty goals that you could pursue as well? I
>>> don't want you/us to get one month into things and see you complete
>>> everything we initially planned to accomplish :)
>>>
>>> - Josh
>>>
>>>
>>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
>>>
>>>  Hi all,
>>>>
>>>> I would like to start writing the proposal for the GSoc. I've put
>>>> together
>>>> some initial high level goals of the project. Please let me know what I
>>>> can
>>>> improve.
>>>>
>>>> Per table plots: Accumulo 594
>>>> ---------------------
>>>>
>>>> The goal of this is to display plots that explains the various
>>>> activtities
>>>> that happens per table. When we go to the tables page of the monitor and
>>>> go
>>>> to a specific table it displays some information in a table format. We
>>>> can
>>>> argument this information by showing graphs for
>>>>
>>>> 1. Ingest entries
>>>> 2. Ingest data size
>>>> 3. Scan entries
>>>> 4. Scan data size
>>>>
>>>> Per tablet plots
>>>> ----------------------
>>>>
>>>> Same as in the table plots we can display information regarding tablet
>>>> servers in the tablet server page. The plots will display the same
>>>> information as table plots considering data per tablet server.
>>>>
>>>> Trace Visualization: Accumulo 1198
>>>> ----------------------------
>>>>
>>>> Since we are displaying graphs about each tablet and each table, we can
>>>> add major and minor compaction graphs to each table and each tablet.
>>>>
>>>> Another option is to display this in a single graph on the overview
>>>> page, with different graph lines for different tables and tablets.
>>>>
>>>> Server type information : Accumulo 807
>>>> ---------------------------------
>>>>
>>>> To display this information, we can add a new page and display the
>>>> information as a table. The table should specify the network address of
>>>> the server, the server type, whether it is active or inactive, etc.
>>>>
>>>> Thanks,
>>>> Supun...
>>>>
>>>>
>>>>
>>
>

Re: GSOC: Monitor Improvements

Posted by Josh Elser <jo...@gmail.com>.

Supun,

Yup, very much so. Having a way to consume any and all metrics via JMX 
would simplify things for any consumers (internal or external).


On 04/21/2013 02:15 PM, Supun Kamburugamuva wrote:
> Hi Josh,
>
> Thanks for the suggestions. I'll incorporate these into the proposal.
>
> Another area I would like to work on is JMX. There is a Jira issue that
> proposes switching the Monitor's calls from Thrift to JMX (Accumulo 694).
> Do you think this is a good addition to the Monitor?
>
> Thanks,
> Supun..
>
>
> On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com> wrote:
>
>> Supun,
>>
>> Looks good! Can I make some suggestions/comments?
>>
>> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
>> compactions, major compactions, index cache hit rate, and data cache hit
>> rate per table (same graphs that are displayed system-wide when you visit
>> http://${MONITOR_HOST}:50095/).
>>
>> For "Per tablet [server] plots", it would be neat if you could also
>> extract some general statistics like top N least performing, top N highest
>> performing, etc. tablet servers. Ideally, this could correlate with servers
>> that may be having problems :).
>>
>> Do you see these proposed changes as being sufficient for 3-4 months of
>> 40hrs/week work? If you plan to really dig into these changes (perhaps
>> reworking components of the monitor itself), I could perhaps see this. Do
>> you have any ideas for more lofty goals that you could pursue as well? I
>> don't want you/us to get one month into things and see you complete
>> everything we initially planned to accomplish :)
>>
>> - Josh
>>
>>
>> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
>>
>>> Hi all,
>>>
>>> I would like to start writing the proposal for the GSoC. I've put together
>>> some initial high-level goals of the project. Please let me know what I
>>> can improve.
>>>
>>> Per table plots: Accumulo 594
>>> ---------------------
>>>
>>> The goal of this is to display plots that explain the various activities
>>> that happen per table. When we go to the tables page of the monitor and
>>> go to a specific table, it displays some information in a table format.
>>> We can augment this information by showing graphs for:
>>>
>>> 1. Ingest entries
>>> 2. Ingest data size
>>> 3. Scan entries
>>> 4. Scan data size
>>>
>>> Per tablet plots
>>> ----------------------
>>>
>>> Same as in the table plots we can display information regarding tablet
>>> servers in the tablet server page. The plots will display the same
>>> information as table plots considering data per tablet server.
>>>
>>> Trace Visualization: Accumulo 1198
>>> ----------------------------
>>>
>>> Since we are displaying graphs about each tablet and each table, we can
>>> add major and minor compaction graphs to each table and each tablet.
>>>
>>> Another option is to display this in a single graph on the overview page,
>>> with different graph lines for different tables and tablets.
>>>
>>> Server type information : Accumulo 807
>>> ---------------------------------
>>>
>>> To display this information, we can add a new page and display the
>>> information as a table. The table should specify the network address of
>>> the server, the server type, whether it is active or inactive, etc.
>>>
>>> Thanks,
>>> Supun...
>>>
>>>
>

Re: GSOC: Monitor Improvements

Posted by Supun Kamburugamuva <su...@gmail.com>.

Hi Josh,

Thanks for the suggestions. I'll incorporate these into the proposal.

Another area I would like to work on is JMX. There is a Jira issue that
proposes switching the Monitor's calls from Thrift to JMX (Accumulo 694).
Do you think this is a good addition to the Monitor?

Thanks,
Supun..


On Sun, Apr 21, 2013 at 1:45 PM, Josh Elser <jo...@gmail.com> wrote:

> Supun,
>
> Looks good! Can I make some suggestions/comments?
>
> For: "Per table plots: ACCUMULO-594", I'd also like to see minor
> compactions, major compactions, index cache hit rate, and data cache hit
> rate per table (same graphs that are displayed system-wide when you visit
> http://${MONITOR_HOST}:50095/).
>
> For "Per tablet [server] plots", it would be neat if you could also
> extract some general statistics like top N least performing, top N highest
> performing, etc. tablet servers. Ideally, this could correlate with servers
> that may be having problems :).
>
> Do you see these proposed changes as being sufficient for 3-4 months of
> 40hrs/week work? If you plan to really dig into these changes (perhaps
> reworking components of the monitor itself), I could perhaps see this. Do
> you have any ideas for more lofty goals that you could pursue as well? I
> don't want you/us to get one month into things and see you complete
> everything we initially planned to accomplish :)
>
> - Josh
>
>
> On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
>
>> Hi all,
>>
>> I would like to start writing the proposal for the GSoC. I've put together
>> some initial high-level goals of the project. Please let me know what I
>> can improve.
>>
>> Per table plots: Accumulo 594
>> ---------------------
>>
>> The goal of this is to display plots that explain the various activities
>> that happen per table. When we go to the tables page of the monitor and
>> go to a specific table, it displays some information in a table format.
>> We can augment this information by showing graphs for:
>>
>> 1. Ingest entries
>> 2. Ingest data size
>> 3. Scan entries
>> 4. Scan data size
>>
>> Per tablet plots
>> ----------------------
>>
>> Same as in the table plots we can display information regarding tablet
>> servers in the tablet server page. The plots will display the same
>> information as table plots considering data per tablet server.
>>
>> Trace Visualization: Accumulo 1198
>> ----------------------------
>>
>> Since we are displaying graphs about each tablet and each table, we can
>> add major and minor compaction graphs to each table and each tablet.
>>
>> Another option is to display this in a single graph on the overview page,
>> with different graph lines for different tables and tablets.
>>
>> Server type information : Accumulo 807
>> ---------------------------------
>>
>> To display this information, we can add a new page and display the
>> information as a table. The table should specify the network address of
>> the server, the server type, whether it is active or inactive, etc.
>>
>> Thanks,
>> Supun...
>>
>>
>


-- 
Supun Kamburugamuva
Member, Apache Software Foundation; http://www.apache.org
E-mail: supun06@gmail.com;  Mobile: +1 812 369 6762
Blog: http://supunk.blogspot.com

Re: GSOC: Monitor Improvements

Posted by Josh Elser <jo...@gmail.com>.

Supun,

Looks good! Can I make some suggestions/comments?

For: "Per table plots: ACCUMULO-594", I'd also like to see minor 
compactions, major compactions, index cache hit rate, and data cache hit 
rate per table (same graphs that are displayed system-wide when you 
visit http://${MONITOR_HOST}:50095/).

For "Per tablet [server] plots", it would be neat if you could also 
extract some general statistics like top N least performing, top N 
highest performing, etc. tablet servers. Ideally, this could correlate 
with servers that may be having problems :).
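
As a rough illustration of the top-N idea, here is a minimal sketch. It
assumes the monitor already has a per-server rate available as a simple
mapping; the function name, server addresses, and numbers are hypothetical
and are not part of Accumulo.

```python
import heapq


def top_n_servers(rates, n):
    """Return (slowest, fastest): two lists of (server, rate) pairs, n each."""
    by_rate = lambda item: item[1]
    slowest = heapq.nsmallest(n, rates.items(), key=by_rate)
    fastest = heapq.nlargest(n, rates.items(), key=by_rate)
    return slowest, fastest


# Hypothetical ingest rates (entries per second) keyed by tablet server.
sample_rates = {
    "tserver1:9997": 1200.0,
    "tserver2:9997": 150.0,
    "tserver3:9997": 980.0,
    "tserver4:9997": 40.0,
}

slowest, fastest = top_n_servers(sample_rates, 2)
print("slowest:", slowest)
print("fastest:", fastest)
```

Servers that keep showing up in the slowest list across several sampling
intervals would be the ones worth a closer look.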

Do you see these proposed changes as being sufficient for 3-4 months of 
40hrs/week work? If you plan to really dig into these changes (perhaps 
reworking components of the monitor itself), I could perhaps see this. 
Do you have any ideas for more lofty goals that you could pursue as 
well? I don't want you/us to get one month into things and see you 
complete everything we initially planned to accomplish :)

- Josh

On 04/21/2013 10:37 AM, Supun Kamburugamuva wrote:
> Hi all,
>
> I would like to start writing the proposal for the GSoC. I've put together
> some initial high-level goals of the project. Please let me know what I can
> improve.
>
> Per table plots: Accumulo 594
> ---------------------
>
> The goal of this is to display plots that explain the various activities
> that happen per table. When we go to the tables page of the monitor and go
> to a specific table, it displays some information in a table format. We can
> augment this information by showing graphs for:
>
> 1. Ingest entries
> 2. Ingest data size
> 3. Scan entries
> 4. Scan data size
>
> Per tablet plots
> ----------------------
>
> Same as in the table plots we can display information regarding tablet
> servers in the tablet server page. The plots will display the same
> information as table plots considering data per tablet server.
>
> Trace Visualization: Accumulo 1198
> ----------------------------
>
> Since we are displaying graphs about each tablet and each table, we can add
> major and minor compaction graphs to each table and each tablet.
>
> Another option is to display this in a single graph on the overview page,
> with different graph lines for different tables and tablets.
>
> Server type information : Accumulo 807
> ---------------------------------
>
> To display this information, we can add a new page and display the
> information as a table. The table should specify the network address of the
> server, the server type, whether it is active or inactive, etc.
>
> Thanks,
> Supun...
>