Posted to dev@accumulo.apache.org by z11373 <z1...@outlook.com> on 2016/06/10 13:35:39 UTC

measuring perf

Good morning!
I have a service running against different Accumulo instances (in different
datacenters).
Both instances should have the same configuration, but consumers of my
service tell me that one is faster than the one in the other datacenter.
The service is deployed on machines with the same spec, and most
operations are against Accumulo, so I would like to capture the perf
(including network latency between the Accumulo server and my service) and
compare the two, to verify whether the problem really is that one Accumulo
instance is slower to access than the other. Right now I capture the time
from my service being called to the results being returned, but that doesn't
tell me how much of that time is spent on Accumulo.

With a traditional SQL database, I could measure the time it takes to run
a SELECT statement, for example. In Accumulo, however, nothing is read from
the server until we iterate (my understanding may be wrong), so for now I am
thinking of setting the start time before setting the ranges and the stop
time when the iterator has no more items. Is this reasonable, or is there a
better way?

For additional info, my service reads from the iterator and, for each item,
creates another scanner (and sets its range) and iterates again, and so on.
So if it ends up with 10 scanners, my current approach will log 10 perf
captures.
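
One way to collapse those per-scanner captures into a single number is to time only the calls that fetch data. The following is a minimal sketch, not something from the thread: `FetchTimer` and its methods are hypothetical names, and plain lists stand in for the scanners (an Accumulo Scanner is an Iterable of entries, so it could be wrapped the same way). It measures the time the client spends blocked in iterator(), hasNext() and next(), which is assumed here to approximate the time spent waiting on Accumulo plus the network, and it excludes the application logic inside the loops.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical helper: accumulates time spent waiting on wrapped iterators. */
final class FetchTimer {
  private final AtomicLong nanos = new AtomicLong();
  private final AtomicLong items = new AtomicLong();

  /** Wraps any Iterable so that time inside its iterator calls is recorded. */
  <T> Iterable<T> wrap(Iterable<T> source) {
    return () -> {
      // Creating the iterator may already start work in some implementations,
      // so that call is timed as well.
      long t0 = System.nanoTime();
      Iterator<T> it = source.iterator();
      nanos.addAndGet(System.nanoTime() - t0);
      return new Iterator<T>() {
        @Override public boolean hasNext() {
          long start = System.nanoTime();
          try {
            return it.hasNext();
          } finally {
            nanos.addAndGet(System.nanoTime() - start);
          }
        }
        @Override public T next() {
          long start = System.nanoTime();
          try {
            T item = it.next();
            items.incrementAndGet();
            return item;
          } finally {
            nanos.addAndGet(System.nanoTime() - start);
          }
        }
      };
    };
  }

  long totalNanos() { return nanos.get(); }
  long totalItems() { return items.get(); }

  public static void main(String[] args) {
    FetchTimer timer = new FetchTimer();
    List<String> outer = Arrays.asList("row1", "row2"); // stand-in for the first scanner
    List<String> inner = Arrays.asList("a", "b", "c");  // stand-in for each follow-up scanner
    for (String row : timer.wrap(outer)) {
      for (String col : timer.wrap(inner)) {
        // application logic goes here; its time is not counted
      }
    }
    System.out.println(timer.totalItems() + " items, "
        + timer.totalNanos() / 1_000_000.0 + " ms spent fetching");
  }
}
```

Because only the iterator calls are timed, the outer scanner's total does not double-count the time spent in the inner scanners, which a wall-clock timer around the outer loop would.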


Thanks,
Z



--
View this message in context: http://apache-accumulo.1065345.n5.nabble.com/measuring-perf-tp17245.html
Sent from the Developers mailing list archive at Nabble.com.

Re: measuring perf

Posted by z11373 <z1...@outlook.com>.
Thanks Mike for the pointer!
Enabling tracing seems pretty involved to me, so I'll try the simple
solution suggested by Bill for now and revisit this later if needed.

Thanks,
Z




Re: measuring perf

Posted by Michael Wall <mj...@gmail.com>.
Have you looked at the tracing service?  What version of Accumulo are you
using?  In 1.7, tracing moved to using HTrace, so setup and implementation
will be a little different between 1.7 and 1.6.  Here are some docs.

http://accumulo.apache.org/1.7/accumulo_user_manual#tracing for 1.7 and
http://accumulo.apache.org/1.6/accumulo_user_manual#_tracing for 1.6.

If you can get this running, it should give you information about what is
going on with the scans.

Mike



Re: measuring perf

Posted by z11373 <z1...@outlook.com>.
Thanks Bill. I'll give it a try.




Re: measuring perf

Posted by William Slacum <ws...@gmail.com>.
I think it's reasonable to measure from the start of a for/while loop over
the Scanner, such as:

```
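// Assumed imports: java.util.Map.Entry, org.apache.accumulo.core.data.Key and
// Value; Stopwatch is presumably Guava's com.google.common.base.Stopwatch.
// .. my initialization code
scanner.setRange(someRange);
// Start the timer just before iterating over the scanner.
Stopwatch timer = Stopwatch.createStarted();
for (Entry<Key, Value> e : scanner) {
  // my logic
}
timer.stop();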
```
I've personally done this when measuring query performance, and it usually
gives a good estimate of what's going on, especially if the network has
low, constant latency.
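
When latency is not constant, a single timing can mislead, so it may help to repeat the measurement and compare summaries between the two datacenters. The sketch below is an illustration only: `LatencySummary` is a hypothetical helper, the nearest-rank percentile is one of several common definitions, and the sample values are invented for the example, not measurements from any real instance.

```java
import java.util.Arrays;

/** Hypothetical helper for summarizing repeated timing samples. */
final class LatencySummary {
  /** Nearest-rank percentile for p in (0, 100]; the input array is not modified. */
  static long percentile(long[] samples, double p) {
    if (samples.length == 0) {
      throw new IllegalArgumentException("no samples");
    }
    if (p <= 0 || p > 100) {
      throw new IllegalArgumentException("p must be in (0, 100]");
    }
    long[] sorted = samples.clone();
    Arrays.sort(sorted);
    // Nearest rank: the smallest 1-based rank covering at least p percent of samples.
    int rank = (int) Math.ceil(p / 100.0 * sorted.length);
    return sorted[Math.max(rank, 1) - 1];
  }

  public static void main(String[] args) {
    // Invented elapsed times in milliseconds, one per repeated scan.
    long[] datacenterA = {12, 11, 13, 12, 40};
    long[] datacenterB = {25, 24, 27, 26, 25};
    System.out.println("A: median=" + percentile(datacenterA, 50)
        + " p95=" + percentile(datacenterA, 95));
    System.out.println("B: median=" + percentile(datacenterB, 50)
        + " p95=" + percentile(datacenterB, 95));
  }
}
```

Comparing the median alongside a high percentile shows whether one instance is consistently slower or only occasionally slow.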

