You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@accumulo.apache.org by Mario Pastorelli <ma...@teralytics.ch> on 2016/08/26 20:54:33 UTC

Profile a (batch) scan

I would like to understand the performance of a batch scan and I would like
to have some hints on how to proceed. I have enabled the distributed trace,
and it tells me that some batch scanner threads take much more time than
others to complete but this is not helpful enough because it's not telling
me why some threads take more. My gut feeling is that one batch thread is
scanning more data than the others, which means that the data is not well
distributed for a query, but I use a random shard byte as prefix of the
keys which should guarantee that data of the same range is almost equally
distributed among the tservers. I enabled JMX on the tservers and attached
jvisualvm to get an idea of the state of each tserver but I couldn't find
anything meaningful. I would like to know if there is a way to profile
what's going on on a single tserver for a single scan thread and by this I
mean:

   1. where are the tablets required by a scan? Which tablet server?
   2. how fast was the lookups on the index for that scan?
   3. how many bytes/records were read for that scan without the iterators
   4. how many seeks are done by the scan and possibly why

The main Accumulo UI is fine to get an overview of Accumulo but don't
really give you any information about the performance of a single query and
it seems to me that they are heavily affected by what iterators do.
Profiling a single scan is much more interesting. Is there a way to profile
a single (batch) scan in Accumulo such that I have a complete overview of
the entire process of reading and sending back records to the driver?

Thanks,
Mario

-- 
Mario Pastorelli | TERALYTICS

*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton
Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
de Vries

This e-mail message contains confidential information which is for the sole
attention and use of the intended recipient. Please notify us at once if
you think that it may not be intended for you and delete it immediately.

Re: Profile a (batch) scan

Posted by Mario Pastorelli <ma...@teralytics.ch>.
That StopWatch class is a great idea, thanks! That's exactly what I need to
check the performance of the iterators.

On Sat, Aug 27, 2016 at 2:21 AM, Dylan Hutchison <dhutchis@cs.washington.edu
> wrote:

> If Zipkin isn't providing fine-grained enough information, one approach
> you could take is to add an iterator to your scan that tracks performance.
> Log the start times, seek ranges, and count throughput.  Each tablet will
> do this independently in separate threads.
>
> You might base this approach on Accumulo's StopWatch
> <https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/util/StopWatch.java>
> utility class.  I adapted this class when I wanted to time particular code
> segments in the Graphulo Watch
> <https://github.com/Accla/graphulo/blob/tracing/src/main/java/edu/mit/ll/graphulo/skvi/Watch.java>
> class.  You can see the Watch in action in the commented segments of this
> old branch
> <https://github.com/Accla/graphulo/blob/tracing/src/main/java/edu/mit/ll/graphulo/skvi/RemoteWriteIterator.java#L396>.
> It captures the number of times a code segment is called, the average
> runtime in that section, and the max and min runtime in that section.  It's
> a kind of brute force approach, one that would have been better to
> integrate with HTrace, but it worked well enough for my purpose.
>
> In your iterator, you would count and measure the time statistics for the
> next and seek methods.
>
> On Fri, Aug 26, 2016 at 3:31 PM, Jonathan Lasko <jl...@bbn.com> wrote:
>
>> I haven't had time to dig into it yet but am hoping the Zipkin will help
>> with some of these insights. (Unless that is the distributed trace you were
>> referring to?)
>>
>> -Jonathan
>>
>>
>> On 08/26/2016 04:54 PM, Mario Pastorelli wrote:
>>
>> I would like to understand the performance of a batch scan and I would
>> like to have some hints on how to proceed. I have enabled the distributed
>> trace, and it tells me that some batch scanner threads take much more time
>> than others to complete but this is not helpful enough because it's not
>> telling me why some threads take more. My gut feeling is that one batch
>> thread is scanning more data than the others, which means that the data is
>> not well distributed for a query, but I use a random shard byte as prefix
>> of the keys which should guarantee that data of the same range is almost
>> equally distributed among the tservers. I enabled JMX on the tservers and
>> attached jvisualvm to get an idea of the state of each tserver but I
>> couldn't find anything meaningful. I would like to know if there is a way
>> to profile what's going on on a single tserver for a single scan thread and
>> by this I mean:
>>
>>    1. where are the tablets required by a scan? Which tablet server?
>>    2. how fast was the lookups on the index for that scan?
>>    3. how many bytes/records were read for that scan without the
>>    iterators
>>    4. how many seeks are done by the scan and possibly why
>>
>> The main Accumulo UI is fine to get an overview of Accumulo but don't
>> really give you any information about the performance of a single query and
>> it seems to me that they are heavily affected by what iterators do.
>> Profiling a single scan is much more interesting. Is there a way to profile
>> a single (batch) scan in Accumulo such that I have a complete overview of
>> the entire process of reading and sending back records to the driver?
>>
>> Thanks,
>> Mario
>>
>> --
>> Mario Pastorelli | TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone: +41794381682
>> email: <ma...@teralytics.ch>mario.pastorelli@teralytics.ch
>> www.teralytics.net
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>> Yann de Vries
>>
>> This e-mail message contains confidential information which is for the
>> sole attention and use of the intended recipient. Please notify us at once
>> if you think that it may not be intended for you and delete it immediately.
>>
>>
>>
>


-- 
Mario Pastorelli | TERALYTICS

*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton
Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
de Vries

This e-mail message contains confidential information which is for the sole
attention and use of the intended recipient. Please notify us at once if
you think that it may not be intended for you and delete it immediately.

Re: Profile a (batch) scan

Posted by Dylan Hutchison <dh...@cs.washington.edu>.
If Zipkin isn't providing fine-grained enough information, one approach you
could take is to add an iterator to your scan that tracks performance.  Log
the start times, seek ranges, and count throughput.  Each tablet will do
this independently in separate threads.

You might base this approach on Accumulo's StopWatch
<https://github.com/apache/accumulo/blob/master/core/src/main/java/org/apache/accumulo/core/util/StopWatch.java>
utility class.  I adapted this class when I wanted to time particular code
segments in the Graphulo Watch
<https://github.com/Accla/graphulo/blob/tracing/src/main/java/edu/mit/ll/graphulo/skvi/Watch.java>
class.  You can see the Watch in action in the commented segments of this
old branch
<https://github.com/Accla/graphulo/blob/tracing/src/main/java/edu/mit/ll/graphulo/skvi/RemoteWriteIterator.java#L396>.
It captures the number of times a code segment is called, the average
runtime in that section, and the max and min runtime in that section.  It's
a kind of brute force approach, one that would have been better to
integrate with HTrace, but it worked well enough for my purpose.

In your iterator, you would count and measure the time statistics for the
next and seek methods.

On Fri, Aug 26, 2016 at 3:31 PM, Jonathan Lasko <jl...@bbn.com> wrote:

> I haven't had time to dig into it yet but am hoping the Zipkin will help
> with some of these insights. (Unless that is the distributed trace you were
> referring to?)
>
> -Jonathan
>
>
> On 08/26/2016 04:54 PM, Mario Pastorelli wrote:
>
> I would like to understand the performance of a batch scan and I would
> like to have some hints on how to proceed. I have enabled the distributed
> trace, and it tells me that some batch scanner threads take much more time
> than others to complete but this is not helpful enough because it's not
> telling me why some threads take more. My gut feeling is that one batch
> thread is scanning more data than the others, which means that the data is
> not well distributed for a query, but I use a random shard byte as prefix
> of the keys which should guarantee that data of the same range is almost
> equally distributed among the tservers. I enabled JMX on the tservers and
> attached jvisualvm to get an idea of the state of each tserver but I
> couldn't find anything meaningful. I would like to know if there is a way
> to profile what's going on on a single tserver for a single scan thread and
> by this I mean:
>
>    1. where are the tablets required by a scan? Which tablet server?
>    2. how fast was the lookups on the index for that scan?
>    3. how many bytes/records were read for that scan without the iterators
>    4. how many seeks are done by the scan and possibly why
>
> The main Accumulo UI is fine to get an overview of Accumulo but don't
> really give you any information about the performance of a single query and
> it seems to me that they are heavily affected by what iterators do.
> Profiling a single scan is much more interesting. Is there a way to profile
> a single (batch) scan in Accumulo such that I have a complete overview of
> the entire process of reading and sending back records to the driver?
>
> Thanks,
> Mario
>
> --
> Mario Pastorelli | TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone: +41794381682
> email: <ma...@teralytics.ch>mario.pastorelli@teralytics.ch
> www.teralytics.net
>
> Company registration number: CH-020.3.037.709-7 | Trade register Canton
> Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
> de Vries
>
> This e-mail message contains confidential information which is for the
> sole attention and use of the intended recipient. Please notify us at once
> if you think that it may not be intended for you and delete it immediately.
>
>
>

Re: Profile a (batch) scan

Posted by Jonathan Lasko <jl...@bbn.com>.
I haven't had time to dig into it yet but am hoping the Zipkin will help 
with some of these insights. (Unless that is the distributed trace you 
were referring to?)

-Jonathan

On 08/26/2016 04:54 PM, Mario Pastorelli wrote:
> I would like to understand the performance of a batch scan and I would 
> like to have some hints on how to proceed. I have enabled the 
> distributed trace, and it tells me that some batch scanner threads 
> take much more time than others to complete but this is not helpful 
> enough because it's not telling me why some threads take more. My gut 
> feeling is that one batch thread is scanning more data than the 
> others, which means that the data is not well distributed for a query, 
> but I use a random shard byte as prefix of the keys which should 
> guarantee that data of the same range is almost equally distributed 
> among the tservers. I enabled JMX on the tservers and attached 
> jvisualvm to get an idea of the state of each tserver but I couldn't 
> find anything meaningful. I would like to know if there is a way to 
> profile what's going on on a single tserver for a single scan thread 
> and by this I mean:
>
>  1. where are the tablets required by a scan? Which tablet server?
>  2. how fast was the lookups on the index for that scan?
>  3. how many bytes/records were read for that scan without the iterators
>  4. how many seeks are done by the scan and possibly why
>
> The main Accumulo UI is fine to get an overview of Accumulo but don't 
> really give you any information about the performance of a single 
> query and it seems to me that they are heavily affected by what 
> iterators do. Profiling a single scan is much more interesting. Is 
> there a way to profile a single (batch) scan in Accumulo such that I 
> have a complete overview of the entire process of reading and sending 
> back records to the driver?
>
> Thanks,
> Mario
>
> -- 
> Mario Pastorelli| TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone:+41794381682
> email: mario.pastorelli@teralytics.ch 
> <ma...@teralytics.ch>
> www.teralytics.net <http://www.teralytics.net/>
>
> Company registration number: CH-020.3.037.709-7 | Trade register 
> Canton Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, 
> Yann de Vries
>
> This e-mail message contains confidential information which is for the 
> sole attention and use of the intended recipient. Please notify us at 
> once if you think that it may not be intended for you and delete it 
> immediately.
>


Re: Profile a (batch) scan

Posted by Mario Pastorelli <ma...@teralytics.ch>.
Below you can find a good example of what I'm struggling to understand
right now. It's a trace for a simple scan over some columns with a
BatchScanner using 75 threads. The scan takes 877 milliseconds and the main
contributor is the entry "tablet read ahead 1", which starts at 248 ms.
These are the questions that I cannot answer with this trace:

   1. why this heavy operation starts after 248ms? By summing up the delay
   before this operation you get a number which is not even close to 248ms.
   2. what does "tablet read ahead 1" means? In general, how to map the
   entries of a trace to their meaning? Is there a guide about this?
   3. why "tablet read ahead 1" takes 600ms? It's clearly not the sum of
   the entries under this one but that's the important part.
   4. I may be naive but...how much data have been read by this scan? How
   many entries? That's very important to understand what's going on.

Thanks for the help,

Mario

877+ 0 Dice@h01 counts
2+ 7 tserver@h12 startScan
6+ 10 tserver@h15 startScan
5+ 11 tserver@h15 metadata tablets read ahead 4
843+ 34 Dice@h01 batch scanner 74- 1
620+ 230 tserver@h09 startMultiScan
600+ 248 tserver@h09 tablet read ahead 1
22+ 299 tserver@h09 newDFSInputStream
22+ 299 tserver@h09 getBlockLocations
2+ 310 tserver@h09 ClientNamenodeProtocol#getBlockLocations
1+ 321 tserver@h09 getFileInfo
1+ 321 tserver@h09 ClientNamenodeProtocol#getFileInfo
2+ 322 tserver@h09 DFSInputStream#byteArrayRead
1+ 324 tserver@h09 DFSInputStream#byteArrayRead
2+ 831 tserver@h09 DFSInputStream#byteArrayRead
2+ 834 tserver@h09 DFSInputStream#byteArrayRead
1+ 835 tserver@h09 BlockReaderLocal#fillBuffer(1091850413)
1+ 874 tserver@h09 closeMultiScan

On Mon, Aug 29, 2016 at 5:06 PM, Josh Elser <jo...@gmail.com> wrote:

> Right, no, I understand.
>
> I just meant that the metrics we do expose are lacking. It would be a huge
> benefit to everyone if we could find more things that we can expose, do
> that once, and then prevent the need for the next person to hand-roll some
> things like you are doing now :)
>
> Mario Pastorelli wrote:
>
>> I think suggestions in this mailing list are useful, Josh, that's why I
>> keep asking questions. I'm sorry that I'm asking so many questions but
>> I'm trying to improve my knowledge of Accumulo and documentation is
>> limited. Ideally, I would like to use only the metrics provided by
>> Accumulo, because that's less stuff that I have to maintain. The
>> StopWatch writing to the tracer could help.
>>
>>
>>
>> On Sun, Aug 28, 2016 at 10:14 PM, Josh Elser <josh.elser@gmail.com
>> <ma...@gmail.com>> wrote:
>>
>>     I know it's not a super-helpful response, but I would love to help
>>     you work through things we *can* expose and help you do that.
>>
>>     I imagine there is significantly more that we can add into the
>>     dist-tracing information for BatchScanners now which would give more
>>     insight into the tserver (amount of data read, number of ranges per
>>     scan RPC, amount of data returned). This would be ideal as it would
>>     prevent you from having to update your application code (although,
>>     the suggestion of writing some iterator for timing purposes is a
>>     simple way to move forward)
>>
>>     Mario Pastorelli wrote:
>>
>>         I would like to understand the performance of a batch scan and I
>>         would
>>         like to have some hints on how to proceed. I have enabled the
>>         distributed trace, and it tells me that some batch scanner
>>         threads take
>>         much more time than others to complete but this is not helpful
>>         enough
>>         because it's not telling me why some threads take more. My gut
>>         feeling
>>         is that one batch thread is scanning more data than the others,
>>         which
>>         means that the data is not well distributed for a query, but I
>> use a
>>         random shard byte as prefix of the keys which should guarantee
>>         that data
>>         of the same range is almost equally distributed among the
>>         tservers. I
>>         enabled JMX on the tservers and attached jvisualvm to get an
>>         idea of the
>>         state of each tserver but I couldn't find anything meaningful. I
>>         would
>>         like to know if there is a way to profile what's going on on a
>>         single
>>         tserver for a single scan thread and by this I mean:
>>
>>           1. where are the tablets required by a scan? Which tablet
>> server?
>>           2. how fast was the lookups on the index for that scan?
>>           3. how many bytes/records were read for that scan without the
>>         iterators
>>           4. how many seeks are done by the scan and possibly why
>>
>>         The main Accumulo UI is fine to get an overview of Accumulo but
>>         don't
>>         really give you any information about the performance of a
>>         single query
>>         and it seems to me that they are heavily affected by what
>>         iterators do.
>>         Profiling a single scan is much more interesting. Is there a way
>> to
>>         profile a single (batch) scan in Accumulo such that I have a
>>         complete
>>         overview of the entire process of reading and sending back
>>         records to
>>         the driver?
>>
>>         Thanks,
>>         Mario
>>
>>         --
>>         Mario Pastorelli| TERALYTICS
>>
>>         *software engineer*
>>
>>         Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>>         phone:+41794381682 <tel:%2B41794381682>
>>         email: mario.pastorelli@teralytics.ch
>>         <ma...@teralytics.ch>
>>         <mailto:mario.pastorelli@teralytics.ch
>>
>>         <ma...@teralytics.ch>>
>>         www.teralytics.net <http://www.teralytics.net>
>>         <http://www.teralytics.net/>
>>
>>         Company registration number: CH-020.3.037.709-7 | Trade register
>>         Canton
>>         Zurich
>>         Board of directors: Georg Polzer, Luciano Franceschina, Mark
>>         Schmitz,
>>         Yann de Vries
>>
>>         This e-mail message contains confidential information which is
>>         for the
>>         sole attention and use of the intended recipient. Please notify
>>         us at
>>         once if you think that it may not be intended for you and delete
>> it
>>         immediately.
>>
>>
>>
>>
>> --
>> Mario Pastorelli| TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone:+41794381682
>> email: mario.pastorelli@teralytics.ch
>> <ma...@teralytics.ch>
>> www.teralytics.net <http://www.teralytics.net/>
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>> Yann de Vries
>>
>> This e-mail message contains confidential information which is for the
>> sole attention and use of the intended recipient. Please notify us at
>> once if you think that it may not be intended for you and delete it
>> immediately.
>>
>>


-- 
Mario Pastorelli | TERALYTICS

*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton
Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
de Vries

This e-mail message contains confidential information which is for the sole
attention and use of the intended recipient. Please notify us at once if
you think that it may not be intended for you and delete it immediately.

Re: Profile a (batch) scan

Posted by Josh Elser <jo...@gmail.com>.
Right, no, I understand.

I just meant that the metrics we do expose are lacking. It would be a 
huge benefit to everyone if we could find more things that we can 
expose, do that once, and then prevent the need for the next person to 
hand-roll some things like you are doing now :)

Mario Pastorelli wrote:
> I think suggestions in this mailing list are useful, Josh, that's why I
> keep asking questions. I'm sorry that I'm asking so many questions but
> I'm trying to improve my knowledge of Accumulo and documentation is
> limited. Ideally, I would like to use only the metrics provided by
> Accumulo, because that's less stuff that I have to maintain. The
> StopWatch writing to the tracer could help.
>
>
>
> On Sun, Aug 28, 2016 at 10:14 PM, Josh Elser <josh.elser@gmail.com
> <ma...@gmail.com>> wrote:
>
>     I know it's not a super-helpful response, but I would love to help
>     you work through things we *can* expose and help you do that.
>
>     I imagine there is significantly more that we can add into the
>     dist-tracing information for BatchScanners now which would give more
>     insight into the tserver (amount of data read, number of ranges per
>     scan RPC, amount of data returned). This would be ideal as it would
>     prevent you from having to update your application code (although,
>     the suggestion of writing some iterator for timing purposes is a
>     simple way to move forward)
>
>     Mario Pastorelli wrote:
>
>         I would like to understand the performance of a batch scan and I
>         would
>         like to have some hints on how to proceed. I have enabled the
>         distributed trace, and it tells me that some batch scanner
>         threads take
>         much more time than others to complete but this is not helpful
>         enough
>         because it's not telling me why some threads take more. My gut
>         feeling
>         is that one batch thread is scanning more data than the others,
>         which
>         means that the data is not well distributed for a query, but I use a
>         random shard byte as prefix of the keys which should guarantee
>         that data
>         of the same range is almost equally distributed among the
>         tservers. I
>         enabled JMX on the tservers and attached jvisualvm to get an
>         idea of the
>         state of each tserver but I couldn't find anything meaningful. I
>         would
>         like to know if there is a way to profile what's going on on a
>         single
>         tserver for a single scan thread and by this I mean:
>
>           1. where are the tablets required by a scan? Which tablet server?
>           2. how fast was the lookups on the index for that scan?
>           3. how many bytes/records were read for that scan without the
>         iterators
>           4. how many seeks are done by the scan and possibly why
>
>         The main Accumulo UI is fine to get an overview of Accumulo but
>         don't
>         really give you any information about the performance of a
>         single query
>         and it seems to me that they are heavily affected by what
>         iterators do.
>         Profiling a single scan is much more interesting. Is there a way to
>         profile a single (batch) scan in Accumulo such that I have a
>         complete
>         overview of the entire process of reading and sending back
>         records to
>         the driver?
>
>         Thanks,
>         Mario
>
>         --
>         Mario Pastorelli| TERALYTICS
>
>         *software engineer*
>
>         Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>         phone:+41794381682 <tel:%2B41794381682>
>         email: mario.pastorelli@teralytics.ch
>         <ma...@teralytics.ch>
>         <mailto:mario.pastorelli@teralytics.ch
>         <ma...@teralytics.ch>>
>         www.teralytics.net <http://www.teralytics.net>
>         <http://www.teralytics.net/>
>
>         Company registration number: CH-020.3.037.709-7 | Trade register
>         Canton
>         Zurich
>         Board of directors: Georg Polzer, Luciano Franceschina, Mark
>         Schmitz,
>         Yann de Vries
>
>         This e-mail message contains confidential information which is
>         for the
>         sole attention and use of the intended recipient. Please notify
>         us at
>         once if you think that it may not be intended for you and delete it
>         immediately.
>
>
>
>
> --
> Mario Pastorelli| TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone:+41794381682
> email: mario.pastorelli@teralytics.ch
> <ma...@teralytics.ch>
> www.teralytics.net <http://www.teralytics.net/>
>
> Company registration number: CH-020.3.037.709-7 | Trade register Canton
> Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
> Yann de Vries
>
> This e-mail message contains confidential information which is for the
> sole attention and use of the intended recipient. Please notify us at
> once if you think that it may not be intended for you and delete it
> immediately.
>

Re: Profile a (batch) scan

Posted by Mario Pastorelli <ma...@teralytics.ch>.
I think suggestions in this mailing list are useful, Josh, that's why I
keep asking questions. I'm sorry that I'm asking so many questions but I'm
trying to improve my knowledge of Accumulo and documentation is limited.
Ideally, I would like to use only the metrics provided by Accumulo, because
that's less stuff that I have to maintain. The StopWatch writing to the
tracer could help.



On Sun, Aug 28, 2016 at 10:14 PM, Josh Elser <jo...@gmail.com> wrote:

> I know it's not a super-helpful response, but I would love to help you
> work through things we *can* expose and help you do that.
>
> I imagine there is significantly more that we can add into the
> dist-tracing information for BatchScanners now which would give more
> insight into the tserver (amount of data read, number of ranges per scan
> RPC, amount of data returned). This would be ideal as it would prevent you
> from having to update your application code (although, the suggestion of
> writing some iterator for timing purposes is a simple way to move forward)
>
> Mario Pastorelli wrote:
>
>> I would like to understand the performance of a batch scan and I would
>> like to have some hints on how to proceed. I have enabled the
>> distributed trace, and it tells me that some batch scanner threads take
>> much more time than others to complete but this is not helpful enough
>> because it's not telling me why some threads take more. My gut feeling
>> is that one batch thread is scanning more data than the others, which
>> means that the data is not well distributed for a query, but I use a
>> random shard byte as prefix of the keys which should guarantee that data
>> of the same range is almost equally distributed among the tservers. I
>> enabled JMX on the tservers and attached jvisualvm to get an idea of the
>> state of each tserver but I couldn't find anything meaningful. I would
>> like to know if there is a way to profile what's going on on a single
>> tserver for a single scan thread and by this I mean:
>>
>>  1. where are the tablets required by a scan? Which tablet server?
>>  2. how fast was the lookups on the index for that scan?
>>  3. how many bytes/records were read for that scan without the iterators
>>  4. how many seeks are done by the scan and possibly why
>>
>> The main Accumulo UI is fine to get an overview of Accumulo but don't
>> really give you any information about the performance of a single query
>> and it seems to me that they are heavily affected by what iterators do.
>> Profiling a single scan is much more interesting. Is there a way to
>> profile a single (batch) scan in Accumulo such that I have a complete
>> overview of the entire process of reading and sending back records to
>> the driver?
>>
>> Thanks,
>> Mario
>>
>> --
>> Mario Pastorelli| TERALYTICS
>>
>> *software engineer*
>>
>> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
>> phone:+41794381682
>> email: mario.pastorelli@teralytics.ch
>> <ma...@teralytics.ch>
>> www.teralytics.net <http://www.teralytics.net/>
>>
>> Company registration number: CH-020.3.037.709-7 | Trade register Canton
>> Zurich
>> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
>> Yann de Vries
>>
>> This e-mail message contains confidential information which is for the
>> sole attention and use of the intended recipient. Please notify us at
>> once if you think that it may not be intended for you and delete it
>> immediately.
>>
>>


-- 
Mario Pastorelli | TERALYTICS

*software engineer*

Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
phone: +41794381682
email: mario.pastorelli@teralytics.ch
www.teralytics.net

Company registration number: CH-020.3.037.709-7 | Trade register Canton
Zurich
Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann
de Vries

This e-mail message contains confidential information which is for the sole
attention and use of the intended recipient. Please notify us at once if
you think that it may not be intended for you and delete it immediately.

Re: Profile a (batch) scan

Posted by Josh Elser <jo...@gmail.com>.
I know it's not a super-helpful response, but I would love to help you 
work through things we *can* expose and help you do that.

I imagine there is significantly more that we can add into the 
dist-tracing information for BatchScanners now which would give more 
insight into the tserver (amount of data read, number of ranges per scan 
RPC, amount of data returned). This would be ideal as it would prevent 
you from having to update your application code (although, the 
suggestion of writing some iterator for timing purposes is a simple way 
to move forward)

Mario Pastorelli wrote:
> I would like to understand the performance of a batch scan and I would
> like to have some hints on how to proceed. I have enabled the
> distributed trace, and it tells me that some batch scanner threads take
> much more time than others to complete but this is not helpful enough
> because it's not telling me why some threads take more. My gut feeling
> is that one batch thread is scanning more data than the others, which
> means that the data is not well distributed for a query, but I use a
> random shard byte as prefix of the keys which should guarantee that data
> of the same range is almost equally distributed among the tservers. I
> enabled JMX on the tservers and attached jvisualvm to get an idea of the
> state of each tserver but I couldn't find anything meaningful. I would
> like to know if there is a way to profile what's going on on a single
> tserver for a single scan thread and by this I mean:
>
>  1. where are the tablets required by a scan? Which tablet server?
>  2. how fast was the lookups on the index for that scan?
>  3. how many bytes/records were read for that scan without the iterators
>  4. how many seeks are done by the scan and possibly why
>
> The main Accumulo UI is fine to get an overview of Accumulo but don't
> really give you any information about the performance of a single query
> and it seems to me that they are heavily affected by what iterators do.
> Profiling a single scan is much more interesting. Is there a way to
> profile a single (batch) scan in Accumulo such that I have a complete
> overview of the entire process of reading and sending back records to
> the driver?
>
> Thanks,
> Mario
>
> --
> Mario Pastorelli| TERALYTICS
>
> *software engineer*
>
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland
> phone:+41794381682
> email: mario.pastorelli@teralytics.ch
> <ma...@teralytics.ch>
> www.teralytics.net <http://www.teralytics.net/>
>
> Company registration number: CH-020.3.037.709-7 | Trade register Canton
> Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz,
> Yann de Vries
>
> This e-mail message contains confidential information which is for the
> sole attention and use of the intended recipient. Please notify us at
> once if you think that it may not be intended for you and delete it
> immediately.
>