Posted to user@accumulo.apache.org by Ara Ebrahimi <ar...@argyledata.com> on 2015/02/07 03:34:08 UTC

hdfs cpu usage

Hi,

We’re seeing some weird behavior from the HDFS daemon in a Google Cloud environment when we use an Accumulo Scanner to sequentially scan a table. top reports 200-300% CPU usage for the HDFS daemon, and Accumulo is also around 500%. Meanwhile iostat shows low %util, low avgrq-sz and low rMB/s, and there’s lots of free memory. It seems like something causes the HDFS daemon to consume a lot of CPU without sending enough read requests to the disk (an SSD actually, so the disk is super fast and vastly under-utilized). The process which sends scan requests to Accumulo is 500% active (using 3 query batch threads and aggressive scan-batch-size/read-ahead-threshold values). So it seems like HDFS is somehow the bottleneck. On another cluster we rarely see the HDFS daemon going over 10% CPU usage. Any idea what the issue could be?

Thanks,
Ara.



________________________________

This message is for the designated recipient only and may contain privileged, proprietary, or otherwise confidential information. If you have received it in error, please notify the sender immediately and delete the original. Any other use of the e-mail by you is prohibited. Thank you in advance for your cooperation.

________________________________
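
The iostat figures quoted above (%util, avgrq-sz, rMB/s) can be pulled out of `iostat -x` output with a few lines of Python. This is only a sketch: the function names, the 80% threshold, and the sample text are illustrative assumptions, not output from the cluster discussed in this thread. Columns are looked up by name from the header row because the column set differs between sysstat versions.

```python
def parse_iostat_extended(text):
    """Parse `iostat -x` device rows into {device: {column: float}}.

    Columns are located by name from the header row, since the exact
    column set differs between sysstat versions.
    """
    devices = {}
    columns = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            columns = None  # a blank line ends the current device table
            continue
        if fields[0].rstrip(":") == "Device":
            columns = fields[1:]
            continue
        if columns is not None and len(fields) == len(columns) + 1:
            devices[fields[0]] = dict(zip(columns, map(float, fields[1:])))
    return devices


def looks_disk_bound(stats, util_threshold=80.0):
    """Hypothetical rule of thumb: treat a device as busy if %util is high."""
    return stats.get("%util", 0.0) >= util_threshold


# Made-up sample in the older sysstat layout (illustrative numbers only).
SAMPLE = """\
Device:  rrqm/s wrqm/s   r/s   w/s  rMB/s  wMB/s avgrq-sz avgqu-sz await svctm %util
sdb        0.00   0.00 120.0  2.00   1.50   0.02    25.00     0.05  0.40  0.10  1.20
"""

if __name__ == "__main__":
    for dev, stats in parse_iostat_extended(SAMPLE).items():
        print(dev, stats["%util"], stats["avgrq-sz"], looks_disk_bound(stats))
```

A device that stays far below the threshold while the daemons burn several cores matches the symptom described in the message: the time is going somewhere other than waiting on the disk.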

Re: hdfs cpu usage

Posted by Ara Ebrahimi <ar...@argyledata.com>.
Hi,

On all nodes. Nope, this is the behavior we’ve been seeing since we started testing a few weeks ago. Yes, the behavior persists after restarts.

Ara.

On Feb 7, 2015, at 10:50 AM, Keith Turner <ke...@deenlo.com> wrote:

Is this happening on just one node, or all nodes?  Have you run w/o problem on google cloud env before?  Do you see the problem if you restart the vms?


Re: hdfs cpu usage

Posted by Keith Turner <ke...@deenlo.com>.
Is this happening on just one node, or all nodes?  Have you run w/o problem
on google cloud env before?  Do you see the problem if you restart the vms?


Re: hdfs cpu usage

Posted by Ara Ebrahimi <ar...@argyledata.com>.
Nope. This is for a full table scan. The same config in our on-premises cluster performs well, while on Google Cloud we see this weird HDFS CPU usage issue.

Ara.

> On Feb 9, 2015, at 9:31 AM, Adam Fuchs <af...@apache.org> wrote:
>
> Ara,
>
> What kind of query load are you generating within your batch scanners?
> Are you using an iterator that seeks around a lot? Are you grabbing
> many small batches (only a few keys per range) from the batch scanner?
> As a wild guess, this could be the result of lots of seeks with a low
> cache hit rate, which would induce CPU load in HDFS fetching blocks
> and CPU load in Accumulo decrypting/decompressing those blocks. The
> monitor page will show you seek rates and cache hit rates.
>
> Adam

Re: hdfs cpu usage

Posted by Adam Fuchs <af...@apache.org>.
Ara,

What kind of query load are you generating within your batch scanners?
Are you using an iterator that seeks around a lot? Are you grabbing
many small batches (only a few keys per range) from the batch scanner?
As a wild guess, this could be the result of lots of seeks with a low
cache hit rate, which would induce CPU load in HDFS fetching blocks
and CPU load in Accumulo decrypting/decompressing those blocks. The
monitor page will show you seek rates and cache hit rates.

Adam
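
As a rough illustration of the guess above, the number of block reads that reach HDFS can be estimated from the seek rate and the cache hit rate shown on the monitor page. This is a back-of-envelope sketch, not a model of Accumulo internals: the function name and the assumption that each cache miss costs a fixed number of block fetches (`blocks_per_miss`) are hypothetical.

```python
def hdfs_block_fetches_per_sec(seeks_per_sec, cache_hit_rate, blocks_per_miss=1.0):
    """Estimate block fetches per second that miss the cache and go to HDFS.

    Assumes (hypothetically) that every cache miss triggers a fixed number
    of block fetches, each of which must also be decompressed by the
    tablet server.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    if seeks_per_sec < 0 or blocks_per_miss < 0:
        raise ValueError("rates must be non-negative")
    return seeks_per_sec * (1.0 - cache_hit_rate) * blocks_per_miss


if __name__ == "__main__":
    # Same seek rate, two hit rates: the lower hit rate sends
    # proportionally more fetches to HDFS.
    for hit_rate in (0.75, 0.25):
        print(hit_rate, hdfs_block_fetches_per_sec(1000, hit_rate))
```

Under these assumptions, dropping the hit rate from 75% to 25% at a constant seek rate triples the fetches that land on HDFS, which is the kind of CPU load the guess describes.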



Re: hdfs cpu usage

Posted by Ara Ebrahimi <ar...@argyledata.com>.
2.4.0.2.1.

Yeah, seems like I need to do that. I was hoping I’d get some advice based on prior experience with the Google Cloud environment.

Ara.

On Feb 7, 2015, at 11:23 AM, Josh Elser <jo...@gmail.com> wrote:

What version of Hadoop are you using?

Have you considered hooking up a profiler to the Datanode on GCE to see
where the time is being spent? That might help shed some light on the
situation.


Re: hdfs cpu usage

Posted by Josh Elser <jo...@gmail.com>.
What version of Hadoop are you using?

Have you considered hooking up a profiler to the Datanode on GCE to see 
where the time is being spent? That might help shed some light on the 
situation.
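
Short of attaching a full profiler, one lightweight way to see where a JVM spends CPU is to list its busiest threads with `top -H -p <pid>` and look them up in a `jstack <pid>` dump. top shows thread ids in decimal, while jstack prints them as hexadecimal `nid=0x...` values. The helper below does that lookup. It is a sketch only: the function name and the sample dump are made up for illustration and are not taken from the Datanode in this thread.

```python
import re

# Matches a jstack thread header line and captures the thread name and native id.
THREAD_HEADER = re.compile(r'^"(?P<name>.*)" .*\bnid=0x(?P<nid>[0-9a-fA-F]+)')


def find_thread(jstack_text, os_thread_id):
    """Return (thread name, stack lines) for a decimal OS thread id, or None."""
    name, stack, matched = None, [], False
    for line in jstack_text.splitlines():
        header = THREAD_HEADER.match(line)
        if header:
            if matched:
                break  # reached the next thread; the wanted block is complete
            matched = int(header.group("nid"), 16) == os_thread_id
            name, stack = header.group("name"), []
        elif matched:
            if not line.strip():
                break  # a blank line ends the thread's stack
            stack.append(line.strip())
    return (name, stack) if matched else None


# Made-up dump for illustration; thread and class names are hypothetical.
SAMPLE = '''\
"DataXceiver for client (hypothetical)" daemon prio=10 tid=0x00007f0000000001 nid=0x1a2b runnable [0x00007f0000001000]
   java.lang.Thread.State: RUNNABLE
        at example.Hypothetical.read(Hypothetical.java:1)

"IPC Server handler 0" daemon prio=10 tid=0x00007f0000000002 nid=0x1a2c waiting on condition [0x00007f0000002000]
   java.lang.Thread.State: WAITING (parking)
'''

if __name__ == "__main__":
    # 6699 decimal is 0x1a2b, the id top -H would show for the first thread.
    print(find_thread(SAMPLE, 6699))
```

Taking a few dumps several seconds apart and checking which frames keep showing up in the hot threads gives a crude picture of where the time goes; a sampling profiler provides the same information with proper statistics.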
