Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2011/01/30 21:41:35 UTC

write performance degradation

Using 0.90 RC3 on our 7 node cluster, the first time (after cluster start)
our map tasks ran quite fast writing to hbase (client side buffer size being
17MB).

If I run the same hadoop job soon after the first finishes, there is
noticeable slow down.

Connections to the zookeeper quorum weren't high - 71, including connections
from region servers.

On the region server(s), jps didn't show the map task processes although ps did.

jstack reported:

13331: Unable to open socket file: target process not responding or HotSpot
VM not loaded
The -F option can be used when the target process is not responding

Advice for further debugging is welcome.
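
For reference, here is a minimal sketch of how a client-side write buffer of
that size is typically set with the 0.90 client API. The table name, column
family and values are placeholders, not our actual schema:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWriteSketch {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "test_table");   // placeholder table name
    table.setAutoFlush(false);                       // buffer Puts on the client side
    table.setWriteBufferSize(17 * 1024 * 1024);      // 17MB, the size mentioned above

    Put put = new Put(Bytes.toBytes("row1"));
    put.add(Bytes.toBytes("cf"), Bytes.toBytes("q"), Bytes.toBytes("value"));
    table.put(put);        // queued client-side until the buffer fills

    table.flushCommits();  // push any remaining buffered edits to the region servers
    table.close();
  }
}

A larger buffer means fewer round trips but more client memory and bigger
bursts arriving at the region servers.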

Re: write performance degradation

Posted by Ted Yu <yu...@gmail.com>.
Right now my bigger concern is the following:

java.io.IOException: All datanodes 10.202.50.79:50010 are bad. Aborting...
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2603)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2139)
	at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2306)
2011-01-31 05:42:36,927 WARN org.apache.hadoop.mapred.TaskTracker:
Error running child
java.lang.RuntimeException: java.io.IOException: All datanodes
10.202.50.79:50010 are bad. Aborting...

hadoop-core-0.20-append-r1056497.jar is newer than the jar from cdh3b2.
Some users are using it, but I haven't gotten it working on our cdh3b2 installation.

On Sun, Jan 30, 2011 at 10:24 PM, tsuna <ts...@gmail.com> wrote:

> It could be useful to see the output of "vmstat 1" (let it run for at
> least 30 seconds while the problem is occurring and pastebin the
> result).  If the problem seems i/o related, the output of "iostat -xkd
> 1" can also help.
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: write performance degradation

Posted by tsuna <ts...@gmail.com>.
On Tue, Feb 1, 2011 at 7:40 AM, Ted Yu <yu...@gmail.com> wrote:
> Thanks for the reminder.
>
> After adjusting swappiness to 2 on each node, I see:
> http://pastebin.com/tFcAj7rw
>
> Client buffer is now 10MB.
> But map tasks seem to be slower than before.

The output of vmstat doesn't show anything useful here; the machine is
100% idle up until the 3rd section, where it's reading a bit from disk,
but it's still mostly idle.  Something else that we don't see in this
trace is amiss.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: write performance degradation

Posted by Ted Yu <yu...@gmail.com>.
Thanks for the reminder.

After adjusting swappiness to 2 on each node, I see:
http://pastebin.com/tFcAj7rw

Client buffer is now 10MB.
But map tasks seem to be slower than before.

On Mon, Jan 31, 2011 at 11:54 PM, tsuna <ts...@gmail.com> wrote:

> On Mon, Jan 31, 2011 at 10:25 AM, Ted Yu <yu...@gmail.com> wrote:
> > Here is output from vmstat:
> > http://pastebin.com/VeGC5KQr
>
> This indicates that you're actively bringing in things from swap,
> which kills performance.  The columns "si" (swap in) and "so" (swap
> out) should ideally always be 0.
>
> What's printed by /sbin/sysctl vm.swappiness ?
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: write performance degradation

Posted by tsuna <ts...@gmail.com>.
On Mon, Jan 31, 2011 at 10:25 AM, Ted Yu <yu...@gmail.com> wrote:
> Here is output from vmstat:
> http://pastebin.com/VeGC5KQr

This indicates that you're actively bringing in things from swap,
which kills performance.  The columns "si" (swap in) and "so" (swap
out) should ideally always be 0.

What's printed by /sbin/sysctl vm.swappiness ?

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: write performance degradation

Posted by Ted Yu <yu...@gmail.com>.
Here is output from vmstat:
http://pastebin.com/VeGC5KQr

Here is output from iostat:
http://pastebin.com/iCMNSZcm

On Sun, Jan 30, 2011 at 10:24 PM, tsuna <ts...@gmail.com> wrote:

> It could be useful to see the output of "vmstat 1" (let it run for at
> least 30 seconds while the problem is occurring and pastebin the
> result).  If the problem seems i/o related, the output of "iostat -xkd
> 1" can also help.
>
> --
> Benoit "tsuna" Sigoure
> Software Engineer @ www.StumbleUpon.com
>

Re: write performance degradation

Posted by tsuna <ts...@gmail.com>.
It could be useful to see the output of "vmstat 1" (let it run for at
least 30 seconds while the problem is occurring and pastebin the
result).  If the problem seems i/o related, the output of "iostat -xkd
1" can also help.

-- 
Benoit "tsuna" Sigoure
Software Engineer @ www.StumbleUpon.com

Re: write performance degradation

Posted by Ted Yu <yu...@gmail.com>.
I will set up Ganglia on the dev cluster.

There wasn't any sign of GC pauses in the region server logs, BTW.

On Sun, Jan 30, 2011 at 2:59 PM, Stack <st...@duboce.net> wrote:

> On Sun, Jan 30, 2011 at 5:49 PM, Ted Yu <yu...@gmail.com> wrote:
> > RC3 was promoted to 0.90 release, so I didn't expect much difference.
>
> Pardon me.  You are right.
>
> > Here is output from top on one node:
> >
> > top - 22:42:26 up 104 days, 20:38,  1 user,  load average: 1.04, 1.34,
> 2.13
>
> Load is low.
>
>
> > Tasks: 152 total,   1 running, 151 sleeping,   0 stopped,   0 zombie
> > Cpu(s):  0.5%us,  0.2%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.4%si,
> > 0.0%st
> > Mem:  16440036k total, 15532616k used,   907420k free,    42472k buffers
> > Swap: 16777208k total,  5719488k used, 11057720k free,  4530368k cached
> >
>
> Is this swap active?  Are you swapping?
>
> Regarding the rest, it's hard to offer advice based on a single
> snapshot.  Do you have ganglia up and running on this cluster?
>
> St.Ack
>

Re: write performance degradation

Posted by Stack <st...@duboce.net>.
On Sun, Jan 30, 2011 at 5:49 PM, Ted Yu <yu...@gmail.com> wrote:
> RC3 was promoted to 0.90 release, so I didn't expect much difference.

Pardon me.  You are right.

> Here is output from top on one node:
>
> top - 22:42:26 up 104 days, 20:38,  1 user,  load average: 1.04, 1.34, 2.13

Load is low.


> Tasks: 152 total,   1 running, 151 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.5%us,  0.2%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.4%si,
> 0.0%st
> Mem:  16440036k total, 15532616k used,   907420k free,    42472k buffers
> Swap: 16777208k total,  5719488k used, 11057720k free,  4530368k cached
>

Is this swap active?  Are you swapping?

Regarding the rest, it's hard to offer advice based on a single
snapshot.  Do you have ganglia up and running on this cluster?

St.Ack

Re: write performance degradation

Posted by Ted Yu <yu...@gmail.com>.
RC3 was promoted to the 0.90 release, so I didn't expect much difference.
I also tried a 0.90 build which included newer patches such as HBASE-3455
and observed the same result.

The cluster only runs one job at a time.

The client buffer size for our 0.20.6 cluster was set to 20MB and I didn't
see a similar issue.

Here is output from top on one node:

top - 22:42:26 up 104 days, 20:38,  1 user,  load average: 1.04, 1.34, 2.13
Tasks: 152 total,   1 running, 151 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.5%us,  0.2%sy,  0.0%ni, 98.8%id,  0.0%wa,  0.0%hi,  0.4%si,  0.0%st
Mem:  16440036k total, 15532616k used,   907420k free,    42472k buffers
Swap: 16777208k total,  5719488k used, 11057720k free,  4530368k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15261 hadoop    25   0 4776m 1.0g 9312 S  2.0  6.7 133:27.04 java
16442 hadoop    21   0 1625m 760m  14m S  1.3  4.7   0:39.28 java
16453 hadoop    18   0 4690m 3.4g 9712 S  1.3 21.4 311:49.25 java

15261 DataNode
19442 map task
16453 HRegionServer

---------
[sjc1-hadoop3.sjc1:hadoop 1298]iostat
Linux 2.6.18-53.el5 (sjc1-hadoop3.sjc1.carrieriq.com)   01/30/2011

avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           4.56    0.00    0.48    0.60    0.00   94.35

Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda               6.15       403.62       411.58 3656841492 3728977064
sdb               7.64       131.10       303.53 1187817806 2750029496
sdc               7.38       396.23       387.12 3589886886 3507388096
sdd               6.35       403.99       399.77 3660260454 3621972144
dm-0             22.51        37.78       173.52  342296914 1572156872
dm-1             27.92        93.32       130.01  845519592 1177872624

On Sun, Jan 30, 2011 at 2:34 PM, Stack <st...@duboce.net> wrote:

> On Sun, Jan 30, 2011 at 3:41 PM, Ted Yu <yu...@gmail.com> wrote:
> > Using 0.90 RC3 on our 7 node cluster, the first time (after cluster
> start)
> > our map tasks ran quite fast writing to hbase (client side buffer size
> being
> > 17MB).
> >
>
> Can you come up on release 0.90.0, Ted?
>
> 17MB client buffer might be a little large.
>
>
> > If I run the same hadoop job soon after the first finishes, there is
> > noticeable slow down.
> >
>
> What do you see in your ganglia?  Anything else running on the cluster at
> the time?  What are the machines doing?  CPU, iowait?  Swapping?
>
> Thanks,
> St.Ack
>

Re: write performance degradation

Posted by Stack <st...@duboce.net>.
On Sun, Jan 30, 2011 at 3:41 PM, Ted Yu <yu...@gmail.com> wrote:
> Using 0.90 RC3 on our 7 node cluster, the first time (after cluster start)
> our map tasks ran quite fast writing to hbase (client side buffer size being
> 17MB).
>

Can you come up on release 0.90.0, Ted?

17MB client buffer might be a little large.
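
For comparison, the shipped default for hbase.client.write.buffer is much
smaller (around 2MB). A rough sketch of dialing the buffer down through the
client configuration instead of per-table code (the table name is a
placeholder, and I haven't tried this exact snippet):

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;

public class SmallerWriteBuffer {
  public static void main(String[] args) throws IOException {
    Configuration conf = HBaseConfiguration.create();
    // Applies to every HTable created from this conf object.
    conf.setLong("hbase.client.write.buffer", 2L * 1024 * 1024);  // back near the default
    HTable table = new HTable(conf, "test_table");                // placeholder table name
    table.close();
  }
}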


> If I run the same hadoop job soon after the first finishes, there is
> noticeable slow down.
>

What do you see in your ganglia?  Anything else running on the cluster at
the time?  What are the machines doing?  CPU, iowait?  Swapping?

Thanks,
St.Ack