Posted to user@hbase.apache.org by pob <pe...@gmail.com> on 2011/05/06 19:12:23 UTC

Write performance

Hello,


I set up a cluster of 4 nodes with HDFS + HBase (one node runs the namenode,
HBase master, and ZooKeeper; the remaining three each run a datanode +
regionserver).

I was trying to test the maximum write throughput with Yahoo's YCSB. I got
at most 25,000 ops/s while inserting 1 KB records (one row with one column).


What's really strange to me is that I ran the same test with the same
configuration, once with RF=1 and then with RF=3 on HDFS. The results were
about the same. Why? Replication should add overhead, so how can they be
the same?

Then I doubled the cluster with 3 more datanodes and regionservers. I ran the
same test and got the same ops/s as with 3 nodes. Why? Isn't the point of a
distributed DB to scale, so that double the nodes gives roughly double the
write speed?


To rule out a client bottleneck, I wrote a simple inserter and tried
inserting from 3 different nodes into the cluster, but it doesn't help. (I
pre-split the regions and tried to get an even load; while monitoring with
jconsole, the load across the three regionservers was about 25:50:25.) The
results were the same as with one client.


I found there is a performance utility shipped with HBase,
PerformanceEvaluation.java, but I can't find any information on how to
compile it.


Thanks for any ideas.

Re: Write performance

Posted by Ted Dunning <td...@maprtech.com>.

It sounds to me like you have a different bottleneck than the file
system.  25K inserts x 1KB is only 25MB per second.  Even HDFS can
write that much with replication.

The issue is much more likely to do with the cost of sync'ing the logs.
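
The arithmetic behind that estimate can be checked with a short sketch. The
25,000 ops/s, 1 KB record size, and three datanodes come from the original
post; treating 1 KB as 1000 bytes and ignoring key, WAL, and compaction
overhead are simplifying assumptions, so the real disk traffic would be
somewhat higher.

```python
# Back-of-the-envelope check of the numbers quoted in this thread.
OPS_PER_SEC = 25_000   # reported YCSB insert rate
RECORD_BYTES = 1_000   # "1KB" record, taken as 1000 bytes (assumption)
DATANODES = 3          # datanode + regionserver machines in the first setup


def write_rates_mb_per_sec(replication_factor, datanodes=DATANODES):
    """Return (payload, total replicated, per-datanode) write rates in MB/s."""
    payload = OPS_PER_SEC * RECORD_BYTES / 1e6
    replicated = payload * replication_factor
    return payload, replicated, replicated / datanodes


for rf in (1, 3):
    payload, replicated, per_node = write_rates_mb_per_sec(rf)
    print(f"RF={rf}: payload {payload:.0f} MB/s, "
          f"replicated {replicated:.0f} MB/s, "
          f"~{per_node:.1f} MB/s per datanode")
```

Under these assumptions each datanode absorbs roughly 8 MB/s at RF=1 and
25 MB/s at RF=3, which is consistent with raw HDFS write bandwidth not being
the limiting factor in either run.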

On Fri, May 6, 2011 at 12:56 PM, lohit <lo...@gmail.com> wrote:
>> ..
>> What's really strange to me is that I ran the same test with the same
>> configuration, once with RF=1 and then with RF=3 on HDFS. The results were
>> about the same. Why? Replication should add overhead, so how can they be
>> the same?
>>
>
> By default YCSB disables autoflush and increases the write buffer size to a
> large amount.
> My guess is that this could be the reason you do not see much difference
> when changing replication.
> Also, you might want to increase the record size and then look at the real
> replication overhead.
>

Re: Write performance

Posted by lohit <lo...@gmail.com>.

2011/5/6 pob <pe...@gmail.com>

> Hello,
>
>
> I set up a cluster of 4 nodes with HDFS + HBase (one node runs the namenode,
> HBase master, and ZooKeeper; the remaining three each run a datanode +
> regionserver).
>
> I was trying to test the maximum write throughput with Yahoo's YCSB. I got
> at most 25,000 ops/s while inserting 1 KB records (one row with one column).
>
>
> What's really strange to me is that I ran the same test with the same
> configuration, once with RF=1 and then with RF=3 on HDFS. The results were
> about the same. Why? Replication should add overhead, so how can they be
> the same?
>

By default YCSB disables autoflush and increases the write buffer size to a
large amount.
My guess is that this could be the reason you do not see much difference
when changing replication.
Also, you might want to increase the record size and then look at the real
replication overhead.
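
To illustrate why client-side buffering can mask per-write costs, here is a
toy model that only counts simulated round trips. It is not the HBase client:
the BufferedWriter class, the round_trips helper, and the 12 MB buffer size
are hypothetical and chosen for illustration. In the HBase client of that era
the corresponding knobs were, as far as I know, HTable.setAutoFlush(false) and
HTable.setWriteBufferSize(...).

```python
class BufferedWriter:
    """Toy model of a client that batches puts before sending them.

    Not the HBase client; it only counts how many round trips a given
    buffer size would need.
    """

    def __init__(self, buffer_bytes, auto_flush=False):
        self.buffer_bytes = buffer_bytes
        self.auto_flush = auto_flush
        self.pending = 0       # bytes waiting in the client-side buffer
        self.round_trips = 0   # simulated RPCs sent to the server

    def put(self, record_bytes):
        self.pending += record_bytes
        # With autoflush every put is sent at once; otherwise wait
        # until the buffer is full.
        if self.auto_flush or self.pending >= self.buffer_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.round_trips += 1
            self.pending = 0


def round_trips(n_records, record_bytes, buffer_bytes, auto_flush):
    """Count simulated round trips needed to write n_records."""
    writer = BufferedWriter(buffer_bytes, auto_flush)
    for _ in range(n_records):
        writer.put(record_bytes)
    writer.flush()  # send whatever is left in the buffer
    return writer.round_trips


# 100,000 records of 1000 bytes each, with an assumed 12 MB buffer.
print(round_trips(100_000, 1_000, 12_000_000, auto_flush=True))
print(round_trips(100_000, 1_000, 12_000_000, auto_flush=False))
```

In this model the autoflush run needs one round trip per record (100,000),
while the buffered run needs 9. With so few round trips, the latency added to
each individual write by extra replicas is amortized over thousands of
records, which is one way the RF=1 and RF=3 runs could look alike from the
client side.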





-- 
Have a Nice Day!
Lohit