Posted to user@hbase.apache.org by abhishek1015 <ab...@gmail.com> on 2014/09/27 06:55:49 UTC

hbase row locking

Hello everyone,

I have been running some experiments to see the effect of locking overhead
in HBase. For this, I am comparing the throughput difference between these
two schemas.

Schema1:
rowkey-> <userid>
columnkey-> cf:<orderid>
value-> blob of 2000 characters

Schema2:
rowkey-> <userid>:<orderid>
columnkey-> cf:order
value-> blob of 2000 characters

Now, for generating data, I generate a userid uniformly between 1 and
max_num_user, then look up the last generated orderid for that user in a
hashmap and generate the next orderid. I have extended YCSB to generate this
workload.
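
Roughly, the insert loop looks like the following (a simplified sketch against
the 0.98 client API rather than my actual YCSB extension; class and table
names here are only illustrative):

import java.util.HashMap;
import java.util.Map;
import java.util.Random;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class OrderLoader {
  private static final byte[] CF = Bytes.toBytes("cf");

  public static void main(String[] args) throws Exception {
    int maxNumUser = 10;
    long numOps = 1000000L;
    boolean useWideSchema = args.length > 0 && Boolean.parseBoolean(args[0]); // schema1 (wide) vs schema2 (tall)
    String tableName = useWideSchema ? "orders_schema1" : "orders_schema2";   // illustrative names

    Random rnd = new Random();
    Map<Integer, Long> lastOrderId = new HashMap<Integer, Long>();  // last orderid per user
    byte[] blob = new byte[2000];                                   // 2000-character value

    HTable table = new HTable(HBaseConfiguration.create(), tableName);
    for (long i = 0; i < numOps; i++) {
      int userId = 1 + rnd.nextInt(maxNumUser);        // userid uniform in [1, max_num_user]
      Long prev = lastOrderId.get(userId);
      long orderId = (prev == null) ? 1 : prev + 1;    // next orderid for this user
      lastOrderId.put(userId, orderId);

      Put put;
      if (useWideSchema) {
        // Schema1: rowkey = <userid>, qualifier = <orderid>  (one wide row per user)
        put = new Put(Bytes.toBytes(Integer.toString(userId)));
        put.add(CF, Bytes.toBytes(Long.toString(orderId)), blob);
      } else {
        // Schema2: rowkey = <userid>:<orderid>, qualifier = "order"  (one row per order)
        put = new Put(Bytes.toBytes(userId + ":" + orderId));
        put.add(CF, Bytes.toBytes("order"), blob);
      }
      table.put(put);
      table.flushCommits();   // force the RPC after every put
    }
    table.close();
  }
}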

I am running the experiment with max_num_user=10 and number of insert
operations = 1 million, so the average number of orders per user is 100k. I
was hoping to see decreased throughput for schema1 due to locking. However, I
am observing the same throughput in the two cases, and I am not sure how to
explain this. Given that I have 8 server threads on each region server and the
10 rows of schema1 are distributed across 4 nodes, it is highly likely that
there will be lock contention.

I checked that I am doing flushCommits() after every put operation. Am I
missing anything?

Thanks for your help.

Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by abhishek1015 <ab...@gmail.com>.
Thanks Lars for sharing HBASE-4528. I was not aware of this implementation
optimization.
Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064988.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by lars hofhansl <la...@apache.org>.
What are you trying to prove/observe, Abhishek?
HBase does not hold the row lock while the WAL is sync'ed (see my previous response), so network settings would have no bearing on how long the row locks are held. See HBASE-4528. You'd have to go back to a 0.92 release to still observe that effect.

-- Lars

     From: abhishek1015 <ab...@gmail.com>
 To: user@hbase.apache.org 
 Sent: Saturday, October 11, 2014 9:39 PM
 Subject: Re: hbase row locking
   
I set the durability of both tables to SYNC_WAL and changed the network delay
using the 'tc' command from 0.1 ms to 20 ms to 200 ms. I see throughputs of 1500,
58, and 6 ops/sec per regionserver respectively, for both schemas. This was
unexpected, as I was certainly hoping to see row-lock contention in the wide-table case.

Thanks
Abhishek 



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064972.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by abhishek1015 <ab...@gmail.com>.
I set the durability of both tables to SYNC_WAL and changed the network delay
using the 'tc' command from 0.1 ms to 20 ms to 200 ms. I see throughputs of 1500,
58, and 6 ops/sec per regionserver respectively, for both schemas. This was
unexpected, as I was certainly hoping to see row-lock contention in the wide-table case.
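
Roughly, the durability change looked like this (a sketch against the 0.98
admin API, not the exact code I ran; the table name is illustrative):

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Durability;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SetSyncWal {
  public static void main(String[] args) throws Exception {
    HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
    TableName name = TableName.valueOf("orders_schema1");   // illustrative table name
    HTableDescriptor desc = admin.getTableDescriptor(name);
    desc.setDurability(Durability.SYNC_WAL);                // sync the WAL on every mutation
    admin.disableTable(name);
    admin.modifyTable(name, desc);
    admin.enableTable(name);
    admin.close();
    // Alternatively, durability can be overridden per mutation with Put#setDurability.
  }
}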

Thanks
Abhishek 



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064972.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by Andrew Purtell <an...@gmail.com>.
> When HDFS gets tiered storage, we can revive this and put HBase's WAL on SSD storage.

I think the archival storage work just committed (and ported into branch-2?) might be sufficient for a pilot of HBASE-6572. A few important changes were filed late, like APIs for applications to set policies and persistence of policy changes in the edit log, so we should survey what is in there exactly.

Related: HDFS-5682



On Sep 28, 2014, at 1:32 PM, lars hofhansl <la...@apache.org> wrote:

>> Is anyone aware of any company that does not
> use the HDFS default policy and flushes to disk on every WAL sync?
> 
> 
> It's a trade-off. You'll only lose data when the wrong three machines die around the same time (and you'd have an outage anyway for any block that exists only on those three boxes). Note also that the time the data is not yet on disk is bounded: eventually the OS flushes dirty pages; Linux does it every 15s by default.
> 
> So you'd need all of those machines to die before a single one of them manages to flush the dirty pages to disk. Of course that can happen, for example during a data center power outage.
> 
> 
> A while ago I added HDFS-744 to HDFS, but never finished the parts in HBase, as nobody (including myself in the end) was interested in it. This reminds me to maybe take it up again in HBase 2.0, since we now support fewer versions of Hadoop.
> 
> 
> When HDFS gets tiered storage, we can revive this and put HBase's WAL on SSD storage.
> 
> 
> -- Lars
> 
> 
> 
> ----- Original Message -----
> From: abhishek1015 <ab...@gmail.com>
> To: user@hbase.apache.org
> Cc: 
> Sent: Sunday, September 28, 2014 9:13 AM
> Subject: Re: hbase row locking
> 
> Sorry for the confusion. I meant that I am getting 6000 ops/sec throughput
> overall using 4 machines; that is, 1500 ops/sec/regionserver on average.
> 
> I checked the ping response time between machines. It is approximately 0.09
> ms.
> 
> Assuming that the WAL sync thread syncs with the two other HDFS nodes
> sequentially, the row lock would be held for at least 0.18 ms, which would
> still allow a very high throughput per regionserver even if only one thread
> is working and all other threads are blocked on the lock.
> 
> It appears that the bottleneck is then the HDFS disk flush, and consequently
> the above-mentioned schemas are equivalent w.r.t. performance.
> 
> However, I have a question regarding the default HDFS policy of not flushing
> to disk on every WAL sync. Aren't people in industry afraid of data loss,
> however small the probability of it happening? Is anyone aware of any company
> that does not use the HDFS default policy and flushes to disk on every WAL sync?
> 
> Thanks
> Abhishek
> 
> 
> 
> --
> View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064458.html
> 
> 
> 
> Sent from the HBase User mailing list archive at Nabble.com.
> 

Re: hbase row locking

Posted by lars hofhansl <la...@apache.org>.
> Is anyone aware of any company that does not
use the HDFS default policy and flushes to disk on every WAL sync?


It's a trade-off. You'll only lose data when the wrong three machines die around the same time (and you'd have an outage anyway for any block that exists only on those three boxes). Note also that the time the data is not yet on disk is bounded: eventually the OS flushes dirty pages; Linux does it every 15s by default.

So you'd need all of those machines to die before a single one of them manages to flush the dirty pages to disk. Of course that can happen, for example during a data center power outage.


A while ago I added HDFS-744 to HDFS, but never finished the parts in HBase, as nobody (including myself in the end) was interested in it. This reminds me to maybe take it up again in HBase 2.0, since we now support fewer versions of Hadoop.


When HDFS gets tiered storage, we can revive this and put HBase's WAL on SSD storage.


-- Lars



----- Original Message -----
From: abhishek1015 <ab...@gmail.com>
To: user@hbase.apache.org
Cc: 
Sent: Sunday, September 28, 2014 9:13 AM
Subject: Re: hbase row locking

Sorry for the confusion. I meant that I am getting 6000 ops/sec throughput
overall using 4 machines; that is, 1500 ops/sec/regionserver on average.

I checked the ping response time between machines. It is approximately 0.09
ms.

Assuming that the WAL sync thread syncs with the two other HDFS nodes
sequentially, the row lock would be held for at least 0.18 ms, which would
still allow a very high throughput per regionserver even if only one thread
is working and all other threads are blocked on the lock.

It appears that the bottleneck is then the HDFS disk flush, and consequently
the above-mentioned schemas are equivalent w.r.t. performance.

However, I have a question regarding the default HDFS policy of not flushing
to disk on every WAL sync. Aren't people in industry afraid of data loss,
however small the probability of it happening? Is anyone aware of any company
that does not use the HDFS default policy and flushes to disk on every WAL sync?

Thanks
Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064458.html



Sent from the HBase User mailing list archive at Nabble.com.


Re: hbase row locking

Posted by abhishek1015 <ab...@gmail.com>.
Sorry for the confusion. I meant that I am getting 6000 ops/sec throughput
overall using 4 machines; that is, 1500 ops/sec/regionserver on average.

I checked the ping response time between machines. It is approximately 0.09
ms.

Assuming that the WAL sync thread syncs with the two other HDFS nodes
sequentially, the row lock would be held for at least 0.18 ms, which would
still allow a very high throughput per regionserver even if only one thread
is working and all other threads are blocked on the lock.
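
Spelling the arithmetic out (assuming the two replica hops really are
sequential, which is my assumption here rather than a measured fact):

public class LockHoldEstimate {
  public static void main(String[] args) {
    double pingMs = 0.09;                       // measured ping between machines
    double lockHeldMs = 2 * pingMs;             // ~0.18 ms if the two replica hops were sequential
    double singleThreadBound = 1000.0 / lockHeldMs;
    // ~5555 ops/sec even if every write on a regionserver serialized on one row lock,
    // well above the ~1500 ops/sec/regionserver observed.
    System.out.printf("serialized upper bound: %.0f ops/sec%n", singleThreadBound);
  }
}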

It appears that the bottleneck is then the HDFS disk flush, and consequently
the above-mentioned schemas are equivalent w.r.t. performance.

However, I have a question regarding the default HDFS policy of not flushing
to disk on every WAL sync. Aren't people in industry afraid of data loss,
however small the probability of it happening? Is anyone aware of any company
that does not use the HDFS default policy and flushes to disk on every WAL sync?

Thanks
Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064458.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by lars hofhansl <la...@apache.org>.
That would mean that the WAL is "sync'ed" after every request.
Keep in mind, though, that by default HDFS does not actually sync edits to disk; it just enforces that the edits made it to the 3 replicas.
Still, 6000 ops/s seems fast for single-edit RPCs. You must have good (sub-ms) network latency: the data needs to travel from the client to the RegionServer, and from there it is pipelined to three DataNodes.
That seems possible only if your RTT is < 0.05 ms or so.


-- Lars



________________________________
 From: abhishek1015 <ab...@gmail.com>
To: user@hbase.apache.org 
Sent: Saturday, September 27, 2014 12:10 PM
Subject: Re: hbase row locking
 

Thanks for the clarification.

I am using HBase 0.98.5.

I observe around 6000 ops/sec throughput. I do flushCommits() after every
put request. Does this mean that a WAL sync is performed after every put
operation? I have 4 machines in my cluster, each with 8 cores and 8 GB of memory.

I am trying to figure out under what conditions schema-1 performs better than
schema-2, and vice versa.

Thanks
Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064437.html



Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by abhishek1015 <ab...@gmail.com>.
Thanks for the clarification.

I am using HBase 0.98.5.

I observe around 6000 ops/sec throughput. I do flushCommits() after every
put request. Does this mean that a WAL sync is performed after every put
operation? I have 4 machines in my cluster, each with 8 cores and 8 GB of memory.
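
(For contrast, the buffered alternative that I am not using would look roughly
like the sketch below; with flushCommits() after every put, none of this
client-side buffering applies. Names are illustrative.)

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BufferedWrites {
  public static void main(String[] args) throws Exception {
    HTable table = new HTable(HBaseConfiguration.create(), "orders_schema2");  // illustrative name
    table.setAutoFlush(false);                 // buffer Puts client-side instead of one RPC each
    table.setWriteBufferSize(2 * 1024 * 1024); // ship buffered Puts to the regionserver every ~2 MB
    for (int i = 0; i < 1000; i++) {
      Put put = new Put(Bytes.toBytes("1:" + i));
      put.add(Bytes.toBytes("cf"), Bytes.toBytes("order"), new byte[2000]);
      table.put(put);                          // may stay in the client-side buffer
    }
    table.flushCommits();                      // send whatever is still buffered
    table.close();
  }
}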

I am trying to figure out under what conditions schema-1 performs better than
schema-2, and vice versa.

Thanks
Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432p4064437.html
Sent from the HBase User mailing list archive at Nabble.com.

Re: hbase row locking

Posted by lars hofhansl <la...@apache.org>.
You didn't tell us which version of HBase.

HBase is pretty smart about how long it needs to hold locks.
For example, the flush to the WAL is done without the row lock held.

The row lock is only held to create the WAL edit and to add the edit to the memstore; then it is released.
After that we sync the WAL (this is the part that takes time). If that fails, we undo the changes to the memstore before we make the changes visible via MVCC.
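
Roughly, the ordering is the following (a toy sketch with made-up helper
names, not the actual HRegion code):

import java.util.concurrent.locks.ReentrantLock;

// Toy illustration of the ordering described above: the row lock covers only
// WAL-edit creation and the memstore insert; the WAL sync happens after the
// lock is released, and a failed sync rolls the memstore change back.
public class WritePathSketch {
  private final ReentrantLock rowLock = new ReentrantLock();

  void put(byte[] row, byte[] value) {
    Object walEdit = null;
    rowLock.lock();
    try {
      walEdit = buildWalEdit(row, value);   // create the WAL edit
      addToMemstore(row, value);            // add the edit to the memstore
    } finally {
      rowLock.unlock();                     // released before the slow part
    }
    boolean synced = syncWal(walEdit);      // sync the WAL without holding the row lock
    if (!synced) {
      rollbackMemstore(row, value);         // undo the memstore change on sync failure
    } else {
      makeVisibleViaMvcc();                 // only now do readers see the edit
    }
  }

  // Stubs standing in for the real work.
  private Object buildWalEdit(byte[] r, byte[] v) { return new Object(); }
  private void addToMemstore(byte[] r, byte[] v) { }
  private boolean syncWal(Object edit) { return true; }
  private void rollbackMemstore(byte[] r, byte[] v) { }
  private void makeVisibleViaMvcc() { }
}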

Curious what kind of throughput you _do_ see.

-- Lars



________________________________
 From: abhishek1015 <ab...@gmail.com>
To: user@hbase.apache.org 
Sent: Friday, September 26, 2014 9:55 PM
Subject: hbase row locking
 

Hello everyone,

I have been running some experiments to see the effect of locking overhead
in HBase. For this, I am comparing the throughput difference between these
two schemas.

Schema1:
rowkey-> <userid>
columnkey-> cf:<orderid>
value-> blob of 2000 characters

Schema2:
rowkey-> <userid>:<orderid>
columnkey-> cf:order
value-> blob of 2000 characters

Now, for generating data, I generate a userid uniformly between 1 and
max_num_user, then look up the last generated orderid for that user in a
hashmap and generate the next orderid. I have extended YCSB to generate this
workload.

I am running the experiment with max_num_user=10 and number of insert
operations = 1 million, so the average number of orders per user is 100k. I
was hoping to see decreased throughput for schema1 due to locking. However, I
am observing the same throughput in the two cases, and I am not sure how to
explain this. Given that I have 8 server threads on each region server and the
10 rows of schema1 are distributed across 4 nodes, it is highly likely that
there will be lock contention.

I checked that I am doing flushCommits() after every put operation. Am I
missing anything?

Thanks for your help.

Abhishek



--
View this message in context: http://apache-hbase.679495.n3.nabble.com/hbase-row-locking-tp4064432.html
Sent from the HBase User mailing list archive at Nabble.com.