Posted to dev@hbase.apache.org by Vladimir Rodionov <vl...@gmail.com> on 2013/11/27 05:00:34 UTC

Next big thing for HBase

Global optimization and performance tuning:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=19&ved=0CG8QFjAIOAo&url=http%3A%2F%2Fwww.mapr.com%2FDownload-document%2F52-MapR-M7-Performance-Benchmark&ei=QGuVUr-cA6ewjAL_94DoCQ&usg=AFQjCNH2Brlp5n2rIAarEbj39c_X_lnvDg&sig2=bLTKxbspEgsRN3bJXUnspQ&bvm=bv.57155469,d.cGE&cad=rja

Some numbers from this report do not look right for HBase. I do not
believe that 5 RS on Fusion drives score only 1605 reads per sec per node.

Re: Next big thing for HBase

Posted by lars hofhansl <la...@apache.org>.
:)

Yeah, sorry, wasn't clear.
Your point is well taken, though: HBase should come configured and tuned out of the box for common workloads.

-- Lars



________________________________
 From: Vladimir Rodionov <vl...@gmail.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org>; lars hofhansl <la...@apache.org> 
Sent: Wednesday, November 27, 2013 12:01 AM
Subject: Re: Next big thing for HBase
 

Oh, I got it. "Next big thing for HBase" is not MapR M7, but global
optimization and tuning of HBase itself.




Re: Next big thing for HBase

Posted by lars hofhansl <la...@apache.org>.
Thanks Varun.


seekTo is the worst case, though: it is not representative of scanning, but it would be representative of gets.

Each call needs to look up the right block in the index again and then seek from the beginning of the block found.
reseek should do much better.
500 iops across 4 HDD seems reasonable to me :)
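
The difference between the two calls can be illustrated with a toy model. This is only a sketch and not the HBase HFile API: the class ToyBlockScanner, its int keys, the fixed block size, and the two counters are all made up for illustration, and the toy reseek never consults the index at all, which is a simplification. It shows seekTo repeating the index lookup and restarting at the head of the block on every call, while reseek just moves forward from the current position.

```java
import java.util.Arrays;

/** Toy model of a block-indexed sorted file. This is NOT the HBase HFile API. */
final class ToyBlockScanner {
    private final int[] keys;       // all keys in the file, sorted ascending
    private final int blockSize;    // keys per block (hypothetical value)
    private final int[] firstKeys;  // block index: first key of every block
    private int pos = 0;            // current scanner position

    long indexLookups = 0;          // how often the block index was consulted
    long keysTouched = 0;           // how many keys were stepped over

    ToyBlockScanner(int[] sortedKeys, int blockSize) {
        this.keys = sortedKeys;
        this.blockSize = blockSize;
        int blocks = (sortedKeys.length + blockSize - 1) / blockSize;
        this.firstKeys = new int[blocks];
        for (int b = 0; b < blocks; b++) {
            firstKeys[b] = sortedKeys[b * blockSize];
        }
    }

    /** Looks the block up in the index, then scans from the start of that block. */
    boolean seekTo(int key) {
        indexLookups++;
        int idx = Arrays.binarySearch(firstKeys, key);
        // Not an exact hit: take the block just before the insertion point.
        int block = idx >= 0 ? idx : Math.max(0, -idx - 2);
        pos = block * blockSize;
        return advanceTo(key);
    }

    /** Only moves forward from the current position; no index lookup in this toy. */
    boolean reseek(int key) {
        return advanceTo(key);
    }

    private boolean advanceTo(int key) {
        while (pos < keys.length && keys[pos] < key) {
            pos++;
            keysTouched++;
        }
        return pos < keys.length && keys[pos] == key;
    }

    public static void main(String[] args) {
        int[] keys = new int[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = i;
        }

        // Ten nearby keys, fetched with a fresh seekTo each time.
        ToyBlockScanner viaSeekTo = new ToyBlockScanner(keys, 100);
        for (int k = 90; k < 100; k++) {
            viaSeekTo.seekTo(k);
        }

        // The same ten keys: one seekTo, then reseek for the rest.
        ToyBlockScanner viaReseek = new ToyBlockScanner(keys, 100);
        viaReseek.seekTo(90);
        for (int k = 91; k < 100; k++) {
            viaReseek.reseek(k);
        }

        System.out.println("seekTo only:     " + viaSeekTo.indexLookups
                + " index lookups, " + viaSeekTo.keysTouched + " keys touched");
        System.out.println("seekTo + reseek: " + viaReseek.indexLookups
                + " index lookups, " + viaReseek.keysTouched + " keys touched");
    }
}
```

In this toy the gap comes entirely from restarting at the block head; in a real store each restart may also mean another block read, which is why a seekTo-per-key test looks like the worst case.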


-- Lars



________________________________
 From: Varun Sharma <va...@pinterest.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org> 
Sent: Wednesday, November 27, 2013 9:55 AM
Subject: Re: Next big thing for HBase
 

I think I sent that too early - I could buy the results for the HDD
comparison but I don't buy the Fusion I/O comparison at all. I have been
able to push it much, much more on SSD(s) on EC2. It could well be that they
are not maxing out the region servers.




Re: Next big thing for HBase

Posted by Varun Sharma <va...@pinterest.com>.
I think I sent that too early - I could buy the results for the HDD
comparison but I don't buy the Fusion I/O comparison at all. I have been
able to push it much, much more on SSD(s) on EC2. It could well be that they
are not maxing out the region servers.


On Wed, Nov 27, 2013 at 9:53 AM, Varun Sharma <va...@pinterest.com> wrote:

> I could buy these results for a totally disk-bound application as far as
> reads go. I was running some experiments where I have HFiles on disk. The
> memory:data ratio is 1:2, so half the data can fit in memory. Then I run
> "new HFileScanner()" and then scanner.seekTo("someKeyValue"). On a 4 HDD
> system, I can get ~400 reads per second. The hard drives end up running
> quite hot, and the max I can push this thing to is 500 reads per second.
> Note that these are raw HFile seeks: no HBase or HDFS layers are present. I
> suspect HBase just issues way more IOPS than it needs to.
>
> Varun

Re: Next big thing for HBase

Posted by Varun Sharma <va...@pinterest.com>.
I could buy these results for a totally disk-bound application as far as
reads go. I was running some experiments where I have HFiles on disk. The
memory:data ratio is 1:2, so half the data can fit in memory. Then I run
"new HFileScanner()" and then scanner.seekTo("someKeyValue"). On a 4 HDD
system, I can get ~400 reads per second. The hard drives end up running
quite hot, and the max I can push this thing to is 500 reads per second.
Note that these are raw HFile seeks: no HBase or HDFS layers are present. I
suspect HBase just issues way more IOPS than it needs to.
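
As a rough sanity check on those numbers, here is a back-of-envelope sketch. Everything beyond the figures quoted above is an assumption: reads spread evenly over the 4 drives, each drive serves one read at a time, and the 50% cache hit ratio in the last step only holds if access is uniform over the data (the test setup does not say). The class and method names are hypothetical.

```java
/** Back-of-envelope arithmetic for a disk-bound random-read test (illustrative only). */
final class DiskBoundEstimate {

    /** Reads per second each drive serves, assuming load spreads evenly. */
    static double perDriveRate(double totalReadsPerSec, int drives) {
        return totalReadsPerSec / drives;
    }

    /** Average milliseconds per read on one drive, assuming one read at a time. */
    static double impliedServiceTimeMs(double readsPerSecPerDrive) {
        return 1000.0 / readsPerSecPerDrive;
    }

    /** Reads that miss memory and reach disk, for an ASSUMED cache hit ratio. */
    static double diskReadsPerSec(double totalReadsPerSec, double cacheHitRatio) {
        return totalReadsPerSec * (1.0 - cacheHitRatio);
    }

    public static void main(String[] args) {
        int drives = 4;
        for (double total : new double[] {400.0, 500.0}) {
            double perDrive = perDriveRate(total, drives);
            System.out.println(total + " reads/s over " + drives + " drives = "
                    + perDrive + " per drive, about "
                    + impliedServiceTimeMs(perDrive) + " ms per read");
        }

        // Assumption: uniform access, so roughly half the reads are served from memory.
        double onDisk = diskReadsPerSec(500.0, 0.5);
        double perDrive = perDriveRate(onDisk, drives);
        System.out.println("with a 50% hit ratio: " + onDisk + " disk reads/s = "
                + perDrive + " per drive, about "
                + impliedServiceTimeMs(perDrive) + " ms per read");
    }
}
```

Under these assumptions the measured rates work out to roughly 8 to 16 ms per disk read, which is in the range usually quoted for a random access on a spinning disk, so the HDD numbers look disk-bound rather than software-bound.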

Varun


On Wed, Nov 27, 2013 at 12:01 AM, Vladimir Rodionov
<vl...@gmail.com> wrote:

> Oh, I got it. "Next big thing for HBase" is not MapR M7, but global
> optimization and tuning of HBase itself.

Re: Next big thing for HBase

Posted by Vladimir Rodionov <vl...@gmail.com>.
Oh, I got it. "Next big thing for HBase" is not MapR M7, but global
optimization and tuning of HBase itself.


On Tue, Nov 26, 2013 at 11:56 PM, Vladimir Rodionov
<vl...@gmail.com> wrote:

> Why do you think I got excited? I do not work for MapR. MapR has posted
> benchmark results and some numbers for HBase look quite low. I thought
> maybe the community would be interested in these results.

Re: Next big thing for HBase

Posted by Vladimir Rodionov <vl...@gmail.com>.
Why do you think I got excited? I do not work for MapR. MapR has posted
benchmark results and some numbers for HBase look quite low. I thought
maybe the community would be interested in these results.


On Tue, Nov 26, 2013 at 10:04 PM, lars hofhansl <la...@apache.org> wrote:

> Excuse me if I do not get too excited about a report published by MapR that
> comes to the conclusion that MapR's M7 is faster than "other distribution".
>
> -- Lars

Re: Next big thing for HBase

Posted by lars hofhansl <la...@apache.org>.
Excuse me if I do not get too excited about a report published by MapR that comes to the conclusion that MapR's M7 is faster than "other distribution".

-- Lars



________________________________
 From: Vladimir Rodionov <vl...@gmail.com>
To: "dev@hbase.apache.org" <de...@hbase.apache.org> 
Sent: Tuesday, November 26, 2013 8:00 PM
Subject: Next big thing for HBase
 

Global optimization and performance tuning:
http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=19&ved=0CG8QFjAIOAo&url=http%3A%2F%2Fwww.mapr.com%2FDownload-document%2F52-MapR-M7-Performance-Benchmark&ei=QGuVUr-cA6ewjAL_94DoCQ&usg=AFQjCNH2Brlp5n2rIAarEbj39c_X_lnvDg&sig2=bLTKxbspEgsRN3bJXUnspQ&bvm=bv.57155469,d.cGE&cad=rja

Some numbers from this report do not look right for HBase. I do not
believe that 5 RS on Fusion drives score only 1605 reads per sec per node.