Posted to dev@hbase.apache.org by Sergey Semenoff <bo...@gmail.com> on 2021/01/07 19:03:54 UTC

Re: Improvement - speed up HBase to 2-3 times

Hello, guys!

Sorry to bother you with such a small thing as increasing performance up to 3
times)) I am not sure how long it usually takes for a PR to be considered in
open source projects; if I am being too persistent, please forgive me.

Maybe someone will have time to take a look at the proposed improvement:
https://github.com/apache/hbase/pull/1257

Thanks)

On Wed, 16 Sep 2020 at 12:41, Sergey Semenoff <bo...@gmail.com> wrote:

> Hi *!
>
> I think everybody who works with real Big Data knows that performance is
> very important.
>
> Unfortunately, our lovely HBase is approximately 2 times slower than
> Cassandra when reading huge amounts of data.
>
>
> For example, this is the Cassandra performance test run from 2 hosts
> (client side):
>
> Host1 - Throughput(ops/sec), 231 021
>
> Host2 - Throughput(ops/sec), 224 691
>
>
>
> In total, ~450 000 ops/sec.
>
> Under the same conditions, HBase shows only 210 000.
>
>
>
> Maybe this is one of the reasons why Cassandra is more popular (see
> https://db-engines.com/en/ranking/wide+column+store).
>
> I’ve made an improvement which can make HBase up to 2-3 times faster (it
> depends on many factors, and sometimes it is even faster).
>
> With the improvement, HBase speeds up to 430 000 ops/sec.
>
> See the picture in the attachment.
>
>
>
> If you are interested in getting this improvement into a release, you can
> help attract developers' attention here:
> https://issues.apache.org/jira/browse/HBASE-23887
>
> Add a comment there with your opinion and vote if you think it could be
> useful for your work.
>
> I believe a discussion of this approach can make HBase more useful and
> popular.
>
>
>
> Thanks for your attention)
>
> With best regards,
>
> Pustota
>
>

Re: Improvement - speed up HBase to 2-3 times

Posted by Viraj Jasani <vj...@apache.org>.

I am planning to commit PR #2934 48 hours from now. If anyone would like to
take a look in the meantime, please let me know on this thread or on Jira,
and I will wait until the review is complete.

FYI, previous reviews took place on the old PR #1257 and on the Jira
HBASE-23887 itself.

Thanks


On Mon, 8 Feb 2021 at 11:09 PM, Viraj Jasani <vj...@apache.org> wrote:

> Thanks for working through the feedback provided on the HBASE-23887 Jira
> and creating this new PR [1] with a new L1 cache: AdaptiveLRU. Really
> appreciate your efforts!!
> I just had a high-level look today; the structure looks great, and I will
> spend some time (day after) tomorrow on a detailed review.
>
> Requesting other reviewers to take a look at this nice new PR [1].
> Thanks
>
> 1. https://github.com/apache/hbase/pull/2934

Re: Improvement - speed up HBase to 2-3 times

Posted by Viraj Jasani <vj...@apache.org>.

Thanks for working through the feedback provided on the HBASE-23887 Jira and
creating this new PR [1] with a new L1 cache: AdaptiveLRU. Really appreciate
your efforts!!
I just had a high-level look today; the structure looks great, and I will
spend some time (day after) tomorrow on a detailed review.

Requesting other reviewers to take a look at this nice new PR [1].
Thanks

1. https://github.com/apache/hbase/pull/2934
