Posted to user@hbase.apache.org by Neelesh <ne...@gmail.com> on 2016/12/01 07:59:36 UTC

Re: Hbase on HDFS versus Cassandra

Ted,
we use HDP 2.3.4 (HBase 1.1.2, Phoenix 4.4, but with a lot of backports
from later versions)

The row key of the data table is <customer id: 11 bytes><user id: 36 bytes,
though it really is a right-padded long due to historic data><event type:
long><timestamp><random long>

The two global indexes are <customer id: 11 bytes><event type:
long><timestamp><user id> and <customer id: 11 bytes><campaign id:
long><timestamp>
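
In Phoenix DDL terms, the layout looks roughly like this (a sketch only —
the table and column names here are made up, since I can't share the real
schema, and the ZK quorum is a placeholder):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class SchemaSketch {
      public static void main(String[] args) throws Exception {
        // Hypothetical ZK quorum; substitute your own.
        try (Connection conn =
                 DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
             Statement stmt = conn.createStatement()) {
          // Data table with a composite primary key mirroring the layout above.
          stmt.execute(
              "CREATE TABLE IF NOT EXISTS events ("
            + " customer_id CHAR(11) NOT NULL,"
            + " user_id     CHAR(36) NOT NULL,"   // really a right-padded long
            + " event_type  BIGINT NOT NULL,"
            + " ts          TIMESTAMP NOT NULL,"
            + " salt        BIGINT NOT NULL,"     // the trailing random long
            + " campaign_id BIGINT,"
            + " CONSTRAINT pk PRIMARY KEY (customer_id, user_id, event_type, ts, salt))");
          // The two global secondary indexes described above.
          stmt.execute("CREATE INDEX IF NOT EXISTS idx_by_event ON events"
              + " (customer_id, event_type, ts, user_id)");
          stmt.execute("CREATE INDEX IF NOT EXISTS idx_by_campaign ON events"
              + " (customer_id, campaign_id, ts)");
        }
      }
    }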
Around 100B rows in the main table. The main issues we see are:

# Sudden spikes in queueSize, going all the way to the 1 GB limit and
staying there, without any correlated client traffic
# Boatloads of errors like this one:

2016-11-30 11:28:54,907 INFO
 [RW.default.writeRpcServer.handler=43,queue=9,port=16020]
util.IndexManagementUtil: Rethrowing
org.apache.hadoop.hbase.DoNotRetryIOException: ERROR 2008 (INT10): ERROR
2008 (INT10): Unable to find cached index metadata.  key=120521194876100862
region=<region-key>. Index update failed

We have cross-datacenter WAL replication enabled.
We saw PHOENIX-1718 (the INT10 error means the region server's cached index
metadata expired or was evicted before the index update could be applied)
and changed all the recommended timeouts to 1 hour. Our HBase version has
HBASE-11705. We also discovered that the queue size is global (across the
general/replication/priority queues): if it reaches the 1 GB limit, calls
to all queues are dropped. That was interesting because even though the
replication handlers have a different queue, the size is counted globally,
affecting the others. Please correct me on this. I hope I'm wrong on this
one :)
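
For anyone hitting the same INT10 errors, this is roughly the shape of the
change (a sketch; the exact properties and values come from the
PHOENIX-1718 discussion, and the cache TTLs are server-side settings for
hbase-site.xml, not client properties — shown here only as comments):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.util.Properties;

    public class TimeoutSketch {
      public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // Client-side: let Phoenix queries and HBase RPCs run up to 1 hour.
        props.setProperty("phoenix.query.timeoutMs", "3600000");
        props.setProperty("hbase.rpc.timeout", "3600000");
        // Server-side (hbase-site.xml on every region server; listed for
        // reference only): keep the index metadata cache alive long enough
        // that slow index updates still find their entry.
        //   phoenix.coprocessor.maxServerCacheTimeToLiveMs   = 3600000
        //   phoenix.coprocessor.maxMetaDataCacheTimeToLiveMs = 3600000
        try (Connection conn =
                 DriverManager.getConnection("jdbc:phoenix:zk-host:2181", props)) {
          // ... upserts against the data table go here ...
        }
      }
    }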

Our challenge has been understanding what HBase is doing under various
scenarios. We monitor call queue lengths, sizes, and latencies as our
primary alerting mechanism for telling us something is going on with HBase.
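
As a minimal sketch of that kind of probe (assuming the region server's
default info port of 16030 and the queue metrics HBase 1.x publishes on its
/jmx endpoint — verify the bean and metric names against your version):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class QueueSizeProbe {
      public static void main(String[] args) throws Exception {
        // The region server serves its metrics as JSON on the info port;
        // queueSize (total bytes of queued calls) lives under the IPC bean.
        URL url = new URL("http://rs-host:16030/jmx"
            + "?qry=Hadoop:service=HBase,name=RegionServer,sub=IPC");
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(conn.getInputStream()))) {
          String line;
          while ((line = in.readLine()) != null) {
            // Crude scan; a real monitor would parse the JSON properly.
            if (line.contains("\"queueSize\"")
                || line.contains("\"numCallsInGeneralQueue\"")
                || line.contains("\"numCallsInReplicationQueue\"")) {
              System.out.println(line.trim());
            }
          }
        }
      }
    }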

Thanks!
-neelesh

On Wed, Nov 30, 2016 at 1:15 PM, Ted Yu <te...@yahoo.com.invalid> wrote:

> Neelesh: Can you share more details about the sluggish cluster performance
> (such as version of hbase / phoenix, your schema, region server log
> snippets, stack traces, etc.)?
> As hbase / phoenix evolve, I hope the performance keeps getting better for
> your use case.
> Cheers
>
>     On Wednesday, November 30, 2016 10:07 AM, Neelesh <ne...@gmail.com>
> wrote:
>
>
>  We use both, in different capacities. Cassandra is an x-DC archive store
> with mostly batch writes and occasional key-based reads. HBase is for
> real-time event ingestion. Our experience so far on HBase + Phoenix is that
> when it works, it is fast and scales like crazy. But if you ever hit a snag
> around data patterns, you will have a VERY hard time figuring out what's
> going on. A combination of global Phoenix indexes and heavy writes leaves
> an entire cluster sluggish, if there is a hint of hotspotting.
>
> On the other hand, we had a big struggle with Cassandra when a node
> recovery was in progress, what with needing twice the disk space during
> recovery, etc. Other than that, it has been quiet.
> But the access patterns are not the same.
>
> I think the old rule still holds. If you are already on Hadoop, or
> interested in using/analysing the data in several different ways, go with
> HBase. If you just need a big data store with a few predefined query
> patterns, Cassandra is good.
>
> Of course, I'm biased towards HBase.
>
> On Nov 30, 2016 7:02 AM, "Mich Talebzadeh" <mi...@gmail.com>
> wrote:
>
> > Hi Guys,
> >
> > I have used HBase on HDFS reasonably well. I am happy to stick with it,
> > and more so with Hive/Phoenix views and Phoenix indexes where I can.
> >
> > I have a bunch of users who are now vocal about the use case for
> > Cassandra and whether it can do a better job than HBase.
> >
> > Unfortunately I am no expert on Cassandra. However, some use case fit
> would
> > be very valuable.
> >
> > Thanks
> >
> > Dr Mich Talebzadeh
> >
> >
> >
> > LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
> >
> >
> >
> > http://talebzadehmich.wordpress.com
> >
> >
> > Disclaimer: Use it at your own risk. Any and all responsibility for any
> > loss, damage or destruction of data or any other property which may
> > arise from relying on this email's technical content is explicitly
> > disclaimed. The author will in no case be liable for any monetary
> > damages arising from such loss, damage or destruction.
> >

Re: Hbase on HDFS versus Cassandra

Posted by Jerry He <je...@gmail.com>.
> We also discovered that the queue size is global (across the
> general/replication/priority queues): if it reaches the 1 GB limit, calls
> to all queues are dropped. That was interesting because even though the
> replication handlers have a different queue, the size is counted globally,
> affecting the others. Please correct me on this. I hope I'm wrong on this
> one :)

You are right.  The size is checked before the call is dispatched to the
different queues.
It sounds like something that can be improved.
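
Conceptually, something like the following (an illustrative sketch, not the
actual RpcServer code): a single shared byte counter, bounded by
hbase.ipc.server.max.callqueue.size (1 GB by default), is consulted before
a call is handed to any of the per-type queues.

    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.atomic.AtomicLong;

    // Illustrative only: models how one global size counter can gate
    // admission to several otherwise-independent call queues.
    public class GlobalQueueSizeSketch {
      // Mirrors hbase.ipc.server.max.callqueue.size (default ~1 GB).
      static final long MAX_QUEUE_SIZE = 1024L * 1024 * 1024;

      final AtomicLong callQueueSizeInBytes = new AtomicLong();
      final BlockingQueue<byte[]> generalQueue = new LinkedBlockingQueue<>();
      final BlockingQueue<byte[]> replicationQueue = new LinkedBlockingQueue<>();
      final BlockingQueue<byte[]> priorityQueue = new LinkedBlockingQueue<>();

      /** Returns false (the caller sees a "call queue is full" style error)
       *  when the GLOBAL byte count is exhausted, even if the target queue
       *  itself is empty. */
      boolean dispatch(byte[] call, BlockingQueue<byte[]> target) {
        if (callQueueSizeInBytes.get() >= MAX_QUEUE_SIZE) {
          return false; // rejected before reaching any individual queue
        }
        callQueueSizeInBytes.addAndGet(call.length);
        return target.offer(call);
      }

      /** A handler thread decrements the shared counter when it takes a call. */
      byte[] take(BlockingQueue<byte[]> source) throws InterruptedException {
        byte[] call = source.take();
        callQueueSizeInBytes.addAndGet(-call.length);
        return call;
      }
    }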


Jerry



Re: Hbase on HDFS versus Cassandra

Posted by Neelesh <ne...@gmail.com>.
Yes, PHOENIX-1718 has about the same information

On Dec 1, 2016 1:12 AM, "Ted Yu" <te...@yahoo.com> wrote:

> w.r.t. the "Unable to find cached index metadata" error, have you seen
> this?
>
> http://search-hadoop.com/m/Phoenix/9UY0h2YBSOhgbflB1?subj=Re+Global+Secondary+Index+ERROR+2008+INT10+Unable+to+find+cached+index+metadata+PHOENIX+1718+
>
> Cheers

Re: Hbase on HDFS versus Cassandra

Posted by Ted Yu <te...@yahoo.com.INVALID>.
w.r.t. the "Unable to find cached index metadata" error, have you seen this?
http://search-hadoop.com/m/Phoenix/9UY0h2YBSOhgbflB1?subj=Re+Global+Secondary+Index+ERROR+2008+INT10+Unable+to+find+cached+index+metadata+PHOENIX+1718+

Cheers 
