Posted to user@hbase.apache.org by Michael Segel <mi...@hotmail.com> on 2010/03/17 05:25:46 UTC

Weird HBase Shell issue with count

Ok, 

Still trying to track down some issues.

I opened up an hbase shell and decided to use count to count the number of rows in a table.

As it was running, count was flying along until it hit 150,000, then stopped.
Just stood there, nothing.

I started to check the other nodes in the cloud to see what was happening. The load on the data nodes, which are also region servers, jumped up; one node went up to 2.71. Other nodes saw some jump, but again it doesn't make sense why the count suddenly died.

I'm going to check the logs, but has anyone seen something like this?

Thx

-Mike

 		 	   		  

RE: Weird HBase Shell issue with count

Posted by Michael Segel <mi...@hotmail.com>.

Patrick,

Thanks for the response.
I'm still trying to find time to dig through all of the logs and to find out what exactly caused the issue.

It could be that we're 'overloading' what our nodes can handle, and that having ZK on them pushed the machines to the breaking point.

We're still tuning the environment and when I find something, I'll post it.

-Mike

> Date: Wed, 17 Mar 2010 14:30:04 -0700
> From: phunt@apache.org
> To: hbase-user@hadoop.apache.org
> Subject: Re: Weird HBase Shell issue with count
> 
> This log message is actually a bad message:
> https://issues.apache.org/jira/browse/ZOOKEEPER-588
> 
> > 2010-03-16 23:40:17,459 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
> > java.nio.BufferUnderflowException
> 
> In 3.3.0 we've removed it. Prior to that someone had wrapped some bad 
> toString code with a try/catch rather than do it right, and then logged 
> a warning if the try/catch triggered. So this is just a bogus message; 
> it would have nothing to do with your real problem.
> 
> also note the messages that follow
> 
> > 2010-03-16 23:40:17,634 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: sessionid:0x276a6c04f00000 type:createSession cxid:0x0 zxid:0x8900000006 txntype:-10 n/a
> 
> this indicates the actual operation was successful (session creation)
> 
> 
> Still, there's a problem with ZK other than this issue right? If you can 
> provide the logs I'll take a look. Or if you find something else that's 
> hinky I'd be willing to provide feedback.
> 
> regarding this issue:
> 
>  > Zookeeper doesn't need _that_ much ;)
> 
> JD is right. There were some issues initially when ZK was integrated with 
> HBase, but most of that was due to GC in the ZK end-user client code, 
> not the ZK service/client itself per se. We worked with HBase to 
> address this and afaik it's pretty good now.
> 
> ZK doesn't need much to run. See this document for details: on a minimal 
> (single CPU) setup we can still handle 15k+ ops/second, way more than 
> HBase needs at the moment. http://bit.ly/4ekN8G Obviously if you do 
> something like run hadoop/hbase/zkserver on an oversubscribed "small" 
> EC2 node you're bound to see issues, but in general the ZK service 
> itself doesn't need much horsepower to run. See this for additional 
> detail on the types of things "not to do" that we've accumulated over 
> time: http://bit.ly/5WwS44 ;-)
> 
> Regards,
> 
> Patrick

Re: Weird HBase Shell issue with count

Posted by Patrick Hunt <ph...@apache.org>.
This log message is actually a bad message:
https://issues.apache.org/jira/browse/ZOOKEEPER-588

> 2010-03-16 23:40:17,459 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
> java.nio.BufferUnderflowException

In 3.3.0 we've removed it. Prior to that someone had wrapped some bad 
toString code with a try/catch rather than do it right, and then logged 
a warning if the try/catch triggered. So this is just a bogus message; 
it would have nothing to do with your real problem.

also note the messages that follow

> 2010-03-16 23:40:17,634 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: sessionid:0x276a6c04f00000 type:createSession cxid:0x0 zxid:0x8900000006 txntype:-10 n/a

this indicates the actual operation was successful (session creation)
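
When reading a ZooKeeper log like the one posted in this thread, it can help to hide the bogus warning and its stack trace so that the remaining entries stand out. A minimal Python sketch, assuming the log format shown in this thread; the function name and the sample lines are illustrative and not part of ZooKeeper:

```python
import re

# Text of the WARN entry that ZOOKEEPER-588 describes as harmless.
BOGUS_WARNING = "Ignoring exception during toString"

# Lines that belong to the stack trace following that WARN entry:
# the exception class line, or an indented "at ..." frame.
TRACE_LINE = re.compile(r"^(java\.nio\.BufferUnderflowException|\s+at\s)")


def drop_bogus_warnings(lines):
    """Return the log lines without the bogus WARN entries and their traces."""
    kept = []
    in_bogus_entry = False
    for line in lines:
        if BOGUS_WARNING in line:
            in_bogus_entry = True   # drop the WARN line itself
            continue
        if in_bogus_entry and TRACE_LINE.match(line):
            continue                # drop the trace that follows it
        in_bogus_entry = False
        kept.append(line)
    return kept


# Sample lines shortened from the log posted in this thread.
SAMPLE = [
    "2010-03-16 23:40:17,459 INFO org.apache.zookeeper.server.NIOServerCnxn: Finished init of 0x2274dd2909f0036 valid:true",
    "2010-03-16 23:40:17,459 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString",
    "java.nio.BufferUnderflowException",
    "        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)",
    "2010-03-16 23:40:17,634 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: sessionid:0x276a6c04f00000 type:createSession",
]

if __name__ == "__main__":
    for line in drop_bogus_warnings(SAMPLE):
        print(line)
```

The filter only hides noise. It does not diagnose anything, so the real cause still has to be found in the entries that remain.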


Still, there's a problem with ZK other than this issue right? If you can 
provide the logs I'll take a look. Or if you find something else that's 
hinky I'd be willing to provide feedback.

regarding this issue:

 > Zookeeper doesn't need _that_ much ;)

JD is right. There were some issues initially when ZK was integrated with 
HBase, but most of that was due to GC in the ZK end-user client code, 
not the ZK service/client itself per se. We worked with HBase to 
address this and afaik it's pretty good now.

ZK doesn't need much to run. See this document for details: on a minimal 
(single CPU) setup we can still handle 15k+ ops/second, way more than 
HBase needs at the moment. http://bit.ly/4ekN8G Obviously if you do 
something like run hadoop/hbase/zkserver on an oversubscribed "small" 
EC2 node you're bound to see issues, but in general the ZK service 
itself doesn't need much horsepower to run. See this for additional 
detail on the types of things "not to do" that we've accumulated over 
time: http://bit.ly/5WwS44 ;-)

Regards,

Patrick


Jean-Daniel Cryans wrote:
> Zookeeper doesn't need _that_ much ;)
> 
> You say you are losing your zk server... can we see the error? Pastebin?
> 
> Thx
> 
> J-D

RE: Weird HBase Shell issue with count

Posted by Michael Segel <mi...@hotmail.com>.
On my data nodes, I have hadoop, hbase and zk running. 
Not optimal and I just found out I have a couple of new machines that I can add to the cloud....

The interesting thing about ZK is that I saw a couple of your posts from last year. Something about ZK and RSs sharing the same drives. (which they are). So I have to think about moving the ZKs...

As to the error..

This is what I saw in the logs... Note I haven't gone through all of the logs; I've been a little busy and sleep deprived.
:-(

2010-03-16 23:40:16,568 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: sessionid:0x1276a6c01e10000 type:getChildren cxid:0x9 zxid:0xfffffffffffffffe txntype:unknown /hbase/rs
2010-03-16 23:40:17,458 INFO org.apache.zookeeper.server.NIOServerCnxn: Connected to /10.22.166.2:5222 lastZxid 545460847224
2010-03-16 23:40:17,458 INFO org.apache.zookeeper.server.NIOServerCnxn: Renewing session 0x2274dd2909f0036
2010-03-16 23:40:17,459 INFO org.apache.zookeeper.server.NIOServerCnxn: Finished init of 0x2274dd2909f0036 valid:true
2010-03-16 23:40:17,459 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
java.nio.BufferUnderflowException
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)
        at java.nio.ByteBuffer.get(ByteBuffer.java:675)
        at org.apache.zookeeper.server.Request.toString(Request.java:199)
        at java.lang.String.valueOf(String.java:2826)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.zookeeper.server.quorum.CommitProcessor.processRequest(CommitProcessor.java:169)
        at org.apache.zookeeper.server.quorum.FollowerRequestProcessor.run(FollowerRequestProcessor.java:68)
2010-03-16 23:40:17,459 DEBUG org.apache.zookeeper.server.quorum.CommitProcessor: Processing request:: sessionid:0x2274dd2909f0036 type:setWatches cxid:0xfffffffffffffff8 zxid:0xfffffffffffffffe txntype:unknown n/a
2010-03-16 23:40:17,460 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
java.nio.BufferUnderflowException
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)
        at java.nio.ByteBuffer.get(ByteBuffer.java:675)
        at org.apache.zookeeper.server.Request.toString(Request.java:199)
        at java.lang.String.valueOf(String.java:2826)
        at java.lang.StringBuilder.append(StringBuilder.java:115)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:74)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2010-03-16 23:40:17,460 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: sessionid:0x2274dd2909f0036 type:setWatches cxid:0xfffffffffffffff8 zxid:0xfffffffffffffffe txntype:unknown n/a
2010-03-16 23:40:17,460 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
java.nio.BufferUnderflowException
        at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:127)
        at java.nio.ByteBuffer.get(ByteBuffer.java:675)
        at org.apache.zookeeper.server.Request.toString(Request.java:199)
        at org.apache.log4j.or.DefaultRenderer.doRender(DefaultRenderer.java:36)
        at org.apache.log4j.or.RendererMap.findAndRender(RendererMap.java:80)
        at org.apache.log4j.spi.LoggingEvent.getRenderedMessage(LoggingEvent.java:362)
        at org.apache.log4j.helpers.PatternParser$BasicPatternConverter.convert(PatternParser.java:403)
        at org.apache.log4j.helpers.PatternConverter.format(PatternConverter.java:65)
        at org.apache.log4j.PatternLayout.format(PatternLayout.java:502)
        at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:302)
        at org.apache.log4j.DailyRollingFileAppender.subAppend(DailyRollingFileAppender.java:359)
        at org.apache.log4j.WriterAppender.append(WriterAppender.java:160)
        at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
        at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
        at org.apache.log4j.Category.callAppenders(Category.java:206)
        at org.apache.log4j.Category.forcedLog(Category.java:391)
        at org.apache.log4j.Category.debug(Category.java:260)
        at org.apache.zookeeper.server.FinalRequestProcessor.processRequest(FinalRequestProcessor.java:147)
        at org.apache.zookeeper.server.quorum.CommitProcessor.run(CommitProcessor.java:73)
2010-03-16 23:40:17,460 WARN org.apache.zookeeper.server.Request: Ignoring exception during toString
sessionid:0x2274dd2909f0036 type:setWatches cxid:0xfffffffffffffff8 zxid:0xfffffffffffffffe txntype:unknown n/a
2010-03-16 23:40:17,634 DEBUG org.apache.zookeeper.server.quorum.CommitProcessor: Committing request:: sessionid:0x276a6c04f00000 type:createSession cxid:0x0 zxid:0x8900000006 txntype:-10 n/a
2010-03-16 23:40:17,634 DEBUG org.apache.zookeeper.server.FinalRequestProcessor: Processing request:: sessionid:0x276a6c04f00000 type:createSession cxid:0x0 zxid:0x8900000006 txntype:-10 n/a


> Date: Wed, 17 Mar 2010 09:34:48 -0700
> Subject: Re: Weird HBase Shell issue with count
> From: jdcryans@apache.org
> To: hbase-user@hadoop.apache.org
> 
> Zookeeper doesn't need _that_ much ;)
> 
> You say you are losing your zk server... can we see the error? Pastebin?
> 
> Thx
> 
> J-D

Re: Weird HBase Shell issue with count

Posted by Jean-Daniel Cryans <jd...@apache.org>.
Zookeeper doesn't need _that_ much ;)

You say you are losing your zk server... can we see the error? Pastebin?

Thx

J-D

On Tue, Mar 16, 2010 at 11:48 PM, Michael Segel
<mi...@hotmail.com> wrote:
>
> Unfortunately I can't up the ulimit easily. :-( I'll have to get an admin to do that.
>
> I did update the xceivers and set it to 2048 based on something I saw.
> But I'm losing my zookeeper on the node. Getting an IO error.
> I had the handler count high at 50 but reset it back down to 25 (default value)
>
> From what I've read, I definitely will move the zookeeper nodes when I can find additional machines to add to the cluster.
>
> Again any input welcome.
>
> Thx
>
> -Mike

RE: Weird HBase Shell issue with count

Posted by Michael Segel <mi...@hotmail.com>.
Unfortunately I can't up the ulimit easily. :-( I'll have to get an admin to do that.

I did update the xceivers and set it to 2048 based on something I saw.
But I'm losing my zookeeper on the node. Getting an IO error. 
I had the handler count high at 50 but reset it back down to 25 (default value)

From what I've read, I definitely will move the zookeeper nodes when I can find additional machines to add to the cluster.

Again any input welcome.

Thx

-Mike




> Date: Tue, 16 Mar 2010 20:30:27 -0800
> Subject: Re: Weird HBase Shell issue with count
> From: stack@duboce.net
> To: hbase-user@hadoop.apache.org
> 
> Oh, you've read the 'getting started' and the hbase requirements where
> it specifies upping ulimit and xceivers in your cluster?
> St.Ack

RE: Weird HBase Shell issue with count

Posted by Michael Segel <mi...@hotmail.com>.

LOL...
It sticks in the same place.
I didn't do the initial install, so I have to check to see what was changed in the parameters.
I did the first cluster and didn't have this issue.

This could explain some of the other issues.
I'll go back in and check it out.

Interestingly enough, after a while there look to be some cascading 'failures': the load spikes and then I lose zookeeper.

-Mike

> Date: Tue, 16 Mar 2010 20:30:27 -0800
> Subject: Re: Weird HBase Shell issue with count
> From: stack@duboce.net
> To: hbase-user@hadoop.apache.org
> 
> Oh, you've read the 'getting started' and the hbase requirements where
> it specifies upping ulimit and xceivers in your cluster?
> St.Ack

Re: Weird HBase Shell issue with count

Posted by Stack <st...@duboce.net>.
Oh, you've read the 'getting started' and the hbase requirements where
it specifies upping ulimit and xceivers in your cluster?
St.Ack
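
The ulimit half of that requirement can be checked from the account that runs the daemons. A minimal Python sketch, assuming a Unix-like host; the threshold constant and the function names are made up for illustration, and the thread itself does not state a target value:

```python
import resource

# Hypothetical threshold, for illustration only; the thread gives no number.
EXAMPLE_MINIMUM = 32768


def open_file_limits():
    """Return the (soft, hard) open-file-descriptor limits of this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_low(soft_limit, minimum=EXAMPLE_MINIMUM):
    """True if the soft limit is finite and below the given minimum."""
    if soft_limit == resource.RLIM_INFINITY:
        return False
    return soft_limit < minimum


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open files: soft={soft} hard={hard}")
    if limit_is_low(soft):
        print("soft limit is below the example threshold; consider raising ulimit -n")
```

The limits are per process and inherited from the launching shell, so the script only reflects what the daemons see when it runs as the same user and in the same kind of session that starts them. The xceivers setting lives in the HDFS configuration and is not covered here.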

On Tue, Mar 16, 2010 at 8:29 PM, Stack <st...@duboce.net> wrote:
> Is DEBUG enabled in the log4j.properties that the client can see?  If
> not, enable it.  If so, can you see the regions loading as the count
> progresses?  Which region does it stop at?  Can you try to do a get on
> its startkey?  Does it work?
>
> St.Ack

Re: Weird HBase Shell issue with count

Posted by Stack <st...@duboce.net>.
Is DEBUG enabled in the log4j.properties that the client can see?  If
not, enable it.  If so, can you see the regions loading as the count
progresses?  Which region does it stop at?  Can you try to do a get on
its startkey?  Does it work?

St.Ack

On Tue, Mar 16, 2010 at 8:25 PM, Michael Segel
<mi...@hotmail.com> wrote:
>
> Ok,
>
> Still trying to track down some issues.
>
> I opened up an hbase shell and decided to use count to count the number of rows in a table.
>
> As it was running, count was flying along until it hit 150,000, then stopped.
> Just stood there, nothing.
>
> I started to check the other nodes in the cloud to see what was happening. The load on the data nodes, which are also region servers, jumped up; one node went up to 2.71. Other nodes saw some jump, but again it doesn't make sense why the count suddenly died.
>
> I'm going to check the logs, but has anyone seen something like this?
>
> Thx
>
> -Mike