Posted to user@hbase.apache.org by Neutron sharc <ne...@gmail.com> on 2015/06/17 19:41:56 UTC

hbase 0.94.26 hangs when a datanode is suspended via SIGSTOP

Hi community,

I am testing the time to recovery of HBase 0.94.26. It tolerates region
server failures very smoothly. However, the client does not seem to
recover when a datanode is suspended (with kill -SIGSTOP). When the DN is
paused, I expect the HBase client to time out and pick the next DN that
holds the data. Instead, my HBase client stays stuck until I resume the DN
with SIGCONT.

Are there any special parameters I should tune regarding this scenario?
Thanks!



-Neutron
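
The behaviour expected in the question (time out on an unresponsive node, then move on to the next one that holds the data) can be illustrated with a small sketch. This is not HBase or HDFS client code; it is a generic socket example in Python, and the names read_with_failover and start_listener are hypothetical. A process stopped with SIGSTOP is approximated by a listening socket that never calls accept(): the kernel still completes the TCP handshake, so the connect succeeds and only the read times out.

```python
import socket
import threading


def read_with_failover(replicas, timeout_s=1.0, request=b"ping\n"):
    """Try each (host, port) replica in order and skip any that fails or times out.

    Returns ((host, port), response_bytes) from the first replica that answers,
    or raises RuntimeError if none of them do.
    """
    errors = []
    for host, port in replicas:
        try:
            # The timeout applies to the connect and to each later send/recv.
            with socket.create_connection((host, port), timeout=timeout_s) as conn:
                conn.sendall(request)
                data = conn.recv(1024)
                if data:
                    return (host, port), data
                errors.append(((host, port), "closed without data"))
        except OSError as exc:  # socket.timeout is a subclass of OSError
            errors.append(((host, port), repr(exc)))
    raise RuntimeError(f"no replica answered: {errors}")


def start_listener(respond):
    """Start a local TCP listener for the demo (hypothetical helper).

    respond=False never accepts connections, which stands in for a
    suspended process: connects succeed but no reply ever arrives.
    """
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    srv.listen(5)
    if respond:
        def serve():
            while True:
                try:
                    conn, _ = srv.accept()
                except OSError:
                    return  # listener was closed
                with conn:
                    conn.recv(1024)
                    conn.sendall(b"pong\n")
        threading.Thread(target=serve, daemon=True).start()
    return srv, srv.getsockname()


if __name__ == "__main__":
    hung_srv, hung_addr = start_listener(respond=False)
    good_srv, good_addr = start_listener(respond=True)
    replica, data = read_with_failover([hung_addr, good_addr], timeout_s=0.5)
    print("answered by", replica, "with", data)
```

The point of the sketch is only that a bounded read timeout is what lets a client give up on a paused node; without one, the read blocks until the node is resumed, which matches the hang described above.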

Re: hbase 0.94.26 hangs when a datanode is suspended via SIGSTOP

Posted by Neutron sharc <ne...@gmail.com>.
An update:

It turns out we were using the wrong HDFS version. The issue went away once
we pulled in the right hadoop-hdfs jar.
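
For anyone checking for a similar mismatch, here is a minimal sketch that lists the hadoop-hdfs jar versions named in a classpath string. It is an illustration only: the function name hdfs_jar_versions is hypothetical, it assumes the classpath text was obtained separately (for example from the output of `hbase classpath`), it looks only at jar file names, and it does not expand wildcard entries such as lib/*.

```python
import os
import re

# Matches file names such as hadoop-hdfs-2.5.0-cdh5.3.2.jar and captures the
# version part. Names like hadoop-hdfs-nfs-... are skipped because the version
# must start with a digit; a "-tests" jar would be reported with that suffix.
_HDFS_JAR = re.compile(r"^hadoop-hdfs-(\d[\w.\-]*?)\.jar$")


def hdfs_jar_versions(classpath, sep=os.pathsep):
    """Return the sorted distinct hadoop-hdfs jar versions found in a classpath string."""
    versions = set()
    for entry in classpath.split(sep):
        match = _HDFS_JAR.match(os.path.basename(entry.strip()))
        if match:
            versions.add(match.group(1))
    return sorted(versions)


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    example = ":".join([
        "/opt/hbase/lib/hbase-0.94.26.jar",
        "/opt/hbase/lib/hadoop-hdfs-2.5.0-cdh5.3.2.jar",
    ])
    print(hdfs_jar_versions(example, sep=":"))
```

More than one version in the result, or a version that differs from the one the cluster runs, would be a hint that the client is picking up an unintended jar.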


On Mon, Jun 22, 2015 at 10:29 AM, Ted Yu <yu...@gmail.com> wrote:

> bq. my hbase client keeps stuck
>
> Can you provide a stack trace for the client?
>
> Were the region servers operating properly? Can you check the server logs
> for that time frame?
>
> Cheers

Re: hbase 0.94.26 hangs when a datanode is suspended via SIGSTOP

Posted by Ted Yu <yu...@gmail.com>.
bq. my hbase client keeps stuck

Can you provide a stack trace for the client?

Were the region servers operating properly? Can you check the server logs
for that time frame?

Cheers

On Thu, Jun 18, 2015 at 1:54 AM, Neutron sharc <ne...@gmail.com>
wrote:

> Btw, HBase 0.94.26 is running on top of HDFS 2.5.0-cdh5.3.2.

Re: hbase 0.94.26 hangs when a datanode is suspended via SIGSTOP

Posted by Neutron sharc <ne...@gmail.com>.
Btw, HBase 0.94.26 is running on top of HDFS 2.5.0-cdh5.3.2.
