Posted to dev@accumulo.apache.org by John Vines <vi...@apache.org> on 2013/06/22 00:13:52 UTC

Tserver zookeeper locks

Eric probably knows the answer to this, but if anyone else can chime in it
would be great.


LiveTServerSet maintains the set of all functional tservers the master is
aware of. Part of its logic is that when it finds that a znode for a
tserver (the tserver lock) has gone missing, it marks the tserver as
lockless and reports it as doomed to the master. The latter action has the
master try to phase it out of everything, etc.

The lockless part of it though seems a little foreign, but it may be
because my branch has changed substantially around ZooCache. It appears to
wait 10 minutes and if it is still lockless it will delete the lock
explicitly. But isn't the lock already gone? Or is the lock znode empty vs.
nonexistent?

Re: Tserver zookeeper locks

Posted by John Vines <vi...@apache.org>.
Nope, that is sufficient. That solves that mystery.

Thanks Keith.


Re: Tserver zookeeper locks

Posted by Keith Turner <ke...@deenlo.com>.
John,

I think this code is just to avoid race conditions between the master and
tserver.

The tserver will create a node X and then create an ephemeral node Y under
X. Y is the lock. So the code helps avoid the following situation.

 * tserver creates X
 * master sees X has no children/locks and deletes X
 * tserver tries to create Y as a child of X and fails because X does not
   exist

It seems like the above situation may have caused some sort of problem, but
I cannot remember what. So I think the delayed delete may have been added
to avoid it. I can try to dig up tickets if you need more info.

Keith
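
For readers following the thread, the race and the delayed delete can be
sketched with a small in-memory model. This is only an illustration under
stated assumptions: it does not use the ZooKeeper client or any Accumulo
class, and every name in it (DelayedLockCleanup, createParent, createLock,
loseLock, masterScan, GRACE_MS) is hypothetical. The grace period mirrors the
10 minutes mentioned in the original question.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for the tserver lock layout; not the Accumulo or ZooKeeper API. */
final class DelayedLockCleanup {
    // Hypothetical grace period, mirroring the 10 minutes mentioned in the thread.
    static final long GRACE_MS = 10 * 60 * 1000L;

    // Parent node X -> names of its children Y (the locks).
    private final Map<String, Set<String>> tree = new HashMap<>();
    // Parent node X -> time at which a master scan first saw it without a lock.
    private final Map<String, Long> locklessSince = new HashMap<>();

    /** tserver step 1: create the parent node X. */
    void createParent(String x) {
        tree.putIfAbsent(x, new TreeSet<>());
    }

    /** tserver step 2: create the lock Y under X; fails if X no longer exists. */
    boolean createLock(String x, String y) {
        Set<String> children = tree.get(x);
        if (children == null) {
            return false; // the parent was deleted out from under the tserver
        }
        children.add(y);
        return true;
    }

    /** Models the ephemeral lock vanishing, for example when the tserver dies. */
    void loseLock(String x, String y) {
        Set<String> children = tree.get(x);
        if (children != null) {
            children.remove(y);
        }
    }

    boolean exists(String x) {
        return tree.containsKey(x);
    }

    /** One master scan at time nowMs: delete parents that stayed lockless for graceMs. */
    void masterScan(long nowMs, long graceMs) {
        for (String x : new ArrayList<>(tree.keySet())) {
            if (!tree.get(x).isEmpty()) {
                locklessSince.remove(x); // a lock is held, so forget any earlier sighting
                continue;
            }
            long first = locklessSince.computeIfAbsent(x, k -> nowMs);
            if (nowMs - first >= graceMs) {
                tree.remove(x);
                locklessSince.remove(x);
            }
        }
    }

    public static void main(String[] args) {
        // Eager delete (grace of zero): the scan lands between the tserver's two steps.
        DelayedLockCleanup eager = new DelayedLockCleanup();
        eager.createParent("tserver1");
        eager.masterScan(0L, 0L);
        System.out.println("eager: lock created = " + eager.createLock("tserver1", "lock"));

        // Delayed delete: the same interleaving leaves X in place for the tserver.
        DelayedLockCleanup delayed = new DelayedLockCleanup();
        delayed.createParent("tserver1");
        delayed.masterScan(0L, GRACE_MS);
        System.out.println("delayed: lock created = " + delayed.createLock("tserver1", "lock"));
    }
}
```

In this model a parent with no children is "lockless" but still exists, which
is the distinction the original question asks about: the explicit delete
after the grace period removes X, not the already-missing Y.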
