Posted to dev@sentry.apache.org by Sravya Tirukkovalur <sr...@cloudera.com> on 2015/08/29 05:15:32 UTC

[DISCUSS] Sentry HDFS Sync with HMS HA and Sentry HA

Hi fellow developers,

It looks like there are some problems with the current design when Sentry is
in HA and/or HMS is in HA. I have identified some of these problems and would
like to propose some solutions. Please let me know what you think.

Problem 1: ZooKeeper might blow up if the HMS metadata is too big
See https://cwiki.apache.org/confluence/display/CURATOR/TN4

Problem 2: Both HMSs send full updates to Sentry
There is a chance that these two full updates might actually be different.
This happens if metadata operations occur while the full update is being
built on one server.

Proposed design:
For HMS HA:
We will pick a leader using Curator's LeaderLatch, and only this HMS will
be responsible for sending the path updates to Sentry.
To propagate path updates from the follower, we will use Curator's
PathChildrenCache recipe with a PathChildrenCacheListener: the follower posts
the updates it sees to a ZK path, and the leader listens on this path,
processes these updates and sends them to Sentry.

For Sentry HA:
- The HMS leader sends the path updates to both Sentry servers.
- For permission updates, the Sentry servers use ZooKeeper, similar to HMS,
to propagate updates to each other.
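
The path-update flow proposed above can be modeled roughly as follows. This
is only an in-memory sketch and not Sentry or Curator code: the class names
(SentryServer, SharedUpdatePath, Hms), the update strings, and the boolean
leader flag are all assumptions made for illustration. In the real design,
leadership would come from the leader election and the shared path would be
a ZK path watched by the leader.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/** In-memory model of the proposed flow; no ZooKeeper or Curator is involved. */
class HmsPathSyncSketch {

    /** Hypothetical stand-in for a Sentry server: records the updates it receives. */
    static class SentryServer {
        final List<String> received = new ArrayList<>();

        void applyPathUpdate(String update) {
            received.add(update);
        }
    }

    /** Hypothetical stand-in for the ZK path that followers post updates to. */
    static class SharedUpdatePath {
        private final ConcurrentLinkedQueue<String> pending = new ConcurrentLinkedQueue<>();

        void post(String update) {
            pending.add(update);
        }

        String poll() {
            return pending.poll();
        }
    }

    /** One HMS instance; only the instance flagged as leader talks to Sentry. */
    static class Hms {
        private final SharedUpdatePath zkPath;
        private final List<SentryServer> sentryServers;
        private final boolean leader; // assumption: set by leader election in practice

        Hms(SharedUpdatePath zkPath, List<SentryServer> sentryServers, boolean leader) {
            this.zkPath = zkPath;
            this.sentryServers = sentryServers;
            this.leader = leader;
        }

        /** Called when this HMS sees a path change. */
        void onLocalPathUpdate(String update) {
            if (leader) {
                sendToSentry(update);   // leader forwards directly
            } else {
                zkPath.post(update);    // follower only posts to the shared path
            }
        }

        /** Leader only: forward everything the followers have posted so far. */
        void drainFollowerUpdates() {
            if (!leader) {
                return;
            }
            String update;
            while ((update = zkPath.poll()) != null) {
                sendToSentry(update);
            }
        }

        /** Each path update goes to every Sentry server. */
        private void sendToSentry(String update) {
            for (SentryServer server : sentryServers) {
                server.applyPathUpdate(update);
            }
        }
    }

    public static void main(String[] args) {
        SentryServer sentry1 = new SentryServer();
        SentryServer sentry2 = new SentryServer();
        List<SentryServer> servers = Arrays.asList(sentry1, sentry2);
        SharedUpdatePath zkPath = new SharedUpdatePath();

        Hms hms1 = new Hms(zkPath, servers, true);   // leader
        Hms hms2 = new Hms(zkPath, servers, false);  // follower

        hms1.onLocalPathUpdate("add /warehouse/t1");
        hms2.onLocalPathUpdate("add /warehouse/t2");
        hms1.drainFollowerUpdates();

        System.out.println("sentry-1 received: " + sentry1.received);
        System.out.println("sentry-2 received: " + sentry2.received);
    }
}
```

The point of the sketch is that Sentry sees a single ordered stream of path
updates from one HMS, rather than two independently built streams.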

Regards,
-- 
Sravya Tirukkovalur

Re: [DISCUSS] Sentry HDFS Sync with HMS HA and Sentry HA

Posted by Sravya Tirukkovalur <sr...@cloudera.com>.
Here is the uber jira: https://issues.apache.org/jira/browse/SENTRY-872

I have linked all the smaller items from this jira. Will post a design doc
shortly.


On Wed, Sep 2, 2015 at 4:34 PM, Lenni Kuff <ls...@cloudera.com> wrote:

> Ah great. I did not realize LeaderLatch supported callbacks. That should
> be sufficient then.
>
> Thanks,
> Lenni
>

-- 
Sravya Tirukkovalur

Re: [DISCUSS] Sentry HDFS Sync with HMS HA and Sentry HA

Posted by Lenni Kuff <ls...@cloudera.com>.
Ah great. I did not realize LeaderLatch supported callbacks. That should
be sufficient then.

Thanks,
Lenni

On Wed, Sep 2, 2015 at 3:18 PM, Sravya Tirukkovalur <sr...@cloudera.com>
wrote:

> Thanks for your response, Lenni!
>
> Yes, LeaderSelector is more flexible, but LeaderLatch also supports
> callbacks on leadership changes through LeaderLatchListener. So unless we
> need more flexibility, I think we can just use LeaderLatch.
>
> https://curator.apache.org/apidocs/org/apache/curator/framework/recipes/leader/LeaderLatchListener.html
>
> I will work on the design doc shortly and open a jira.
>
> Regards,
>
>
> --
> Sravya Tirukkovalur
>

Re: [DISCUSS] Sentry HDFS Sync with HMS HA and Sentry HA

Posted by Sravya Tirukkovalur <sr...@cloudera.com>.
Thanks for your response, Lenni!

Yes, LeaderSelector is more flexible, but LeaderLatch also supports
callbacks on leadership changes through LeaderLatchListener. So unless we
need more flexibility, I think we can just use LeaderLatch.
https://curator.apache.org/apidocs/org/apache/curator/framework/recipes/leader/LeaderLatchListener.html
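
As a rough illustration of how an HMS plugin could react to these callbacks,
here is a small sketch. It does not use Curator: LeadershipListener is a local
stand-in that mirrors the two callbacks of Curator's LeaderLatchListener
(isLeader() and notLeader()), and FakeLatch and HmsPlugin are hypothetical
names invented for this example. With Curator itself, the listener would be
registered through LeaderLatch.addListener(...) instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Models leadership callbacks without ZooKeeper; all names are illustrative. */
class LeadershipCallbackSketch {

    /** Local stand-in with the same two callbacks as Curator's LeaderLatchListener. */
    interface LeadershipListener {
        void isLeader();
        void notLeader();
    }

    /** Hypothetical in-memory latch that only notifies its listeners. */
    static class FakeLatch {
        private final List<LeadershipListener> listeners = new ArrayList<>();

        void addListener(LeadershipListener listener) {
            listeners.add(listener);
        }

        void grantLeadership() {
            for (LeadershipListener listener : listeners) {
                listener.isLeader();
            }
        }

        void revokeLeadership() {
            for (LeadershipListener listener : listeners) {
                listener.notLeader();
            }
        }
    }

    /** Hypothetical HMS plugin that pushes updates to Sentry only while leader. */
    static class HmsPlugin implements LeadershipListener {
        private final AtomicBoolean leader = new AtomicBoolean(false);

        @Override
        public void isLeader() {
            leader.set(true);   // start sending path updates to Sentry
        }

        @Override
        public void notLeader() {
            leader.set(false);  // stop sending; post updates to the ZK path instead
        }

        boolean shouldSendToSentry() {
            return leader.get();
        }
    }

    public static void main(String[] args) {
        FakeLatch latch = new FakeLatch();
        HmsPlugin plugin = new HmsPlugin();
        latch.addListener(plugin);

        System.out.println("before election: " + plugin.shouldSendToSentry());
        latch.grantLeadership();
        System.out.println("after isLeader(): " + plugin.shouldSendToSentry());
        latch.revokeLeadership();
        System.out.println("after notLeader(): " + plugin.shouldSendToSentry());
    }
}
```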

I will work on the design doc shortly and open a jira.

Regards,


On Mon, Aug 31, 2015 at 11:29 PM, Lenni Kuff <ls...@cloudera.com> wrote:

> Good catch, Sravya - these do sound like significant problems. Great job
> thinking through the design. I'll need to think through it a bit more, but
> at a high level it makes sense. It might be good to put together a small
> design doc on this to review, so we all understand the protocol and
> failure points.
> A more specific point on the HMS leader election implementation - I think
> you need to use LeaderSelector rather than LeaderLatch so the HMS plugin can
> get notified (via callback) when the leader changes. I don't think that's
> possible with LeaderLatch. Are there any JIRAs I can follow to
> track/review this work?
>
> Thanks,
> Lenni
>

-- 
Sravya Tirukkovalur

Re: [DISCUSS] Sentry HDFS Sync with HMS HA and Sentry HA

Posted by Lenni Kuff <ls...@cloudera.com>.
Good catch, Sravya - these do sound like significant problems. Great job
thinking through the design. I'll need to think through it a bit more, but
at a high level it makes sense. It might be good to put together a small
design doc on this to review, so we all understand the protocol and
failure points.
A more specific point on the HMS leader election implementation - I think
you need to use LeaderSelector rather than LeaderLatch so the HMS plugin can
get notified (via callback) when the leader changes. I don't think that's
possible with LeaderLatch. Are there any JIRAs I can follow to
track/review this work?

Thanks,
Lenni

On Fri, Aug 28, 2015 at 8:15 PM, Sravya Tirukkovalur <sr...@cloudera.com>
wrote:

> Hi fellow developers,
>
> It looks like there are some problems with the current design when Sentry is
> in HA and/or HMS is in HA. I have identified some of these problems and would
> like to propose some solutions. Please let me know what you think.
>
> Problem 1: ZooKeeper might blow up if the HMS metadata is too big
> See https://cwiki.apache.org/confluence/display/CURATOR/TN4
>
> Problem 2: Both HMSs send full updates to Sentry
> There is a chance that these two full updates might actually be different.
> This happens if metadata operations occur while the full update is being
> built on one server.
>
> Proposed design:
> For HMS HA:
> We will pick a leader using Curator's LeaderLatch, and only this HMS will
> be responsible for sending the path updates to Sentry.
> To propagate path updates from the follower, we will use Curator's
> PathChildrenCache recipe with a PathChildrenCacheListener: the follower posts
> the updates it sees to a ZK path, and the leader listens on this path,
> processes these updates and sends them to Sentry.
>
> For Sentry HA:
> - The HMS leader sends the path updates to both Sentry servers.
> - For permission updates, the Sentry servers use ZooKeeper, similar to HMS,
> to propagate updates to each other.
>
> Regards,
> --
> Sravya Tirukkovalur
>