Posted to user@ranger.apache.org by Don Bosco Durai <bo...@apache.org> on 2017/07/25 08:20:30 UTC

Re: HDFS encryption zone policies with Ranger

Billy

 

I was going through the Ranger emails, came across this one, and found it interesting.

 

You might have already figured this out, but let me answer from my point of view anyway:

 

> We’d basically want to configure N master Kerberos with RW access and a list of Hadoop groups (thousands) for the readers. All of these would need to be able to access encrypted files: the union of the groups needs to read, and the masters need to write.

By design, it should just work as you are expecting. Previously we had UI issues when showing 1000s of groups in a single policy, but if you are using the REST API, it should work for you. Let us know otherwise.
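
To make the REST route concrete, here is a minimal sketch of building one policy body that carries a few writer users and thousands of reader groups. It assumes the public v2 policy endpoint (/service/public/v2/api/policy) and the usual HDFS policy JSON layout. The service name, policy name, path, ring users, and group names are all made up for illustration, and the choice of read/write/execute versus read/execute is my assumption about what "RW" and "read" should map to.

```python
import json


def build_hdfs_policy(service, name, path, writers, reader_groups):
    """Build a Ranger HDFS policy body as a plain dict.

    Writers (users) get read/write/execute; reader groups get read/execute.
    """
    def accesses(*types):
        return [{"type": t, "isAllowed": True} for t in types]

    return {
        "service": service,  # hypothetical Ranger service name
        "name": name,        # hypothetical policy name
        "isEnabled": True,
        "resources": {"path": {"values": [path], "isRecursive": True}},
        "policyItems": [
            {"users": sorted(writers),
             "accesses": accesses("read", "write", "execute")},
            {"groups": sorted(reader_groups),
             "accesses": accesses("read", "execute")},
        ],
    }


# Illustrative values only: two ring users and 3000 dataset groups.
policy = build_hdfs_policy(
    service="cluster_hadoop",
    name="lake-encryption-zone",
    path="/lake",
    writers=["ring1", "ring2"],
    reader_groups={f"dataset_{i}" for i in range(3000)},
)

# This JSON string is what would be sent as the POST body.
body = json.dumps(policy)
```

The groups live in a single policy item, so regenerating the policy from your entitlement metadata means rebuilding this one body and updating the policy, rather than managing thousands of separate policies.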

 

> Can Ranger use the HDFS group provider as a source of groups for example?

The requirements are different. The HDFS group provider gives the groups for a given user, while Ranger needs to get all the users and groups. There was a discussion some time ago about getting the users from LDAP but doing the group lookup using the Hadoop UGI utility. This would eliminate the requirement to configure group filtering in Ranger UserSync. Sailaja might be able to provide more insight into what happened to this suggestion. I am not able to find the JIRA for this.

 

Regardless, you should be able to sync the groups from your own source, such as a custom flat file or even an API. With this, you don’t have to depend on the groups from AD/LDAP.
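
As a rough sketch of the flat file idea: if your entitlement data is keyed by group, it has to be inverted into one line per user before UserSync can read it. The layout below (user first, then that user's groups, comma separated) is my assumption about the file source format, so please check it against your Ranger version. The sample entitlements are invented.

```python
import csv
import io
from collections import defaultdict


def groups_to_user_rows(group_members):
    """Invert {group: [users]} into sorted rows of [user, group1, group2, ...]."""
    user_groups = defaultdict(set)
    for group, members in group_members.items():
        for user in members:
            user_groups[user].add(group)
    return [[user, *sorted(groups)]
            for user, groups in sorted(user_groups.items())]


def to_csv(rows):
    """Render the rows as comma-separated text, one user per line."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hypothetical entitlement data: dataset group -> member users.
entitlements = {"dataset_a": ["alice", "bob"], "dataset_b": ["bob"]}
rows = groups_to_user_rows(entitlements)
text = to_csv(rows)
```

Since your groups are recalculated every 15 minutes, the same job could rewrite this file on that schedule.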

 

Bosco

From: "Newport, Billy" <Bi...@gs.com>
Reply-To: <us...@ranger.apache.org>
Date: Wednesday, March 29, 2017 at 6:52 AM
To: "'user@ranger.apache.org'" <us...@ranger.apache.org>
Subject: HDFS encryption zone policies with Ranger

 

We have a current setup with HW 2.5 using our own group provider for HDFS permissions. We did this because we manage permissions for thousands of users in thousands of groups, which are recalculated every 15 minutes. We had NameNode stability problems when we were using an LDAP provider (slow responses caused the NN to fall over, resulting in outages). Once we switched to our group provider, we got sub-second response times, and it has been stable.

 

My question concerns adding HDFS encryption to this cluster. Our ‘data lake’ implementation maintains metadata on all data that we keep in the cluster. Right now, we have a single master Kerberos which owns every file in our lake folder. Every dataset that we store in a folder tree is owned by that user and uses a dataset-specific group to control read access.

 

We are moving to a ‘ring’ system where instead of having a single Kerberos, we have a Kerberos per ring. Each ring owns an exclusive set of datasets. So, it’s exactly the same but now ownership of the files is divided between the rings. The ring Kerberos must be a member of the dataset groups in order for chgrp to work.

 

Now, back to the question. We want to enable encryption. For now, simply using a single encryption zone for all rings is sufficient. How do we automatically configure the Ranger side of this using the entitlement metadata in our lake? We’d basically want to configure N master Kerberos with RW access and a list of Hadoop groups (thousands) for the readers. All of these would need to be able to access encrypted files: the union of the groups needs to read, and the masters need to write.

 

Is it possible to do this efficiently using REST APIs? Ranger would be maintaining this information on behalf of HDFS as far as I can see. Can Ranger use the HDFS group provider as a source of groups for example? This would make this very easy.

 

Thanks for the help

Billy