Posted to common-user@hadoop.apache.org by "Dietrich, Paul" <pa...@honeywell.com> on 2016/01/27 18:41:40 UTC

WebHDFS use of proxy to read encrypted file

I am using Apache Hadoop 2.7.1 with Kerberos SPNEGO authentication (i.e. security on) and data-at-rest encryption enabled. I am using Ranger 0.5.0 KMS as the key server.

My end goal is to let users browse, through HUE, the files in encryption zones that they are permitted to access.

Since I am using HUE, I have set the configuration parameter hadoop.proxyusers.hue.users to '*' for now, for testing.

Ignoring the HUE application for the moment, I tested WebHDFS directly with curl. What I see is that whichever user tries to open an encrypted file, the request is proxied as user hue. Here is the series of commands, run as user 'paul'; /zone was set up as an encryption zone using key 'testkey':
> kinit
> hdfs dfs -cat /zone/helloWorld
Hello World
> curl --tlsv1.2 -i --location-trusted --cacert /etc/pki/cacerts.pem --negotiate -u : "https://myhost.example.com:50470/webhdfs/v1/zone/helloWorld?op=OPEN"

And the result:
{"RemoteException":{"exception":"AuthorizationException","javaClassName":"org.apache.hadoop.security.authorize.AuthorizationException","message":"User:hue not allowed to do 'DECRYPT_EEK' on 'testkey'"}}

If I instead make it a proxy request by adding 'doas=paul' to the curl URL, I get a somewhat expected result:
{"RemoteException":{"exception":"SecurityException","javaClassName":"java.lang.SecurityException","message":"Failed to obtain user group information: org.apache.hadoop.security.authorize.AuthorizationException: User: paul is not allowed to impersonate paul"}}
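
For reference, the proxy variant of the request can be sketched as below. This is only an illustration, assuming the same host, port, and path as the earlier curl command; the helper name webhdfs_open_url and the CACERT variable are hypothetical, while doas is the query parameter named above.

```shell
# Hypothetical helper: builds a WebHDFS OPEN URL that carries the doas parameter.
webhdfs_open_url() {
    # $1 = WebHDFS base URL, $2 = HDFS path, $3 = user to impersonate
    printf '%s%s?op=OPEN&doas=%s\n' "$1" "$2" "$3"
}

url=$(webhdfs_open_url "https://myhost.example.com:50470/webhdfs/v1" "/zone/helloWorld" "paul")
echo "$url"

# The request itself would then look like this (CACERT is a placeholder
# for the CA bundle path; not executed here):
# curl --tlsv1.2 -i --location-trusted --cacert "$CACERT" --negotiate -u : "$url"
```

The URL is quoted when passed to curl because it contains an '&', which the shell would otherwise treat as a background operator.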

That step leads me to something related, but slightly disturbing. If I now set hadoop.proxyusers.paul.users to 'paul', allowing paul to proxy for paul, it works, but it works too well. Running the original curl command as user paul returns the file contents; however, running the curl command as any other user also returns the file contents!

The audit log for KMS shows the access, but it thinks every access is being done as user paul:
2016-01-27 10:09:08,552 OK[op=DECRYPT_EEK, key=testkey, user=paul, accessCount=1, interval=10873ms]
The audit log for HDFS shows the access under the user who actually performed the request (here, tom):
2016-01-27 10:08:57,347 INFO FSNamesystem.audit: allowed=true   ugi=tom (auth:KERBEROS)     ip=/192.168.3.173     cmd=open        src=/zone/helloWorld    dst=null        perm=null       proto=webhdfs
2016-01-27 10:08:57,482 INFO FSNamesystem.audit: allowed=true   ugi=tom (auth:TOKEN)        ip=/192.168.3.97      cmd=open        src=/zone/helloWorld    dst=null        perm=null       proto=rpc
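
The mismatch between the two audit logs can be checked mechanically. The following is a minimal sketch, assuming audit lines shaped like the ones quoted above; the function names are hypothetical and the sed patterns are only fitted to these sample lines.

```shell
# Hypothetical helpers: read audit lines on stdin and print the user field.
kms_audit_user() {
    # KMS audit records carry "user=<name>," inside the OK[...] block
    sed -n 's/.*user=\([^,]*\),.*/\1/p'
}

hdfs_audit_user() {
    # HDFS audit records carry "ugi=<name> (auth:...)"
    sed -n 's/.*ugi=\([^[:space:]]*\)[[:space:]].*/\1/p'
}

# Sample lines copied from the logs quoted above
kms_line='2016-01-27 10:09:08,552 OK[op=DECRYPT_EEK, key=testkey, user=paul, accessCount=1, interval=10873ms]'
hdfs_line='2016-01-27 10:08:57,347 INFO FSNamesystem.audit: allowed=true   ugi=tom (auth:KERBEROS)     ip=/192.168.3.173     cmd=open        src=/zone/helloWorld    dst=null        perm=null       proto=webhdfs'

kms_user=$(printf '%s\n' "$kms_line" | kms_audit_user)
hdfs_user=$(printf '%s\n' "$hdfs_line" | hdfs_audit_user)

# Report when the key server and the NameNode disagree about who made the request
if [ "$kms_user" != "$hdfs_user" ]; then
    echo "mismatch: KMS saw '$kms_user', HDFS saw '$hdfs_user'"
fi
```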

I'm not sure where to dig in to find out exactly what is going on. If someone knows whether this is expected behavior, or where to start looking, that would be a great help. Thanks.

Paul
