Posted to users@zeppelin.apache.org by Jhon Anderson Cardenas Diaz <jh...@gmail.com> on 2018/05/08 21:48:24 UTC

Zeppelin code can access FileSystem

Dear Zeppelin Community,

Currently, when a Zeppelin paragraph is executed, the code in it can read
sensitive config files and even change them, along with web app pages and
so on. For example:

%python
f = open("/usr/zeppelin/conf/credentials.json", "r")
f.read()

Do you know if there is a way to configure the user that starts the
interpreters or runs the paragraph's code, so that this user cannot access
the file system where Zeppelin is running, or at least has more restricted
access?

Thank you.

Re: Zeppelin code can access FileSystem

Posted by Sam Nicholson <sa...@ogt11.com>.
Yes, I believe that Jira report was about keeping users isolated from each
other. With user impersonation and the method I outlined just now, this
works well.

AND this keeps the shell you fire up from accessing the zeppelin files.

BUT, this is not a zeppelin problem.  This is a JEE problem.  Java has no
native mechanism to set or change the user ID.  So, while you can
sudo / su -c the web application at startup, it cannot change its own user
later.  If it needs the filesystem for ANY reason, it has to start with a
user ID that has filesystem permissions.

This is, IMO, the real problem behind the Struts breakage at Equifax.  If
the app server's default user ID is leaked, not only can you log in, but
you can MODIFY the application filesystem if you can get a shell.

So, I think the Zeppelin team has done an excellent job of mitigating the
problem as best as can be done within the JEE system.  (This is true of
tomcat, jetty, whathaveyou servlet container.)

Because, by default, Zeppelin gives you shells.  R, Python, sh, all have
full UNIX abilities, as do many other shells.

I'm going to write up a Jira request to have the default interpreter
settings in a config file.  If one is truly paranoid, then just having the
server running while one sets the interpreter settings seems risky.

In short:

Enable user impersonation
Put zeppelin users in a zeppelin group
Allow zeppelin sudo to only zeppelin group members
Ensure zeppelin group members cannot sudo without password and cannot ssh
without password
Set shell context as per-user in isolated process
Set shell.working.directory.user.home to true

And do this for all compatible interpreters.
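
The group-membership item in the checklist above can be checked from
Python. This is only a minimal sketch: the group name "zeppelin" and the
user "samcn2" are the examples used in this thread, and the helper names
are made up for illustration. Note that the standard grp module lists only
supplementary members, so a user whose primary group is zeppelin will not
show up in gr_mem.

```python
import grp


def group_members(group_name):
    """Return the supplementary members of a group, or [] if it does not exist."""
    try:
        return sorted(grp.getgrnam(group_name).gr_mem)
    except KeyError:
        # The group is not defined on this machine.
        return []


def not_in_group(users, group_name):
    """Return the users from `users` that are not listed as group members."""
    members = set(group_members(group_name))
    return [user for user in users if user not in members]


# Hypothetical check: which of these users still need to be added?
print(not_in_group(["samcn2"], "zeppelin"))
```

An empty list means every user checked is already listed in the group.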

Cheers!
-sam

On Wed, May 9, 2018 at 10:17 AM, Jhon Anderson Cardenas Diaz <
jhonderson2007@gmail.com> wrote:

> Thank you, Sam. Reviewing the Jira issues, I found that this issue was
> previously identified in ticket ZEPPELIN-1320
> <https://issues.apache.org/jira/browse/ZEPPELIN-1320>. Maybe it is just my
> impression, but that ticket seems to focus on keeping processes out of
> other users' directories rather than on the problem that a process can
> access the Zeppelin file system. Am I right?
>
> 2018-05-08 17:46 GMT-05:00 Sam Nicholson <sa...@ogt11.com>:
>
>> And warning!
>>
>> Trying to answer the above, I've disconnected my websocket.
>> I'll figure it out and report back
>>
>> On Tue, May 8, 2018 at 6:28 PM, Sam Nicholson <sa...@ogt11.com> wrote:
>>
>>> So,
>>>
>>> I run the zeppelin process as the web user on my system.  There is no
>>> other web process, so why not.
>>>
>>> Then, UNIX permissions keep it from running, accessing, deleting
>>> anything else.  EXCEPT items that are world writeable.
>>>
>>> There shouldn't be any of those, other than /tmp, but still /tmp is a
>>> hotbed of nefarious activity on hacked machines.  :)
>>>
>>> For example:
>>>
>>> %sh
>>>
>>> pwd
>>> ls
>>> touch bazzot
>>> ls -l bazzot
>>> rm bazzot
>>>
>>> Gives:
>>>
>>> /var/www/zeppelin
>>> derby.log
>>> figure
>>> metastore_db
>>> Rgraphics
>>> Rgraphics.zip
>>> -rw-r--r-- 1 www-data www-data 0 May 8 18:04 bazzot
>>> ls: cannot access 'bazzot': No such file or directory
>>> ExitValue: 2
>>>
>>> For another example:
>>>
>>> %sh
>>> id
>>> cd /home/samcn2
>>> touch bazzot
>>> ls -l bazzot
>>> rm bazzot
>>>
>>> Gives:
>>>
>>> uid=33(www-data) gid=33(www-data) groups=33(www-data)
>>> touch: cannot touch 'bazzot': Permission denied
>>> ls: cannot access 'bazzot': No such file or directory
>>> rm: cannot remove 'bazzot': No such file or directory
>>> ExitValue: 1
>>>
>>>
>>> So, you can't access other users' files.
>>>
>>> But you CAN access the web user's files.  That may be a bug.  I'm going
>>> to try changing the zeppelin  running user.  Wait one...
>>>
>>> OK.  So you can run zeppelin as some other user, the logs and the run
>>> directory must be owned by that user.
>>> I do this with symlinks.  But the websocket is failing.  So no joy
>>> there...
>>>
>>> So, for now, you can set things up so that zeppelin can't access any
>>> other files from other users on the system,
>>> but zeppelin web can access the zeppelin executable.  So, don't put this
>>> up for untrusted users!!!
>>>
>>> Here is my zeppelin start script:
>>> #!/bin/sh
>>>
>>> cd /var/www/zeppelin/home
>>>
>>> sudo -u zeppelin /opt/apache/zeppelin/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh $*
>>>
>>>
>>> If /var/www/zeppelin/home is owned by zeppelin, as is
>>> /opt/apache/zeppelin/*, then this works with the caveat above.
>>>
>>> Cheers!
>>> -sam
>>>
>>>
>>> On Tue, May 8, 2018 at 5:48 PM, Jhon Anderson Cardenas Diaz <
>>> jhonderson2007@gmail.com> wrote:
>>>
>>>> Dear Zeppelin Community,
>>>>
>>>> Currently when a Zeppelin paragraph is executed, the code in it can
>>>> read sensitive config files, change them, including web app pages and etc.
>>>> Like in this example:
>>>>
>>>> %python
>>>> f = open("/usr/zeppelin/conf/credentials.json", "r")
>>>> f.read()
>>>>
>>>> Do you know if is there a way to configure the user used to start the
>>>> interpreters or run the paragraph's code ?, so that user can not access the
>>>> File System where zeppelin is running, or has  more restricted access.
>>>>
>>>> Thank you.
>>>>
>>>
>>>
>>
>

Re: Zeppelin code can access FileSystem

Posted by Jhon Anderson Cardenas Diaz <jh...@gmail.com>.
Thank you, Sam. Reviewing the Jira issues, I found that this issue was
previously identified in ticket ZEPPELIN-1320
<https://issues.apache.org/jira/browse/ZEPPELIN-1320>. Maybe it is just my
impression, but that ticket seems to focus on keeping processes out of
other users' directories rather than on the problem that a process can
access the Zeppelin file system. Am I right?

2018-05-08 17:46 GMT-05:00 Sam Nicholson <sa...@ogt11.com>:

> And warning!
>
> Trying to answer the above, I've disconnected my websocket.
> I'll figure it out and report back
>
> On Tue, May 8, 2018 at 6:28 PM, Sam Nicholson <sa...@ogt11.com> wrote:
>
>> So,
>>
>> I run the zeppelin process as the web user on my system.  There is no
>> other web process, so why not.
>>
>> Then, UNIX permissions keep it from running, accessing, deleting anything
>> else.  EXCEPT items that are world writeable.
>>
>> There shouldn't be any of those, other than /tmp, but still /tmp is a
>> hotbed of nefarious activity on hacked machines.  :)
>>
>> For example:
>>
>> %sh
>>
>> pwd
>> ls
>> touch bazzot
>> ls -l bazzot
>> rm bazzot
>>
>> Gives:
>>
>> /var/www/zeppelin
>> derby.log
>> figure
>> metastore_db
>> Rgraphics
>> Rgraphics.zip
>> -rw-r--r-- 1 www-data www-data 0 May 8 18:04 bazzot
>> ls: cannot access 'bazzot': No such file or directory
>> ExitValue: 2
>>
>> For another example:
>>
>> %sh
>> id
>> cd /home/samcn2
>> touch bazzot
>> ls -l bazzot
>> rm bazzot
>>
>> Gives:
>>
>> uid=33(www-data) gid=33(www-data) groups=33(www-data)
>> touch: cannot touch 'bazzot': Permission denied
>> ls: cannot access 'bazzot': No such file or directory
>> rm: cannot remove 'bazzot': No such file or directory
>> ExitValue: 1
>>
>>
>> So, you can't access other users' files.
>>
>> But you CAN access the web user's files.  That may be a bug.  I'm going
>> to try changing the zeppelin  running user.  Wait one...
>>
>> OK.  So you can run zeppelin as some other user, the logs and the run
>> directory must be owned by that user.
>> I do this with symlinks.  But the websocket is failing.  So no joy
>> there...
>>
>> So, for now, you can set things up so that zeppelin can't access any
>> other files from other users on the system,
>> but zeppelin web can access the zeppelin executable.  So, don't put this
>> up for untrusted users!!!
>>
>> Here is my zeppelin start script:
>> #!/bin/sh
>>
>> cd /var/www/zeppelin/home
>>
>> sudo -u zeppelin /opt/apache/zeppelin/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh $*
>>
>>
>> If /var/www/zeppelin/home is owned by zeppelin, as is
>> /opt/apache/zeppelin/*, then this works with the caveat above.
>>
>> Cheers!
>> -sam
>>
>>
>> On Tue, May 8, 2018 at 5:48 PM, Jhon Anderson Cardenas Diaz <
>> jhonderson2007@gmail.com> wrote:
>>
>>> Dear Zeppelin Community,
>>>
>>> Currently when a Zeppelin paragraph is executed, the code in it can read
>>> sensitive config files, change them, including web app pages and etc. Like
>>> in this example:
>>>
>>> %python
>>> f = open("/usr/zeppelin/conf/credentials.json", "r")
>>> f.read()
>>>
>>> Do you know if is there a way to configure the user used to start the
>>> interpreters or run the paragraph's code ?, so that user can not access the
>>> File System where zeppelin is running, or has  more restricted access.
>>>
>>> Thank you.
>>>
>>
>>
>

Re: Zeppelin code can access FileSystem

Posted by Sam Nicholson <sa...@ogt11.com>.
And warning!

Trying to answer the above, I've disconnected my websocket.
I'll figure it out and report back

On Tue, May 8, 2018 at 6:28 PM, Sam Nicholson <sa...@ogt11.com> wrote:

> So,
>
> I run the zeppelin process as the web user on my system.  There is no
> other web process, so why not.
>
> Then, UNIX permissions keep it from running, accessing, deleting anything
> else.  EXCEPT items that are world writeable.
>
> There shouldn't be any of those, other than /tmp, but still /tmp is a
> hotbed of nefarious activity on hacked machines.  :)
>
> For example:
>
> %sh
>
> pwd
> ls
> touch bazzot
> ls -l bazzot
> rm bazzot
>
> Gives:
>
> /var/www/zeppelin
> derby.log
> figure
> metastore_db
> Rgraphics
> Rgraphics.zip
> -rw-r--r-- 1 www-data www-data 0 May 8 18:04 bazzot
> ls: cannot access 'bazzot': No such file or directory
> ExitValue: 2
>
> For another example:
>
> %sh
> id
> cd /home/samcn2
> touch bazzot
> ls -l bazzot
> rm bazzot
>
> Gives:
>
> uid=33(www-data) gid=33(www-data) groups=33(www-data)
> touch: cannot touch 'bazzot': Permission denied
> ls: cannot access 'bazzot': No such file or directory
> rm: cannot remove 'bazzot': No such file or directory
> ExitValue: 1
>
>
> So, you can't access other users' files.
>
> But you CAN access the web user's files.  That may be a bug.  I'm going to
> try changing the zeppelin  running user.  Wait one...
>
> OK.  So you can run zeppelin as some other user, the logs and the run
> directory must be owned by that user.
> I do this with symlinks.  But the websocket is failing.  So no joy there...
>
> So, for now, you can set things up so that zeppelin can't access any other
> files from other users on the system,
> but zeppelin web can access the zeppelin executable.  So, don't put this
> up for untrusted users!!!
>
> Here is my zeppelin start script:
> #!/bin/sh
>
> cd /var/www/zeppelin/home
>
> sudo -u zeppelin /opt/apache/zeppelin/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh $*
>
>
> If /var/www/zeppelin/home is owned by zeppelin, as is
> /opt/apache/zeppelin/*, then this works with the caveat above.
>
> Cheers!
> -sam
>
>
> On Tue, May 8, 2018 at 5:48 PM, Jhon Anderson Cardenas Diaz <
> jhonderson2007@gmail.com> wrote:
>
>> Dear Zeppelin Community,
>>
>> Currently when a Zeppelin paragraph is executed, the code in it can read
>> sensitive config files, change them, including web app pages and etc. Like
>> in this example:
>>
>> %python
>> f = open("/usr/zeppelin/conf/credentials.json", "r")
>> f.read()
>>
>> Do you know if is there a way to configure the user used to start the
>> interpreters or run the paragraph's code ?, so that user can not access the
>> File System where zeppelin is running, or has  more restricted access.
>>
>> Thank you.
>>
>
>

Re: Zeppelin code can access FileSystem

Posted by Sam Nicholson <sa...@ogt11.com>.
So,

I run the zeppelin process as the web user on my system.  There is no other
web process, so why not.

Then, UNIX permissions keep it from running, accessing, deleting anything
else.  EXCEPT items that are world writeable.

There shouldn't be any of those, other than /tmp, but still /tmp is a
hotbed of nefarious activity on hacked machines.  :)

For example:

%sh

pwd
ls
touch bazzot
ls -l bazzot
rm bazzot

Gives:

/var/www/zeppelin
derby.log
figure
metastore_db
Rgraphics
Rgraphics.zip
-rw-r--r-- 1 www-data www-data 0 May 8 18:04 bazzot
ls: cannot access 'bazzot': No such file or directory
ExitValue: 2

For another example:

%sh
id
cd /home/samcn2
touch bazzot
ls -l bazzot
rm bazzot

Gives:

uid=33(www-data) gid=33(www-data) groups=33(www-data)
touch: cannot touch 'bazzot': Permission denied
ls: cannot access 'bazzot': No such file or directory
rm: cannot remove 'bazzot': No such file or directory
ExitValue: 1


So, you can't access other users' files.

But you CAN access the web user's files.  That may be a bug.  I'm going to
try changing the zeppelin  running user.  Wait one...

OK.  So you can run zeppelin as some other user, but the logs and the run
directory must be owned by that user.  I do this with symlinks.  But the
websocket is failing.  So no joy there...

So, for now, you can set things up so that zeppelin can't access any files
belonging to other users on the system, but zeppelin web can still access
the zeppelin executable.  So, don't put this up for untrusted users!!!
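
The same kind of check can be made from Python instead of %sh. The
following is a minimal sketch, not part of Zeppelin: access_report is a
made-up helper name, and the paths are simply the ones mentioned in this
thread, so adjust them to your own installation. It reports what the user
running the interpreter process is allowed to read and write.

```python
import os


def access_report(paths):
    """Return {path: (readable, writable)} for the user running this process."""
    report = {}
    for path in paths:
        # os.access returns False for paths that do not exist.
        report[path] = (os.access(path, os.R_OK), os.access(path, os.W_OK))
    return report


# Paths taken from this thread; they may not exist on your machine.
candidates = [
    "/usr/zeppelin/conf/credentials.json",
    "/var/www/zeppelin",
    "/tmp",
]
for path, (readable, writable) in access_report(candidates).items():
    print(path, "read" if readable else "-", "write" if writable else "-")
```

If the credentials file shows up as readable or writable, paragraphs run by
that user can get at it.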

Here is my zeppelin start script:
#!/bin/sh

cd /var/www/zeppelin/home

sudo -u zeppelin /opt/apache/zeppelin/zeppelin-0.7.3-bin-all/bin/zeppelin-daemon.sh $*


If /var/www/zeppelin/home is owned by zeppelin, as is
/opt/apache/zeppelin/*, then this works with the caveat above.

Cheers!
-sam


On Tue, May 8, 2018 at 5:48 PM, Jhon Anderson Cardenas Diaz <
jhonderson2007@gmail.com> wrote:

> Dear Zeppelin Community,
>
> Currently when a Zeppelin paragraph is executed, the code in it can read
> sensitive config files, change them, including web app pages and etc. Like
> in this example:
>
> %python
> f = open("/usr/zeppelin/conf/credentials.json", "r")
> f.read()
>
> Do you know if is there a way to configure the user used to start the
> interpreters or run the paragraph's code ?, so that user can not access the
> File System where zeppelin is running, or has  more restricted access.
>
> Thank you.
>

Re: Zeppelin code can access FileSystem

Posted by Sam Nicholson <sa...@ogt11.com>.
OK, after learning way too much about zeppelin and java.  :)

First, re-check the docs at:
https://zeppelin.apache.org/docs/0.7.3/manual/userimpersonation.html

But it's more than that.  To lock things down as much as you can, you also
need to limit the set of users that can be impersonated, *and* you need to
isolate the running user.

First, set up a "zeppelin" group and a "zeppelin" user.  This can be
www-data, or any other web front end user.  But after this exercise, I like
zeppelin:zeppelin.

Then, add the following line to /etc/sudoers with visudo:

zeppelin ALL = (%zeppelin) NOPASSWD: ALL

This lets zeppelin pretend to be any user in the zeppelin group, WITHOUT a
password.

Now, add all of your users to the zeppelin group.  If using LDAP, then you
have to adjust the LDAP db.

I use the password-less setup, with the following line uncommented in
zeppelin-env.sh:

export ZEPPELIN_IMPERSONATE_CMD='sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '


If you are using PAM and local files, then do this in /etc/passwd

/etc/passwd:zeppelin:x:999:33::/var/www/zeppelin:

And this in /etc/group

/etc/group:shadow:x:42:zeppelin
/etc/group:zeppelin:x:1002:zeppelin,samcn2

The second line lets the zeppelin process read /etc/shadow (one could also
use setfacl).  The third adds the users who can log in to zeppelin and
write.

Then, follow the instructions in the docs referenced above.
ALSO set in the interpreters page

shell.working.directory.user.home   true

Now, the shell user is me, and it's my home dir I log into.
You can do away with the homedir, but you have to setfacl or group perm
your zeppelin users back into the zeppelin user.


NOTE!!!

This only works for shell.  Python and R also can manipulate the PWD and
local environment.
I'll look into setting those tomorrow.
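
To see which user a Python paragraph actually runs as, something like the
following can serve as the Python counterpart of `id` and `pwd`. It is a
minimal sketch using only the standard library; whoami is a made-up helper
name, and what it prints depends entirely on how impersonation is
configured on your system.

```python
import os
import pwd


def whoami():
    """Describe the effective user and working directory of this process."""
    uid = os.geteuid()
    try:
        entry = pwd.getpwuid(uid)
        name, home = entry.pw_name, entry.pw_dir
    except KeyError:
        # The uid has no passwd entry (possible in some containers).
        name, home = str(uid), None
    return {
        "user": name,
        "uid": uid,
        "gid": os.getegid(),
        "home": home,
        "cwd": os.getcwd(),
    }


for key, value in whoami().items():
    print(key, value)
```

If impersonation is working for the interpreter, "user" should be the
logged-in Zeppelin user rather than the account that started the server.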

Also, set shell.working.directory.user.home to true so that the user gets
their home dir, and not the shared dir.  Otherwise, unless you make the
shared dir mode 777 and setgid (so that your OS forces new dirs to inherit
the wide open perms), the files/dirs you create won't be shareable, and
eventually zeppelin will complain.



On Tue, May 8, 2018 at 5:48 PM, Jhon Anderson Cardenas Diaz <
jhonderson2007@gmail.com> wrote:

> Dear Zeppelin Community,
>
> Currently when a Zeppelin paragraph is executed, the code in it can read
> sensitive config files, change them, including web app pages and etc. Like
> in this example:
>
> %python
> f = open("/usr/zeppelin/conf/credentials.json", "r")
> f.read()
>
> Do you know if is there a way to configure the user used to start the
> interpreters or run the paragraph's code ?, so that user can not access the
> File System where zeppelin is running, or has  more restricted access.
>
> Thank you.
>