Posted to dev@livy.apache.org by Jack Leadford <jl...@fb.com> on 2017/09/28 19:39:02 UTC

Livy and Spark and Untrusted Users

Hello!

I have a couple of questions. I am working with
a Livy/Spark deployment where the users are untrusted.
I would therefore like to prevent them from executing _arbitrary_
code within my Spark/YARN cluster (e.g. OS shell commands run
through PySpark and submitted via Livy's /sessions/id/statements),
and instead constrain the code that can be included in a session
or a batch submission. Ideally this would also extend to the
invoke_* calls made by libraries such as sparklyr.
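
To make the concern concrete, here is a minimal sketch in Python. It
builds the kind of JSON body a user could POST to the statements
endpoint, then applies a naive import-based screen to it. The blocked
module list and the helper names (imported_modules, is_rejected) are
hypothetical and not part of Livy. The last case shows why a filter
like this is not a sandbox: a dynamic import slips straight past it.

```python
import ast
import json

# Hypothetical statement from an untrusted user. The "code" field is the
# part of the request body that carries the source to execute.
payload = json.dumps(
    {"code": "import subprocess; subprocess.check_output(['id'])"}
)

# Assumption: an illustrative deny-list, nowhere near complete.
BLOCKED_MODULES = {"os", "subprocess", "socket", "ctypes"}


def imported_modules(source):
    """Return the top-level module names named in import statements."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            found.add(node.module.split(".")[0])
    return found


def is_rejected(statement_json):
    """True if the statement statically imports a blocked module."""
    code = json.loads(statement_json)["code"]
    return bool(imported_modules(code) & BLOCKED_MODULES)


# A plain static import is caught.
caught = is_rejected(payload)

# A dynamic import contains no import statement, so the screen misses it.
bypass = json.dumps({"code": "__import__('os').system('id')"})
missed = not is_rejected(bypass)
```

Static screening of this kind only raises the bar slightly, which is
why I am asking about isolation at the process or container level
rather than inspection of the submitted source.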

I have not found much information online on
how this might be done (apart from a similar, unanswered question
on StackOverflow:
https://stackoverflow.com/questions/38333873/securely-running-a-spark-application-inside-a-sandbox).

Do you folks have any guidance on this issue? I have
several vague ideas (a JVM SecurityManager for JVM code, for instance,
although as I understand Spark this would not cover Python and R code).
If I have missed an established process for addressing this
scenario, please let me know. Any other questions or comments
are welcome too.

Thanks.

Jack