Posted to users@zeppelin.apache.org by John Omernik <jo...@omernik.com> on 2016/09/06 18:39:49 UTC

Zeppelin and Docker Images

Hey all,

I know folks are using Docker with Zeppelin, and I am trying to find a nice
balance between size of images and usefulness of the image.

Basically, when I build Zeppelin, the resulting directory is quite large:
1.2GB. So the first question is, what is actually needed in the Docker
image? How can I make it smaller?

At the same time, I am trying to find ways to ensure things like the conf
directory, the logs directory, etc. can exist outside the container for
persistence.

Simple, right? Well, I am using some ENV variables as follows:

"ZEPPELIN_CONF_DIR":"/conf",
"ZEPPELIN_NOTEBOOK_DIR":"/notebooks",
"ZEPPELIN_HOME":"/zeppelin",
"ZEPPELIN_LOG_DIR":"/logs",

with the Zeppelin root living at /zeppelin, and conf, notebooks, and logs
being volumes mounted to external persistent storage.
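
A minimal sketch of how that setup could translate into a `docker run`
invocation, written in Python so the pieces are easy to adjust. The
environment variables and container paths come from the message; the host
paths under /srv/zeppelin and the image name example/zeppelin:custom are
hypothetical placeholders, not anything from an actual deployment.

```python
import shlex

# Environment variables quoted in the message.
ZEPPELIN_ENV = {
    "ZEPPELIN_CONF_DIR": "/conf",
    "ZEPPELIN_NOTEBOOK_DIR": "/notebooks",
    "ZEPPELIN_HOME": "/zeppelin",
    "ZEPPELIN_LOG_DIR": "/logs",
}

# Host path -> container path. The host paths are hypothetical;
# only the container paths come from the message.
VOLUMES = {
    "/srv/zeppelin/conf": "/conf",
    "/srv/zeppelin/notebooks": "/notebooks",
    "/srv/zeppelin/logs": "/logs",
}


def docker_run_args(image, env, volumes):
    """Build the argument list for a detached `docker run`."""
    args = ["docker", "run", "-d"]
    for name, value in env.items():
        args += ["-e", f"{name}={value}"]
    for host_path, container_path in volumes.items():
        args += ["-v", f"{host_path}:{container_path}"]
    args.append(image)
    return args


if __name__ == "__main__":
    # The image name is a placeholder (assumption).
    command = docker_run_args("example/zeppelin:custom", ZEPPELIN_ENV, VOLUMES)
    print(shlex.join(command))
```

Keeping the variables and mounts in two small tables makes it easy to see
that every persistent directory has both an ENV entry and a volume.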


This is working, but what if I want to create a directory for "custom"
jars, interpreters, etc.? What would the easiest way to do that be? I know
in the Apache Drill project, they've added the concept of a "site"
directory that is a bit more holistic than a "conf" directory, as it allows
them to add libs and keep that directory completely separated from the
released jars
(https://drill.apache.org/docs/apache-drill-1-8-0-release-notes/). Is there
anything like this in Zeppelin? Is there an easy way to keep customized
things in a directory that is separate?


Any thoughts on how you've optimized Dockerized setups for Zeppelin would
be welcome! Thanks!


John

Re: Zeppelin and Docker Images

Posted by John Omernik <jo...@omernik.com>.
In a lot of ways, I want to set up the interpreters with a base level of
includes, but then be able to specify a per-user "second" directory for
interpreters, so users can add more as they please, without having to
include every interpreter for every user...

On Tue, Sep 6, 2016 at 2:14 PM, Mohit Jaggi <mo...@gmail.com> wrote:

> +1 to the idea of separating the "code" from the configuration, data, logs
> and notebooks at a high level directory.
>
> BTW, 1.2GB doesn't seem too large. But you could perhaps leave out the
> interpreters you don't want to use.

Re: Zeppelin and Docker Images

Posted by Mohit Jaggi <mo...@gmail.com>.
+1 to the idea of separating the "code" from the configuration, data, logs
and notebooks at a high level directory.

BTW, 1.2GB doesn't seem too large. But you could perhaps leave out the
interpreters you don't want to use.
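
A rough sketch of how leaving out interpreters could be scripted before
building the image. It assumes each interpreter lives in its own
subdirectory of a single interpreter directory (for example
/zeppelin/interpreter); check that against your own build. The function
names and the keep-list are hypothetical.

```python
import os
import shutil


def dir_size(path):
    """Total size in bytes of the regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


def unwanted_interpreters(interpreter_dir, keep):
    """Map each subdirectory not named in keep to its size in bytes."""
    result = {}
    for entry in sorted(os.listdir(interpreter_dir)):
        path = os.path.join(interpreter_dir, entry)
        if os.path.isdir(path) and entry not in keep:
            result[entry] = dir_size(path)
    return result


def prune(interpreter_dir, keep, dry_run=True):
    """Report the unwanted interpreters; delete them unless dry_run."""
    removed = unwanted_interpreters(interpreter_dir, keep)
    if not dry_run:
        for name in removed:
            shutil.rmtree(os.path.join(interpreter_dir, name))
    return removed


if __name__ == "__main__":
    # Assumed path and keep-list; dry_run only reports what would go.
    for name, size in prune("/zeppelin/interpreter", {"spark", "md"}).items():
        print(f"{name}: {size / 1e6:.1f} MB")
```

Running it with dry_run=True first shows how much space each unwanted
interpreter would free before anything is deleted.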
