Posted to users@zeppelin.apache.org by moon soo Lee <mo...@apache.org> on 2016/01/01 03:17:04 UTC

Re: How to support multiple notebooks in Zeppelin

Hi Dafeng,

Actually, the reason for binding a notebook to one specific instance is
that notebooks cached in memory are not synchronized across Zeppelin
instances, even if they share a single notebook storage.
That's why I thought a request for a specific notebook should go to one
specific Zeppelin instance.
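
A minimal sketch of that sticky-routing idea, assuming a load balancer that can see the notebook id and hashes it to choose a backend. The instance names, the regular expression, and the helper functions are hypothetical and only illustrate the mapping; they are not part of Zeppelin. Note that browsers do not send the part of a URL after '#' to the server, so a real load balancer would have to key on something it can actually see (for example a notebook id in a REST path or a cookie); that choice is an assumption here, not something settled in this thread.

```python
import hashlib
import re

# Hypothetical backend list; these host names are placeholders, not from the thread.
ZEPPELIN_INSTANCES = ["zeppelin-1:8080", "zeppelin-2:8080", "zeppelin-3:8080"]

# Matches the id that follows "/notebook/" in a URL or path (assumed id format).
NOTEBOOK_ID_PATTERN = re.compile(r"/notebook/([A-Za-z0-9]+)")


def extract_notebook_id(url):
    """Return the notebook id found in the URL, or None if there is none."""
    match = NOTEBOOK_ID_PATTERN.search(url)
    return match.group(1) if match else None


def pick_instance(notebook_id, instances=ZEPPELIN_INSTANCES):
    """Map a notebook id to one instance using a stable hash of the id."""
    digest = hashlib.sha256(notebook_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(instances)
    return instances[index]


def route(url, instances=ZEPPELIN_INSTANCES):
    """Choose a backend for a request; requests without an id use the first instance."""
    notebook_id = extract_notebook_id(url)
    if notebook_id is None:
        return instances[0]
    return pick_instance(notebook_id, instances)


if __name__ == "__main__":
    print(route("http://localhost:8080/#/notebook/2BA6X4HHM"))
```

Because the hash depends only on the notebook id, every request for the same notebook lands on the same instance, so only that instance holds the notebook in its in-memory cache. Changing the number of instances remaps most ids under this simple modulo scheme; consistent hashing would be one way to reduce that, at the cost of more code.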

Happy new year!

Best,
moon


On Wed, Dec 30, 2015 at 7:19 PM Dafeng Wang <da...@microsoft.com> wrote:

> Hi Moon,
>
>
>
> Thanks again for the answer. Your idea looks great!
>
> The purpose of binding a notebook to one specific instance is just to save
> loading time, right? I’ll give this solution a try, and then I will
> probably bring more questions, such as how to map one ZeppelinServer
> to different clusters, and so on :)
>
>
>
> And Happy new year!
>
>
>
> Regards,
>
> Dafeng
>
>
>
> *From:* moon soo Lee [mailto:moon@apache.org]
> *Sent:* Thursday, December 31, 2015 9:57 AM
>
>
> *To:* users@zeppelin.incubator.apache.org
> *Subject:* Re: How to support multiple notebooks in Zeppelin
>
>
>
> Hi Dafeng,
>
>
>
> Right, all information is cached in memory once loaded. To serve a large
> number of notebooks, we'll need to modify Zeppelin a bit so that it does
> not keep them in memory.
>
>
>
> For #2, multiple ZeppelinServer instances can be configured to use a shared
> notebook storage, and a load balancer in front of them can distribute the
> REST API and websocket traffic.
>
> If the load balancer routes the traffic based on the notebook id in the path
> (e.g. '2BA6X4HHM' from path 'http://localhost:8080/#/notebook/2BA6X4HHM
> <http://localhost:9000/#/notebook/2BA6X4HHM>'),
> and lets the same notebook id stick to the same Zeppelin instance,
> ZeppelinServer will be able to scale out for many users.
>
>
>
> Will this fit your use case? Let me know if you have a different idea.
>
>
>
> Thanks,
>
> moon
>
>
>
> On Wed, Dec 30, 2015 at 5:37 PM Dafeng Wang <da...@microsoft.com> wrote:
>
> Hi Moon,
>
>
>
> Thanks for your quick response.
>
> 1. For the first question, I got your answer. That means all meta info,
> paragraphs, and results of one notebook will be cached in memory once
> loaded, right?
>
> 2. For #2, my real question is how to serve millions of users
> with Zeppelin. Per my current understanding, REST API and websocket
> connection handling are coupled together within one server. If that’s
> true, then my original question becomes “how to scale out the server
> that serves the REST API and websocket connections”. If not, what are the
> differences in scaling them when I want to support more users?
>
>
>
> Regards,
>
> Dafeng
>
>
>
> *From:* moon soo Lee [mailto:moon@apache.org]
> *Sent:* Thursday, December 31, 2015 12:11 AM
> *To:* users@zeppelin.incubator.apache.org
> *Subject:* Re: How to support multiple notebooks in Zeppelin
>
>
>
> Hi Dafeng,
>
>
>
> Zeppelin at the moment keeps every notebook in memory once it's been
> loaded. So the number of notebooks supported by an instance will be limited
> by the memory on the system.
>
>
>
> By scaling out Zeppelin-Server, do you mean scaling out the server that
> serves the REST API and websocket connections?
>
>
>
> Thanks,
>
> moon
>
> On Wed, Dec 30, 2015 at 1:57 AM Dafeng Wang <da...@microsoft.com> wrote:
>
> Hi All,
>
>
>
> I tried Zeppelin today, and it works perfectly in stand-alone mode.
> My questions now are:
>
> 1. Capacity limitation of Zeppelin-Server; in other words, how many
> notebook instances can one server support?
>
> 2. If we want to scale out Zeppelin-Server, is that possible? If
> so, then how? Let’s say the environment is YARN + Spark.
>
>
>
> Regards,
>
> Dafeng
>
>