Posted to users@zeppelin.apache.org by moon soo Lee <mo...@apache.org> on 2016/10/16 01:14:18 UTC

Re: Zeppelin as a modeling platform

Hi Nirav,

Thanks for sharing your thoughts.
I think the idea of reusing notebooks makes sense.

One possible approach to reusing notebooks is to extend the current
z.run(PARAGRAPH_ID) [1], which works only for paragraphs in the same note,
to z.run(NOTE_ID) or z.run(PARAGRAPH_ID) that can run any note, or any
paragraph in another note.

For deploying notebooks in production, there are two approaches. One is to
improve the REST API for use from external applications; the other is to
enhance Zeppelin's job scheduler. I think both are valid approaches.

Best,
moon

[1]
https://github.com/apache/zeppelin/blob/branch-0.6/spark/src/main/java/org/apache/zeppelin/spark/ZeppelinContext.java#L288


On Tue, Sep 27, 2016 at 2:43 AM Nirav Patel <np...@xactlycorp.com> wrote:

> Hi,
>
> Currently I am using Apache Zeppelin alongside my Eclipse-based Scala
> project. Basically, I use my Scala project to produce the various
> intermediate files I need for analysis, and then use Zeppelin to create
> different visualizations on top of those files. However, I often find myself
> wanting to dig deeper into the models that I am using. For that, I think it's
> easier to do the modeling in Zeppelin as well, using Spark MLlib or any
> other imported library. Is this a proper use case for Zeppelin?
>
> If it is, then I think some enhancements should be added to
> notebooks, e.g. the ability to reuse a notebook (treat it as a class or
> package) so it can at least be imported into other notebooks. That way we can
> define common imports, variables, files, objects (filesystem, connection
> pool), etc.
>
> Another thing to consider is how to deploy such notebooks in production,
> e.g. how to parameterize a Zeppelin notebook and call it via REST or
> something similar.
>
> Thanks
> Nirav

Re: Zeppelin as a modeling platform

Posted by Nirav Patel <np...@xactlycorp.com>.
Hi Moon,

Regarding reusing notebooks, I noticed over the past week that if you
use a 'shared' Spark interpreter/context, then all your variables (at least
the Scala ones) are shared across multiple notes. I think the solution you
mention will give users better control, i.e. they can run the Spark
interpreter in any mode and still control what is shared across
notebooks.

However, I see some other minor problems. (These exist in any shell-based
programming environment, so they may not necessarily be problems; I mention
them just to share.)

   - Users have to be careful about declaring the same variable across
   notebooks. E.g. if I do `val abc = bla bla` in one notebook and
   `val abc = 123` in another, the last one executed will override the
   previous one.
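
To make the override concrete, here is a toy Python model of the behavior
described above. It is only an analogy, not Zeppelin code: `run_paragraph` and
the plain dictionaries standing in for interpreter state are hypothetical names
introduced for illustration, and the example assumes that a shared interpreter
behaves like one common variable namespace.

```python
# Toy model of shared vs. per-note interpreter state; not Zeppelin code.
def run_paragraph(code, namespace):
    """Execute one paragraph's source against the given variable namespace."""
    exec(code, namespace)

# Shared interpreter: both notes write into the same namespace.
shared = {}
run_paragraph("abc = 'bla bla'", shared)  # note A
run_paragraph("abc = 123", shared)        # note B, executed later
shared_result = shared["abc"]             # note A now sees note B's value

# Separate interpreters: each note gets its own namespace.
note_a, note_b = {}, {}
run_paragraph("abc = 'bla bla'", note_a)
run_paragraph("abc = 123", note_b)
isolated_result = (note_a["abc"], note_b["abc"])
```

With the shared namespace the second assignment wins, while with separate
namespaces each note keeps its own value of `abc`.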


Regarding the REST API, I strongly think that a REST API is a better
alternative than a scheduler. Here are a few reasons not to rely on a scheduler:

   1. Often your model is part of a complex pipeline, i.e. it has to
   be part of a complex workflow and has to rely on the state of certain
   external events.
   2. Almost every component in your pipeline requires *parameterization*
   or some kind of config (static or dynamic). With dynamic
   configuration it is easier to call the component (here, a Zeppelin
   notebook) with parameters.
   3. You can't rely on a standalone scheduler to get triggered at the right
   time unless you can make it configurable based on external events. Even
   then, I think it's not as reliable as calling via the REST API.
   4. With a REST API at hand, users can design their own scheduling however
   they want.
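
As a rough sketch of points 2 and 4, an external pipeline step could build a
parameterized call like the one below. It assumes the "run a paragraph
asynchronously" endpoint from Zeppelin's notebook REST API documentation
(POST /api/notebook/job/[noteId]/[paragraphId] with an optional "params"
object); please check that against the docs for your version. The helper
`build_run_request`, the host, the note and paragraph IDs, and the
`trainingDate` parameter are all made up for illustration.

```python
import json
from urllib.request import Request

def build_run_request(base_url, note_id, paragraph_id, params):
    """Build (but do not send) a POST request that runs one paragraph with parameters."""
    # Endpoint path is an assumption taken from the Zeppelin REST API docs.
    url = f"{base_url.rstrip('/')}/api/notebook/job/{note_id}/{paragraph_id}"
    body = json.dumps({"params": params}).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )

# Hypothetical server, IDs, and parameter name.
req = build_run_request(
    "http://localhost:8080",
    "2A94M5J1Z",
    "20161016-000000_1",
    {"trainingDate": "2016-10-16"},
)
# To actually trigger the run, the caller would pass `req` to
# urllib.request.urlopen from whatever scheduler or workflow engine it uses.
```

Because the request is built by the caller, the trigger can come from any
external event, and the parameters can be computed at call time.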

I just developed my first model with a notebook. I will have more thoughts
once I think more about how to deploy it, retrain it from time to time, or
even re-evaluate it.

Thanks,
Nirav



