Posted to users@zeppelin.apache.org by Guilherme Silveira <gu...@gmail.com> on 2015/04/02 01:43:50 UTC

Future features

Hi folks,


I saw this project and it really excited me.
I am considering using it, but I have some philosophical questions...

For me, the benchmark is Databricks Cloud, but it only works on AWS and I
need to deploy on-premises.
It has all the features I need so far, and probably more.

Do you have plans to add the features below?

1- (MUST HAVE) Add a job scheduler -> the ability to schedule periodic jobs
based on a time interval (run it every night, for example)
2- Login authentication using LDAP
3- Add support for Python


Second part:

What is the relationship between this project and
https://twitter.com/SparkNotebook ?

Re: Future features

Posted by RJ Nowling <rn...@gmail.com>.
I stand corrected on the periodic scheduler. :)


> On Apr 1, 2015, at 6:51 PM, moon soo Lee <mo...@apache.org> wrote:
> 
> Hi,
> 
> Thanks for your interest.
> Zeppelin embeds a cron-like scheduler, so you can run your notebook periodically. Support for PySpark was also added recently.
> 
> About authentication, there is an old pull request (not merged) that does basic authentication: https://github.com/NFLabs/zeppelin/pull/168.
> But I think many users put their own reverse proxy in front of Zeppelin for authentication.
> 
> There is no special relationship with the SparkNotebook project yet. :-)
> 
> Best,
> moon

Re: Future features

Posted by moon soo Lee <mo...@apache.org>.
Hi,

Thanks for your interest.
Zeppelin embeds a cron-like scheduler, so you can run your notebook
periodically. Support for PySpark was also added recently.

About authentication, there is an old pull request (not merged) that does
basic authentication: https://github.com/NFLabs/zeppelin/pull/168.
But I think many users put their own reverse proxy in front of Zeppelin
for authentication.

There is no special relationship with the SparkNotebook project yet. :-)
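
As a rough illustration of what "run it every night" means for a scheduler, here is a minimal Python sketch that computes the next nightly run time. It is not Zeppelin's scheduler code or API; the function name `next_nightly_run` and the 02:00 default are assumptions made up for this example.

```python
from datetime import datetime, timedelta


def next_nightly_run(now: datetime, hour: int = 2, minute: int = 0) -> datetime:
    """Return the next time a once-a-day job should fire, strictly after `now`.

    Hypothetical helper for illustration; the 02:00 default is arbitrary.
    """
    # Today's scheduled slot, with seconds and microseconds cleared.
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    # If today's slot has already passed (or is exactly now), use tomorrow's.
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate
```

A scheduler loop would sleep until the returned time, run the notebook, and then call the function again for the following night.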

Best,
moon



Re: Future features

Posted by RJ Nowling <rn...@gmail.com>.
From what I've seen so far (as a user):

1. No periodic job support. 

Are you differentiating between exploratory/presentation work and production pipelines? In my group's work, we would just use cron.

2. No built-in authentication -- I created a JIRA for that.

There is a thread about it though. I've had some initial success with nginx as a reverse proxy for authentication. But we are running a separate Zeppelin instance under each user's account.

3. Zeppelin supports PySpark
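
On point 2, the reverse-proxy approach comes down to checking credentials before a request is forwarded to Zeppelin. Below is a minimal Python sketch of that check, assuming HTTP Basic authentication. It is not an nginx configuration and not part of Zeppelin; the `USERS` table and the `is_authorized` name are hypothetical, and a real deployment would look users up in LDAP or a password file instead.

```python
import base64
import hmac

# Hypothetical credentials for illustration only; never hard-code real ones.
USERS = {"alice": "s3cret"}


def is_authorized(authorization_header, users=USERS):
    """Check an HTTP `Authorization: Basic ...` header against known users."""
    if not authorization_header or not authorization_header.startswith("Basic "):
        return False
    try:
        # The payload after "Basic " is base64("username:password").
        payload = authorization_header[len("Basic "):]
        decoded = base64.b64decode(payload, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and invalid UTF-8 alike.
        return False
    username, sep, password = decoded.partition(":")
    if not sep or username not in users:
        return False
    # Constant-time comparison, so timing does not reveal password contents.
    return hmac.compare_digest(
        password.encode("utf-8"), users[username].encode("utf-8")
    )
```

A proxy would forward the request to the Zeppelin instance only when this returns True, and otherwise answer with 401 and a `WWW-Authenticate: Basic` header.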
