Posted to user@spark.apache.org by Jerry Lam <ch...@gmail.com> on 2015/08/18 15:35:31 UTC

Spark + Jupyter (IPython Notebook)

Hi spark users and developers,

Has anyone deployed IPython Notebook (Jupyter) in production with Spark as
the computational engine?

I know Databricks Cloud provides similar features with deeper Spark
integration. However, Databricks Cloud has to be hosted by Databricks, so
that is not an option for us.

Other solutions (e.g. Zeppelin) seem to reinvent the wheel that IPython
already offered years ago. It would be great if someone could educate me
on the reasoning behind this.

Best Regards,

Jerry

Re: Spark + Jupyter (IPython Notebook)

Posted by Jerry Lam <ch...@gmail.com>.
Hi Prabeesh,

That's even better!

Thanks for sharing

Jerry



Re: Spark + Jupyter (IPython Notebook)

Posted by "Prabeesh K." <pr...@gmail.com>.
Refer to this post
http://blog.prabeeshk.com/blog/2015/06/19/pyspark-notebook-with-docker/

Spark + Jupyter + Docker
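
If the image is published on Docker Hub, running it typically comes down to a single command. The image name and port below are assumptions based on the post's author; check the linked post for the exact image, tag, and options:

```shell
# Pull and run a PySpark + Jupyter notebook image in the background,
# publishing the notebook server's port to the host.
# The image name is an assumption; see the linked post for the real one.
docker run -d -p 8888:8888 prabeesh/pyspark-notebook

# Then open http://localhost:8888 in a browser.
```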


Re: Spark + Jupyter (IPython Notebook)

Posted by andy petrella <an...@gmail.com>.
Hey,

Actually, for Scala, I'd recommend using
https://github.com/andypetrella/spark-notebook/

It's deployed at several places, including *Alibaba*, *EBI* and *Cray*, and
is supported by both the Scala community and the company Data Fellas.
For instance, it was part of the Big Scala Pipeline training given on
16th August at Galvanize in San Francisco in collaboration with *Datastax,
Mesosphere, Databricks, Confluent and Typesafe*:
http://scala.bythebay.io/pipeline.html. It was a successful training day
with 100+ attendees.

Also, it's the only fully reactive one, and it includes a reactive plotting
library in Scala that lets you plot a moving average computed in a DStream,
a D3 graph layout updated dynamically, or even a dynamic map of received
tweets with geolocation set. Of course, you can also plot lines, pies,
bars, histograms and box plots for any kind of data, be it DataFrames, SQL
results, Seq, List, Map, or collections of tuples or classes.

Check out http://spark-notebook.io/ for your specific distro.
Note that you can also use it directly on DCOS.

For any questions, I'll be glad to help you in the ~200-strong Gitter
chatroom: https://gitter.im/andypetrella/spark-notebook

cheers and have fun :-)
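
For the impatient, getting started from a pre-built distribution is usually just a matter of unzipping and launching it. The directory name below is illustrative; pick the build matching your Spark/Hadoop/Scala versions from the site above:

```shell
# Assumes you have downloaded and unzipped a pre-built distribution
# from http://spark-notebook.io/; the directory name is illustrative.
cd spark-notebook-0.6.0-scala-2.10.4-spark-1.4.1-hadoop-2.6.0

# Start the notebook server (a Play application, so it typically
# serves on http://localhost:9000).
./bin/spark-notebook
```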


--
andy

Re: Spark + Jupyter (IPython Notebook)

Posted by Guru Medasani <gd...@gmail.com>.
For Python it is really great.

There is some work in progress in bringing Scala support to Jupyter as well.

https://github.com/hohonuuli/sparknotebook

https://github.com/alexarchambault/jupyter-scala


Guru Medasani
gdmeda@gmail.com





Re: Spark + Jupyter (IPython Notebook)

Posted by Jerry Lam <ch...@gmail.com>.
Hi Guru,

Thanks! Great to hear that someone tried it in production. How do you like
it so far?

Best Regards,

Jerry



Re: Spark + Jupyter (IPython Notebook)

Posted by Guru Medasani <gd...@gmail.com>.
Hi Jerry,

Yes. I’ve seen customers using this in production for data science work. I’m currently using this for one of my projects on a cluster as well. 

Also, here is a blog that describes how to configure this. 

http://blog.cloudera.com/blog/2014/08/how-to-use-ipython-notebook-with-apache-spark/
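
The short version of that setup, for Spark 1.x, is to have the pyspark launcher start an IPython Notebook server as the driver. The snippet below is only a sketch; the paths, port, and master URL are illustrative, and the post covers the cluster-specific details:

```shell
# Tell the pyspark launcher to use IPython as the driver Python and
# to start it in notebook mode. (Values are illustrative; adjust the
# port, options, and Spark path for your environment.)
export PYSPARK_DRIVER_PYTHON=ipython
export PYSPARK_DRIVER_PYTHON_OPTS="notebook --no-browser --port=8880"

# Launch pyspark as usual; `sc` is then available in every notebook.
/path/to/spark/bin/pyspark --master yarn-client
```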


Guru Medasani
gdmeda@gmail.com


