Posted to dev@bigtop.apache.org by RJ Nowling <rn...@gmail.com> on 2015/03/20 13:49:15 UTC

Spark Notebooks

Hi all,

I went to Spark Summit East 2015.  Many of the talks emphasized notebooks
and visualization products for Spark.  My Red Hat team found a few open
source alternatives, which I wanted to share for others who might be
interested:

* Apache Zeppelin (incubating) - https://zeppelin.incubator.apache.org/
* Jove Jupyter - https://github.com/jove-sh/jove-jupyter-frontend
* Spark Notebook - https://github.com/andypetrella/spark-notebook

We're currently trying to get Apache Zeppelin working.  I'll report more as
I learn more.

RJ

Re: Spark Notebooks

Posted by jay vyas <ja...@gmail.com>.
This is an interesting new dimension; it will basically allow Bigtop to
provide a vehicle for end-to-end Spark consumption.  Wow.

On Fri, Mar 20, 2015 at 6:03 PM, Kelly, Jonathan <jo...@amazon.com>
wrote:

> I started working a bit on adding support for both spark-jobserver and
> Zeppelin, but it was more just on-the-side work while waiting for other
> things to compile.  I ran into some issues with each project though and
> haven't had the bandwidth to investigate them further quite yet.  For
> spark-jobserver, I did actually get it working with Bigtop (RPM
> only--haven't done the DEB config), but I was getting test failures, so I
> had to disable them temporarily for the Bigtop build.  For Zeppelin, I was
> getting some build time failure that I wasn't sure about.
>
> ~ Jonathan


-- 
jay vyas

Re: Spark Notebooks

Posted by RJ Nowling <rn...@gmail.com>.
Hi Jonathan, 

If I can help with the Zeppelin packaging or testing work, please let me know.

Have you created a JIRA?

RJ


> On Mar 20, 2015, at 6:03 PM, Kelly, Jonathan <jo...@amazon.com> wrote:
> 
> I started working a bit on adding support for both spark-jobserver and
> Zeppelin, but it was more just on-the-side work while waiting for other
> things to compile.  I ran into some issues with each project though and
> haven't had the bandwidth to investigate them further quite yet.  For
> spark-jobserver, I did actually get it working with Bigtop (RPM
> only--haven't done the DEB config), but I was getting test failures, so I
> had to disable them temporarily for the Bigtop build.  For Zeppelin, I was
> getting some build time failure that I wasn't sure about.
> 
> ~ Jonathan

Re: Spark Notebooks

Posted by "Kelly, Jonathan" <jo...@amazon.com>.
I started working a bit on adding support for both spark-jobserver and
Zeppelin, but it was more just on-the-side work while waiting for other
things to compile.  I ran into some issues with each project though and
haven't had the bandwidth to investigate them further quite yet.  For
spark-jobserver, I did actually get it working with Bigtop (RPM
only--haven't done the DEB config), but I was getting test failures, so I
had to disable them temporarily for the Bigtop build.  For Zeppelin, I was
getting some build time failure that I wasn't sure about.

~ Jonathan




On 3/20/15, 2:50 PM, "Roman Shaposhnik" <ro...@shaposhnik.org> wrote:

>I would love to see Zeppelin in Bigtop but personally
>I'm extremely short on cycles (you'll see at ApacheCON
>what keeps me busy).
>
>Thanks,
>Roman.


Re: Spark Notebooks

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
I would love to see Zeppelin in Bigtop but personally
I'm extremely short on cycles (you'll see at ApacheCon
what keeps me busy).

Thanks,
Roman.

On Fri, Mar 20, 2015 at 2:46 PM, Konstantin Boudnik <co...@apache.org> wrote:
> Yup! It'd be great if we can start trying to get Zeppelin on board. They have
> just sent in their SGA, so I believe the first release might be coming soon
> too ;) Although, for the experiments we can just probably work off their
> github bits.

Re: Spark Notebooks

Posted by Konstantin Boudnik <co...@apache.org>.
Yup! It'd be great if we can start trying to get Zeppelin on board. They have
just sent in their SGA, so I believe the first release might be coming soon
too ;) For the experiments, though, we can probably just work off their
GitHub bits.

On Fri, Mar 20, 2015 at 04:34PM, RJ Nowling wrote:
> I didn't realize that you and Roman were involved with Zeppelin until I saw
> your emails on the Zeppelin dev list today.
> 
> I see a lot of big data community interest in more user-friendly interfaces
> and visualization software -- following what the Python community has done
> with iPython notebook and friends.  Unfortunately, many of the popular
> solutions are proprietary.  I would love to push on open source efforts and
> open that area up.
> 
> Zeppelin could make a great addition to BigTop.  It would provide BigTop
> with a competitive feature while also raising the profile of Zeppelin.
> 
> Zeppelin also needs more testing and cleaning up.  I've had trouble getting
> it working (see the Zeppelin user mailing list).  Good opportunity to get
> in there and start hacking on the code.
> 
> From Red Hat's perspective, packaging Zeppelin will require some work.
> Fedora, et al. require that each JAR is provided by a single RPM.
> Likewise, we'd have to work on splitting out the Node.js dependencies so
> they can depend on Fedora-available JARs.  I realize that BigTop's
> packaging isn't so strict, but I think packaging in BigTop is a good first
> step, and we can work with the Zeppelin community to work out any kinks in
> their build system related to packaging.
> 
> I see a win-win situation.  I also see an opportunity for BigTop to once
> again prove that it's on the cutting edge.

Re: Spark Notebooks

Posted by RJ Nowling <rn...@gmail.com>.
I didn't realize that you and Roman were involved with Zeppelin until I saw
your emails on the Zeppelin dev list today.

I see a lot of big data community interest in more user-friendly interfaces
and visualization software -- following what the Python community has done
with IPython Notebook and friends.  Unfortunately, many of the popular
solutions are proprietary.  I would love to push on open source efforts and
open that area up.

Zeppelin could make a great addition to Bigtop.  It would provide Bigtop
with a competitive feature while also raising the profile of Zeppelin.

Zeppelin also needs more testing and cleaning up.  I've had trouble getting
it working (see the Zeppelin user mailing list).  Good opportunity to get
in there and start hacking on the code.

From Red Hat's perspective, packaging Zeppelin will require some work.
Fedora, et al. require that each JAR is provided by a single RPM.
Likewise, we'd have to work on splitting out the Node.js dependencies so
they can depend on Fedora-available JARs.  I realize that Bigtop's
packaging isn't so strict, but I think packaging in BigTop is a good first
step, and we can work with the Zeppelin community to work out any kinks in
their build system related to packaging.
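
As a rough illustration of the "each JAR is provided by a single RPM" rule,
here is a minimal Python sketch.  It assumes the rule can be read as "no JAR
file may be shipped by more than one package".  The function name and the
package contents are made up for the example; this is not Fedora or Bigtop
tooling.

```python
from collections import defaultdict


def find_duplicate_jars(packages):
    """Return {jar: sorted package names} for JARs shipped by more than one package."""
    providers = defaultdict(set)
    for package, jars in packages.items():
        for jar in jars:
            providers[jar].add(package)
    return {jar: sorted(pkgs) for jar, pkgs in providers.items() if len(pkgs) > 1}


# Hypothetical package contents, invented purely for illustration.
example_packages = {
    "zeppelin": ["zeppelin-server.jar", "commons-lang.jar"],
    "apache-commons-lang": ["commons-lang.jar"],
    "spark-core": ["spark-core.jar"],
}

duplicates = find_duplicate_jars(example_packages)
print(duplicates)
```

Any JAR reported here would presumably have to be dropped from the bundling
package and pulled in as a dependency on the package that owns it.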

I see a win-win situation.  I also see an opportunity for Bigtop to once
again prove that it's on the cutting edge.


On Fri, Mar 20, 2015 at 11:53 AM, Konstantin Boudnik <co...@apache.org> wrote:

> Thanks RJ! Actually, Zeppelin is quite interesting. If you look at their
> proposal here https://wiki.apache.org/incubator/ZeppelinProposal
> it says under Initial Goals
>
> ".... adding Zeppelin distribution to Apache Bigtop"
>
> So, I guess it's mutual interest! Would be nice to hear your thoughts on
> that!
> Thanks,
>   Cos

Re: Spark Notebooks

Posted by Konstantin Boudnik <co...@apache.org>.
Thanks RJ! Actually, Zeppelin is quite interesting. If you look at their
proposal here https://wiki.apache.org/incubator/ZeppelinProposal
it says under Initial Goals

".... adding Zeppelin distribution to Apache Bigtop"

So, I guess it's mutual interest! Would be nice to hear your thoughts on that!
Thanks,
  Cos

On Fri, Mar 20, 2015 at 07:49AM, RJ Nowling wrote:
> Hi all,
> 
> I went to Spark Summit East 2015.  Many of the talks emphasized notebooks
> and visualization products for Spark.  My Red Hat team found a few open
> source alternatives, which I wanted to share for others who might be
> interested:
> 
> * Apache Zeppelin (incubating) - https://zeppelin.incubator.apache.org/
> * Jove Jupyter - https://github.com/jove-sh/jove-jupyter-frontend
> * Spark Notebook - https://github.com/andypetrella/spark-notebook
> 
> We're currently trying to get Apache Zeppelin working.  I'll report more as
> I learn more.
> 
> RJ