Posted to dev@toree.apache.org by Luciano Resende <lu...@gmail.com> on 2018/09/25 16:04:33 UTC

Removing Toree specific support for Python and R

Toree currently has some support for Python and R, which is a thin
wrapper around what is provided in Spark. While this enables sharing
the Spark session and SQL Context between Scala and Python code [1],
it leaves a large functionality gap compared with dedicated kernels
such as IPython (for Python) and IRKernel (for R). In addition, no
community member is actively enhancing or contributing to these
two areas.

Based on this, I would like to suggest the following for the
Toree 0.3.0 release timeframe:
- Remove Python and R support from Toree
- Document suggested alternatives on the Toree website (e.g. IPython,
IRKernel, or others that are more active in their respective communities)

Some of the benefits:
- Avoid user confusion and frustration
- Improve kernel startup performance
- Cleaner code to maintain

Please let me know your thoughts.

[1] https://github.com/apache/incubator-toree/blob/master/etc/examples/notebooks/sqlcontext_sharing.ipynb

-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/

Re: Removing Toree specific support for Python and R

Posted by Chip Senkbeil <ch...@gmail.com>.
They served as experiments at a time when the functionality may have been
useful, but that time has long since passed.

+1

On Tue, Sep 25, 2018, 9:19 PM Gino Bustelo <lb...@gmail.com> wrote:

> +1
>
> Spark’s support for sharing temp tables across sessions removes any need for
> this multi-language support.
>
> Gino B.
>

Re: Removing Toree specific support for Python and R

Posted by Gino Bustelo <lb...@gmail.com>.
+1

Spark’s support for sharing temp tables across sessions removes any need for this multi-language support.

Gino B.

> On Sep 25, 2018, at 2:20 PM, Corey Stubbs <ca...@gmail.com> wrote:
> 
> +1
> 

Re: Removing Toree specific support for Python and R

Posted by Corey Stubbs <ca...@gmail.com>.
+1

On Tue, Sep 25, 2018, 13:10 Marius van Niekerk <ma...@gmail.com>
wrote:

> +1
>
> This would simplify usage and remove a large class of Python questions and
> confusion on our mailing list.  Several users assume that our R and Python
> feature set has parity with the primary kernels.
>

Re: Removing Toree specific support for Python and R

Posted by Marius van Niekerk <ma...@gmail.com>.
+1

This would simplify usage and remove a large class of Python questions and
confusion on our mailing list.  Several users assume that our R and Python
feature set has parity with the primary kernels.

On Tue, 25 Sep 2018 at 12:46 kbates4@gmail.com <kb...@gmail.com> wrote:

> +1
> The benefits that Luciano points out far outweigh the probably
> seldom-used context-sharing capabilities and their limited features, IMHO.
> This will eliminate an entire class of issues and questions while allowing
> contributors to focus solely on Scala functionality and
> improvements.
>
-- 
regards
Marius van Niekerk

Re: Removing Toree specific support for Python and R

Posted by kb...@gmail.com.
+1
The benefits that Luciano points out far outweigh the probably seldom-used context-sharing capabilities and their limited features, IMHO.  This will eliminate an entire class of issues and questions while allowing contributors to focus solely on Scala functionality and improvements.

Re: Removing Toree specific support for Python and R

Posted by Ryan Blue <rb...@netflix.com.INVALID>.
+1

After investigating the Python support, we concluded that it's better to
use a normal IPython kernel. I think that the Python Toree kernel would
take a lot of work to be on par with what already exists in Python.

-- 
Ryan Blue
Software Engineer
Netflix