Posted to dev@zeppelin.apache.org by "Goodman, Alexander (398K)" <al...@jpl.nasa.gov> on 2016/08/18 18:57:35 UTC

improving matplotlib integration in zeppelin

Hi all,

As per the previous discussion I had with Alex Bezzubov on the users mailing
list, I have created two new JIRA issues ([1] and [2]) explaining in more
detail what I think we should ultimately strive for in our ongoing work to
improve matplotlib integration in Zeppelin. For now I think I will be able
to handle the bulk of the work for the static images backend issue
[ZEPPELIN-1345] on my own, but more collaboration will be needed to get
interactive plotting to work. Please feel free to discuss any thoughts or
suggestions you may have here.

[1] - https://issues.apache.org/jira/browse/ZEPPELIN-1344
[2] - https://issues.apache.org/jira/browse/ZEPPELIN-1345

Thanks,
Alex
-- 
Alex Goodman
Data Scientist I
Science Data Modeling and Computing (398K)
Jet Propulsion Laboratory
California Institute of Technology
Tel: +1-818-354-6012

Re: improving matplotlib integration in zeppelin

Posted by moon soo Lee <mo...@apache.org>.
Hi,

It's great to see matplotlib integration being improved. Thanks a lot.

In my understanding, in interactive mode the graph is supposed to be updated
even if matplotlib methods are called in another paragraph (cell).
That means the result of one paragraph needs to be updated by running
another paragraph.

Currently, I think there are two different facilities in Zeppelin to do that.

One possible way is to use InterpreterContextRunner [1]. InterpreterContext
provides InterpreterContextRunner [2], which gives the ability to run other
paragraphs in the same note. However, this approach has some limitations.
If the paragraph (cell) is in another notebook, interactive updates of the
graph will no longer work. And because it does not only update the result of
the other paragraph but also reruns that paragraph, whether interactive mode
works correctly will depend on the user code in each paragraph, which makes
it difficult to get right.

The second possible approach is to use the AngularDisplay system, which
allows an interpreter to exchange data and events with the front-end.
So without rerunning another paragraph, it's possible to update the result
of one paragraph from another.
Any interpreter can get the AngularObjectRegistry [3] from InterpreterContext
[4], and AngularObjectRegistry allows it to create objects and add event
hooks to communicate with the front-end. I think this is the more feasible
approach.
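
To make the second approach a bit more concrete, here is a minimal Python
sketch of the general idea only. It is not Zeppelin's AngularObjectRegistry
API: ObjectRegistry and its watch/set/get methods are hypothetical names, and
the `rendered` list merely stands in for what the front-end would display.
The point is that one paragraph can change a shared object and the display
bound to it updates without the first paragraph being rerun.

```python
from collections import defaultdict
from typing import Any, Callable, Dict, List

Watcher = Callable[[Any, Any], None]  # called with (old_value, new_value)


class ObjectRegistry:
    """Toy shared-object registry (hypothetical, not Zeppelin's API)."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._watchers: Dict[str, List[Watcher]] = defaultdict(list)

    def watch(self, name: str, callback: Watcher) -> None:
        # Register a callback to run whenever the named object changes.
        self._watchers[name].append(callback)

    def set(self, name: str, value: Any) -> None:
        # Store the new value, then notify every watcher of that name.
        old = self._values.get(name)
        self._values[name] = value
        for callback in self._watchers[name]:
            callback(old, value)

    def get(self, name: str) -> Any:
        return self._values[name]


registry = ObjectRegistry()
rendered: List[str] = []  # stand-in for the front-end display

# "Paragraph 1" binds its output to the object named "figure".
registry.watch("figure", lambda old, new: rendered.append(f"<img src='{new}'>"))
registry.set("figure", "plot-v1.png")

# "Paragraph 2" changes the figure; paragraph 1's output is refreshed
# through the watcher, and paragraph 1 itself is never executed again.
registry.set("figure", "plot-v2.png")
```

With the InterpreterContextRunner approach, by contrast, the equivalent of
"paragraph 1" would be executed again in full, including any side effects in
the user's code.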

Thanks,
moon

[1]
https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContextRunner.java
[2]
https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java#L123
[3]
https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/display/AngularObjectRegistry.java
[4]
https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java#L115


On Thu, Aug 18, 2016 at 11:57 AM Goodman, Alexander (398K) <
alexander.goodman@jpl.nasa.gov> wrote:

> Hi all,
>
> As per previous discussion I had with Alex Bezzubov on the users mailing
> list, I have created two new JIRA issues ([1] and [2]) explaining in more
> detail what I think we should ultimately strive for in our ongoing work to
> improve matplotlib integration in zeppelin. For now I think I will be able
> to handle the bulk of the work for the static images backend issue
> [ZEPPELIN-1345] on my own, but more collaboration will be needed to get
> interactive plotting to work.  Please feel free to discuss any thoughts or
> suggestions you may have here.
>
> [1] - https://issues.apache.org/jira/browse/ZEPPELIN-1344
> [2] - https://issues.apache.org/jira/browse/ZEPPELIN-1345
>
> Thanks,
> Alex
> --
> Alex Goodman
> Data Scientist I
> Science Data Modeling and Computing (398K)
> Jet Propulsion Laboratory
> California Institute of Technology
> Tel: +1-818-354-6012
>

Re: improving matplotlib integration in zeppelin

Posted by Alexander Bezzubov <bz...@apache.org>.
Right, thank you Moon for explaining both approaches in detail.

I will be happy to implement AngularDisplay support for the Python
interpreter later on, to enable it to interact with AngularObjects the way
SparkInterpreter does. I have created ZEPPELIN-1361
<https://issues.apache.org/jira/browse/ZEPPELIN-1361> to track the progress.

This mechanism could then be used in a more involved, interactive Matplotlib
backend implementation.

--
Alex

On Sat, Aug 20, 2016, 13:23 Goodman, Alexander (398K) <
alexander.goodman@jpl.nasa.gov> wrote:

> Hi Moon,
>
> Thank you for the informative response. You are right, this is all in fact
> explicitly stated in the most recent set of matplotlib release notes[1].
> This won't really apply to the static inline plotting backend that I will
> be tackling first which will mostly be pure python, but it will probably be
> good to keep this in mind when we begin work on the interactive plotting.
>
> [1] - http://matplotlib.org/users/whats_new.html#interactive-oo-usage
>
> Thanks,
> Alex

Re: improving matplotlib integration in zeppelin

Posted by "Goodman, Alexander (398K)" <al...@jpl.nasa.gov>.
Hi all,

I have recently finished implementing an inline plotting backend (for
static images, see ZEPPELIN-1345) on my local machine as well as on a Spark
cluster. I will soon make a PR, but before that I have a few questions on
which I would like some insight from more experienced developers on this
project:

1) What would be the best way to package the backend with the project?
Currently it exists as a lone Python source file, and using it requires
that it can be imported through the user's local PYTHONPATH. Additionally,
I think it should be packaged separately from the interpreters, since it
can be used by both the pyspark and python interpreters.

2) What is the best way to implement pre/post-execute callbacks?
Specifically in this case it would be useful to have code that the
interpreter executes only after the paragraph has finished running its
user-entered code, as then it would be possible to display images after the
last executed plotting command rather than waiting for explicit calls to
show(). For example, in Jupyter, plt.plot(x) will display the plot even
without calling plt.show(). That's precisely what I am aiming for here.
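
To illustrate what I mean in 2), here is a rough, self-contained Python
sketch of a post-execute hook. It uses neither Zeppelin nor matplotlib:
run_paragraph, post_execute_hooks, plot and flush_figures are all
hypothetical names, and the "figure" is just a placeholder line of text.
It is similar in spirit to IPython's post_execute event, but it is only a
sketch under those assumptions, not a proposal for the actual interpreter
code.

```python
import io
from contextlib import redirect_stdout
from typing import Callable, Dict, List

# Callbacks the (hypothetical) interpreter runs after each paragraph.
post_execute_hooks: List[Callable[[], None]] = []

# Figures created by user code but not yet displayed.
pending_figures: List[list] = []


def plot(data) -> None:
    """Stand-in for a plotting command: it only records the figure."""
    pending_figures.append(list(data))


def flush_figures() -> None:
    """Display every pending figure, then forget it."""
    while pending_figures:
        print(f"[figure: {pending_figures.pop(0)}]")


def run_paragraph(source: str, namespace: Dict[str, object]) -> str:
    """Run the user code, then the post-execute hooks; return the output."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        exec(source, namespace)
        for hook in post_execute_hooks:
            hook()
    return buffer.getvalue()


post_execute_hooks.append(flush_figures)

# The user never calls show(); the hook displays the figure anyway.
output = run_paragraph("plot([1, 2, 3])", {"plot": plot})
```

Here `output` contains the placeholder for the figure even though the
paragraph's code only called plot(), which is the behaviour I would like
plt.plot(x) to have.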

Thanks,
Alex

On Tue, Aug 23, 2016 at 7:28 PM, Alexander Bezzubov <bz...@apache.org> wrote:

> Right, thank you Moon for explaining both approaches in detail.
>
> I will be happy to implement AngularDisplay support for Python interpreter
> later on, to enable it to interact with AngularObjects the way
> SparkInterpreter does. I have created ZEPPELIN-1361
> <https://issues.apache.org/jira/browse/ZEPPELIN-1361> to track the
> progress.
>
> This mechanism could then be used in more involved, interactive Matplotlib
> backend implementation.
>
> --
> Alex


Re: improving matplotlib integration in zeppelin

Posted by "Goodman, Alexander (398K)" <al...@jpl.nasa.gov>.
Hi Moon,

Thank you for the informative response. You are right, this is all in fact
explicitly stated in the most recent set of matplotlib release notes [1].
This won't really apply to the static inline plotting backend that I will
be tackling first, which will mostly be pure Python, but it will probably be
good to keep this in mind when we begin work on the interactive plotting.

[1] - http://matplotlib.org/users/whats_new.html#interactive-oo-usage

Thanks,
Alex

On Thu, Aug 18, 2016 at 11:13 PM, moon soo Lee <mo...@apache.org> wrote:

> Hi,
>
> It's great to see improving matplotlib integration. Thanks a lot.
>
> In my understanding, in interactive mode, the graph supposed to be updated
> even if some matplotlib methods are called in the other paragraph(cell).
> That means the result of a paragraph need to be updated by running another
> paragraph.
>
> Currently, i think there're two different facilities in Zeppelin to do
> that.
>
> One possible way is using InterpreterContextRunner [1]. InterpreterContext
> provides InterpreterContextRunner[2] and it gives ability to run other
> paragraphs in the same note. However this approach does have some
> limitations. Like if paragraph (cell) is in the other notebook, interactive
> update of graph will not work anymore. And because it's not only update the
> result of the other paragraph, but also run the other paragraph, it'll be
> difficult to make interactive mode work correctly depends on user code in
> each paragraphs.
>
> Second possible approach is using AngularDisplay system.
> Which allows interpreter send/receive some data and event from/to front-end
> side.
> So without rerun another paragraph, it's possible to update result of a
> paragraph from another.
> Any interpreter can get AngularObjectRegistry[3] from InterpreterContext
> [4], and AngularObjectRegistry allows create object / add event hook to
> communicate with front-end. I think this is more feasible approach.
>
> Thanks,
> moon
>
> [1]
> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContextRunner.java
> [2]
> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java#L123
> [3]
> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/display/AngularObjectRegistry.java
> [4]
> https://github.com/apache/zeppelin/blob/master/zeppelin-interpreter/src/main/java/org/apache/zeppelin/interpreter/InterpreterContext.java#L115
