Posted to dev@systemml.apache.org by Niketan Pansare <np...@us.ibm.com> on 2016/10/27 23:20:08 UTC

[DISCUSS] Adding tensorboard-like functionality to SystemML


Hi all,

To give everyone context, I am working on a new deep learning API for SystemML
that is backed by the NN library (
https://github.com/apache/incubator-systemml/tree/master/scripts/staging/SystemML-NN/nn
). This API allows users to express their model using the Caffe
specification and to perform fit/predict in the style of the scikit-learn APIs. I have
created a sample notebook explaining the usage of the API:
https://github.com/niketanpansare/incubator-systemml/blob/1b655ebeec6cdffd66b282eadc4810ecfd39e4f2/samples/jupyter-notebooks/Barista-API-Demo.ipynb
. This API also allows the user to load and store pre-trained models. See
https://github.com/niketanpansare/model_zoo/tree/master/caffe/vision/vgg/ilsvrc12

As part of this API, I added mini-TensorBoard-like functionality (see
steps 6 and 7) using matplotlib. If there is enough interest, we can extend
and standardize the visualization functionality across all our algorithms.
Here are some initial discussion points:
1. Primary visualization mechanism (Jupyter, a standalone app, or both: the
former is useful for cloud offerings such as DSX, and the latter gives the
design team more creative control)
2. What to plot for each algorithm (data scientists and algorithm
developers will help us here).
3. Standardize the UI (if we decide to go with Jupyter, we need to extend the
code in the _visualize method:
https://github.com/niketanpansare/incubator-systemml/blob/1b655ebeec6cdffd66b282eadc4810ecfd39e4f2/src/main/python/systemml/mllearn/estimators.py#L621
)
4. Primary APIs to target (python, scala, command-line or all)
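
To make the Jupyter option in points 1 and 3 concrete, here is a minimal sketch of what a standardized plotting helper could look like. It is not the code in the _visualize method linked above; the class name TrainingHistory and its record/plot methods are hypothetical, and the loss/accuracy numbers are made up for illustration.

```python
from collections import defaultdict


class TrainingHistory:
    """Hypothetical helper that collects per-iteration metrics during fit()."""

    def __init__(self):
        # metric name -> list of (iteration, value) pairs
        self.series = defaultdict(list)

    def record(self, iteration, **metrics):
        # Store each named metric for the given iteration.
        for name, value in metrics.items():
            self.series[name].append((iteration, float(value)))

    def plot(self):
        # Imported lazily so that recording works even without matplotlib.
        import matplotlib.pyplot as plt

        # One subplot per metric, laid out in a single row.
        fig, axes = plt.subplots(1, len(self.series), squeeze=False)
        for ax, (name, points) in zip(axes[0], sorted(self.series.items())):
            xs, ys = zip(*points)
            ax.plot(xs, ys)
            ax.set_title(name)
            ax.set_xlabel("iteration")
        fig.tight_layout()
        return fig


# Illustrative values only; a real estimator would call record() inside fit().
history = TrainingHistory()
for it, (loss, acc) in enumerate([(2.3, 0.11), (1.7, 0.42), (1.1, 0.63)]):
    history.record(it, loss=loss, accuracy=acc)
# In a notebook cell: history.plot()
```

Keeping the recording step separate from the plotting step would let the same collected metrics feed either a notebook or a standalone app, whichever mechanism is chosen.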

Thanks,

Niketan Pansare
IBM Almaden Research Center
E-mail: npansar At us.ibm.com
http://researcher.watson.ibm.com/researcher/view.php?person=us-npansar

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Jeremy Anderson <je...@objectadjective.com>.
Thanks Deron. Let's move forward with this. Several of us are interested in
initiating research in this area, so I'll reach out.

...........................

Jeremy Anderson

Github: https://github.com/objectadjective
Twitter: https://twitter.com/ObjectAdjective
LinkedIn: http://www.linkedin.com/in/objectadjective

On 1 November 2016 at 21:05, Madison Myers <ma...@gmail.com> wrote:

> +1 to all. Really believe that visualization is a problem area that needs
> to be improved. Let me know if I can help as well.

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Madison Myers <ma...@gmail.com>.
+1 to all. Really believe that visualization is a problem area that needs
to be improved. Let me know if I can help as well.

On Mon, Oct 31, 2016 at 1:05 PM, Deron Eriksson <de...@gmail.com>
wrote:

> Hi Jeremy,
>
> I think moving forward with visualization and design is a great idea,
> especially since I feel there is currently momentum after the great design
> refactoring of the project website. Mike and Jeremy, please let me know if
> there's any way in which I can help.
>
> Deron



-- 
*Madison J. Myers*
*UC Berkeley, Master of Information & Data Science '17*

*King's College London, MA Political Science '14*
*New York University, BA Political Science '12*

   -
      LinkedIn <http://linkedin.com/in/madisonjmyers>

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Deron Eriksson <de...@gmail.com>.
Hi Jeremy,

I think moving forward with visualization and design is a great idea,
especially since I feel there is currently momentum after the great design
refactoring of the project website. Mike and Jeremy, please let me know if
there's any way in which I can help.

Deron


On Fri, Oct 28, 2016 at 8:03 PM, Jeremy Anderson <jeremy@objectadjective.com> wrote:

> >
> > Visualization is a good topic to bring up for the project. I would like to
> > add another possible option of using TensorBoard directly. I have not
> > looked into the file format used for TensorBoard, but it may be possible to
> > simple adopt that format, and simply write our stats to that type of file.
> > That would allow us to reuse that project without having to write our own.
>
>
> Mike, I think this is a great place to start. I'd love to collaborate from
> a design perspective, with anyone  that wants to technical side.
>
> ...........................
>
> Jeremy Anderson
> Github: https://github.com/objectadjective
> Twitter: https://twitter.com/ObjectAdjective
> LinkedIN: http://www.linkedin.com/in/objectadjective

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Jeremy Anderson <je...@objectadjective.com>.
>
> Visualization is a good topic to bring up for the project. I would like to
> add another possible option of using TensorBoard directly. I have not
> looked into the file format used for TensorBoard, but it may be possible to
> simple adopt that format, and simply write our stats to that type of file.
> That would allow us to reuse that project without having to write our own.


Mike, I think this is a great place to start. I'd love to collaborate from
a design perspective with anyone who wants to take on the technical side.

...........................

Jeremy Anderson
Github: https://github.com/objectadjective
Twitter: https://twitter.com/ObjectAdjective
LinkedIN: http://www.linkedin.com/in/objectadjective

On 29 October 2016 at 02:46, <du...@gmail.com> wrote:

> Visualization is a good topic to bring up for the project. I would like to
> add another possible option of using TensorBoard directly. I have not
> looked into the file format used for TensorBoard, but it may be possible to
> simple adopt that format, and simply write our stats to that type of file.
> That would allow us to reuse that project without having to write our own.
>
> --
>
> Mike Dusenberry
> GitHub: github.com/dusenberrymw
> LinkedIn: linkedin.com/in/mikedusenberry
>
> Sent from my iPhone.

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by du...@gmail.com.
Visualization is a good topic to bring up for the project. I would like to add another possible option: using TensorBoard directly. I have not looked into the file format used for TensorBoard, but it may be possible to simply adopt that format and write our stats to that type of file. That would allow us to reuse that project without having to write our own.

--

Mike Dusenberry
GitHub: github.com/dusenberrymw
LinkedIn: linkedin.com/in/mikedusenberry

Sent from my iPhone.


> On Oct 28, 2016, at 8:13 AM, Niketan Pansare <np...@us.ibm.com> wrote:
> 
> Hi Matthias,
> 
> Thanks for your feedback.
> 
> There is a tradeoff between keeping a feature in-house until it is stable, v/s continually getting community feedback as the work is getting done via PR and discussions. I am for the latter as it encourages community feedback as well as participation.
> 
> I agree that our goal should be to complete the features you mentioned asap and yes, we are working hard towards making the GPU backend, the deep learning built-in functions and the algorithm wrappers (ones that are already added) to be 'non-experimental' in the 1.0 release :) ... Also, like you hinted, it is important to explicitly mark the experimental features in the documentation to avoid the 'bad impression'. The Python DSL will remain experimental until there is more interest from the community. I am fine with deleting the debugger since it is rarely used, if at all.
> 
> In keeping with the Apache guidelines, this discussion is meant to let the community decide whether SystemML should consider adding new visualization functionality (since this feature is user-facing). If there is no interest, we can either postpone or discard this discussion :)
> 
> Thanks,
> 
> Niketan.
> 
>> On Oct 28, 2016, at 1:24 AM, Matthias Boehm <mb...@googlemail.com> wrote:
>> 
>> Thanks for putting this together, Niketan. However, could we please
>> postpone this discussion until after our 1.0 release? Right now, I'm concerned
>> to see that we're adding many experimental features without really
>> getting them done. This includes, for example, the GPU backend, the new
>> MLContext API, the Python DSL, the deep learning builtin functions, the
>> Scala algorithm wrappers, the old Spark debugger interface, and
>> compressed linear algebra. I think we should finish these features first
>> before moving on. If we're not careful about that, it would quickly
>> create a very bad impression for new users.
>> 
>> Regards,
>> Matthias
>> 
> 

Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Niketan Pansare <np...@us.ibm.com>.
Hi Matthias,

Thanks for your feedback.

There is a tradeoff between keeping a feature in-house until it is stable and continually getting community feedback, via PRs and discussions, while the work is in progress. I am for the latter, as it encourages community feedback as well as participation.

I agree that our goal should be to complete the features you mentioned ASAP, and yes, we are working hard towards making the GPU backend, the deep learning built-in functions, and the algorithm wrappers (the ones that are already added) 'non-experimental' in the 1.0 release :) Also, as you hinted, it is important to explicitly mark the experimental features in the documentation to avoid the 'bad impression'. The Python DSL will remain experimental until there is more interest from the community. I am fine with deleting the debugger since it is rarely used, if at all.

In keeping with the Apache guidelines, this discussion is meant to let the community decide whether SystemML should consider adding new visualization functionality (since this feature is user-facing). If there is no interest, we can either postpone or drop this discussion :)

Thanks,

Niketan.

> On Oct 28, 2016, at 1:24 AM, Matthias Boehm <mb...@googlemail.com> wrote:
> 
> Thanks for putting this together, Niketan. However, could we please
> postpone this discussion until after our 1.0 release? Right now, I'm concerned
> to see that we're adding many experimental features without really
> getting them done. This includes, for example, the GPU backend, the new
> MLContext API, the Python DSL, the deep learning built-in functions, the
> Scala algorithm wrappers, the old Spark debugger interface, and
> compressed linear algebra. I think we should finish these features first
> before moving on. If we're not careful about that, it could quickly
> create a very bad impression for new users.
> 
> Regards,
> Matthias
> 
> 


Re: [DISCUSS] Adding tensorboard-like functionality to SystemML

Posted by Matthias Boehm <mb...@googlemail.com>.
Thanks for putting this together, Niketan. However, could we please
postpone this discussion until after our 1.0 release? Right now, I'm concerned
to see that we're adding many experimental features without really
getting them done. This includes, for example, the GPU backend, the new
MLContext API, the Python DSL, the deep learning built-in functions, the
Scala algorithm wrappers, the old Spark debugger interface, and
compressed linear algebra. I think we should finish these features first
before moving on. If we're not careful about that, it could quickly
create a very bad impression for new users.

Regards,
Matthias
