Posted to dev@mxnet.apache.org by Naveen Swamy <mn...@gmail.com> on 2019/03/05 18:13:22 UTC

Re: Gluon fit API- Design proposal

FYI, I have created a branch on the repo to facilitate multiple
collaborators for this feature:
https://github.com/apache/incubator-mxnet/tree/fit-api
Contributors will create PRs to this branch, and once the API is feature
complete, I will rebase and merge to master to preserve commit history.

On Sun, Feb 10, 2019 at 2:43 PM Hagay Lupesko <lu...@gmail.com> wrote:

> Wanted to chime in as well.
> I have reviewed the design shared in the mail offline with Ankit, Lai and
> Naveen (we work on the same team at Amazon).
>
> I think it does a good job at simplifying many low-complexity training use
> cases, which can make MXNet/Gluon even more friendly to so-called "deep
> learning beginners" - so +1 on the proposal!
>
> Hagay
>
> On Fri, Feb 8, 2019 at 10:30 AM Naveen Swamy <mn...@gmail.com> wrote:
>
> > Hi Alfredo,
> > Thanks for your comments, I really like all your suggestions. Here are my
> > answers; let me know if they make sense or if you have comments.
> >
> > 1) The fit API targets novice users and covers about 80% of the use
> > cases listed in the document. For advanced users and complex models, we
> > (Naveen, Ankit and Lai) felt it's best to use the existing mechanisms,
> > given their imperative nature and the greater control they give, so we did
> > not duplicate the save/load functionality in the Hybrid block.
> > We’ll consider this and extend the functionality to Estimator.
> > I have had trouble using the pickle package, which is commonly used for
> > serialization and deserialization; if you have any other suggestions from
> > your experience, please let us know.
> >
> > 2) +1, we’ll add this to our backlog and add it in our next iteration.
> >
> > 3) Can you expand a little more on this, and on how it helps in a
> > production environment (which this API was not targeted at)?
> > I’ll check the TF Estimator to understand further.
> >
> > Thanks, Naveen
> >
> >
> > On Thu, Feb 7, 2019 at 2:32 PM Alfredo Luque
> > <al...@airbnb.com.invalid> wrote:
> >
> > > This is great and something we should all be able to benefit from.
> > >
> > > There are just three pieces I’d like to advocate for that I feel are
> > > shortcomings of some competing APIs on other frameworks (e.g., TF
> > > Estimators) and that I would love to see in this proposal:
> > >
> > > 1) Make serialization/deserialization of these classifiers/regressors
> > > easy, or at least ensure the internal members of the wrapper are easy to
> > > save/load. We’ve hacked around this by only allowing hybrid blocks, which
> > > have easy save/load functionality, but having a simple
> > > “save_model”/“load_model” function as a first-class citizen of these
> > > proposed APIs will lead to a vastly improved user experience down the
> > > road.
> > >
> > > 2) Allowing the fit/predict/predict_proba functions to take in both data
> > > loaders and simple numpy arrays and pandas dataframes is a simple change
> > > but a huge usability improvement. Power users and library authors will
> > > appreciate being able to use custom data loaders, but a large portion of
> > > end users want to just pass an ndarray or data frame and get some results
> > > quickly.
> > >
> > > 3) Allow lazy construction of the model. This is something I feel TF
> > > Estimators do well: by allowing the user to pass a function that
> > > constructs the net (i.e., a model_fn that returns the net) rather than the
> > > net itself, it allows for more control at runtime and usage of these APIs
> > > in a production environment.
> > >
> > > Would love your thoughts on these three changes/additions.
> > >
> > > —Alfredo Luque
> > > Software Engineer
> > > Machine Learning Infrastructure
> > > Airbnb
> > > San Francisco, CA
> > >
> > > On February 7, 2019 at 1:51:17 PM, Ankit Khedia (khedia.ankit@gmail.com)
> > > wrote:
> > >
> > > Hello dev@,
> > >
> > > Training a model in Gluon requires users to write the training loop. This
> > > is useful because of its imperative nature; however, repeating the same
> > > boilerplate code across multiple models can become tedious. The training
> > > loop can also be overwhelming to some users new to deep learning. Users
> > > have asked in [1] for a simple Fit API, similar to the APIs available in
> > > SKLearn and Keras, as a way to simplify model training and reduce
> > > boilerplate code and complexity.
> > >
> > > So, together with fellow contributors Naveen and Lai, I came up with a fit
> > > API proposal in [2] that covers 80% of the use cases for beginners. The fit
> > > API does not replace the Gluon training loops. The API proposal is inspired
> > > by the Keras fit API. I have discussed it with, and received feedback from,
> > > a few MXNet contributors close by (Sheng, Mu, Aston, Zhi), and I am writing
> > > to ask for the community’s feedback on the API proposal.
> > >
> > >
> > >
> > > [1] https://discuss.mxnet.io/t/wrapping-gluon-into-scikit-learn-like-api/2112
> > > [2] https://cwiki.apache.org/confluence/display/MXNET/Gluon+Fit+API+-+Tech+Design
> > >
> > >
> > > Thanks,
> > > Ankit
> > >
> > >
> > >
> >
>

Re: Gluon fit API- Design proposal

Posted by Mu Li <mu...@gmail.com>.
It's great to see the proposal has a list of models that this API should
cover. But note that the D2L book has a very simplified train function (I
wrote it). It's oversimplified compared to what we use in real life.
Kaggle competition solutions and popular GitHub projects are closer to what
people are using. Alfredo also provided three very sensible use cases.
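
As a rough illustration of the third of those use cases (lazy construction),
here is a minimal sketch in plain Python. It assumes a wrapper that stores a
model_fn and only calls it on first use. LazyEstimator, build_count and the
dict returned by model_fn are hypothetical stand-ins for illustration; none
of them is taken from the fit-api branch or from TF Estimators.

```python
from typing import Callable, Optional


class LazyEstimator:
    """Hypothetical wrapper: keeps a factory and builds the net on first use."""

    def __init__(self, model_fn: Callable[[], object]):
        self._model_fn = model_fn
        self._net: Optional[object] = None
        self.build_count = 0  # illustration only: counts how often model_fn ran

    @property
    def net(self):
        # Construct the network the first time it is needed, then reuse it.
        if self._net is None:
            self._net = self._model_fn()
            self.build_count += 1
        return self._net


def model_fn():
    # Stand-in for code that would build and return a Gluon block.
    return {"layers": ["dense(64)", "dense(10)"]}


est = LazyEstimator(model_fn)
built_before = est.build_count  # nothing has been constructed yet
net = est.net                   # first access triggers model_fn
same_object = est.net is net    # later accesses reuse the same net
```

Passing the factory rather than the net means the caller decides when (and in
which process or on which device) construction happens, which is presumably
the runtime control Alfredo has in mind.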

Besides, here are my comments:

1. Do we know the limitations of the current API? E.g., which models does
this API not support? So far, it's hard to say that it covers 80% of use
cases.
2. Extensibility. So far it has a single gluon.Estimator class that does all
the work. Are we considering allowing this class to be extended to support
future use cases?
3. It's confusing that some variables end with 's' while some don't. For
example,
https://github.com/apache/incubator-mxnet/blob/fit-api/python/mxnet/gluon/estimator/estimator.py#L54
has loss, metrics, trainers and context. All of them support list inputs.
4. Related to 3, how do we map a list of inputs to each other? E.g., if I
give a list of loss functions, then what are their inputs, and what are the
loss weights? Similarly, what do multiple trainers mean?
5. How to initialize a model with different initializers for different
layers (a pretty common use case)?
6. How to restart for a previous
7. What if a model has multiple components, e.g. encoder-decoder, or
multi-modality training?
8. It's strange for new users to specify a batch_size in fit(). (I know the
reason, but it needs to be explained to users.)
9. event_handler is quite powerful, but how can users access internal
states through the handler? BTW, we usually call it a callback in Python.
10. Programming flavor. The code should be easy to read. For example, use
the same naming conventions as mxnet/pytorch/keras. A function should be
short; otherwise, break it into several pieces. Currently fit() has >100
lines of code.
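
To make point 9 concrete, one possible answer is to hand the estimator
itself to each callback, so the callback can read whatever state it needs.
The sketch below is plain Python and only an assumption about how this could
look. Estimator, LossLogger, the epoch and train_loss attributes, and the
fit() signature are all hypothetical and do not reflect the code on the
fit-api branch.

```python
class Estimator:
    """Hypothetical stand-in, not the class on the fit-api branch."""

    def __init__(self, callbacks=None):
        self.callbacks = list(callbacks or [])
        self.epoch = 0
        self.train_loss = None

    def fit(self, losses_per_epoch):
        # A real fit() would run the training loop; here the per-epoch
        # losses are supplied directly so the example stays self-contained.
        for epoch, loss in enumerate(losses_per_epoch):
            self.epoch = epoch
            self.train_loss = loss
            for callback in self.callbacks:
                # Pass the estimator so the callback can read its state.
                callback(self)


class LossLogger:
    """Callback that records (epoch, loss) pairs from the estimator."""

    def __init__(self):
        self.history = []

    def __call__(self, estimator):
        self.history.append((estimator.epoch, estimator.train_loss))


logger = LossLogger()
Estimator(callbacks=[logger]).fit([0.9, 0.5, 0.3])
```

After the call, logger.history holds one (epoch, loss) pair per epoch. The
trade-off is that callbacks become coupled to the estimator's attribute
names, so those names would need to be treated as part of the public API.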

Although there are multiple opportunities to improve, it's great to see
that you already have an implementation so we can dive into details.
