Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/09 22:12:40 UTC

[GitHub] [incubator-mxnet] zhreshold opened a new issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)

zhreshold opened a new issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)
URL: https://github.com/apache/incubator-mxnet/issues/17263
 
 
   ## Description
   
   This is part 1 of the Gluon Data API extension and fixes. It mainly focuses on cleaning up the diverging usage across the mxnet module and gluon APIs.
   Over time, two data loading conventions have evolved in mxnet:
   - Iterator: mxnet.io.DataIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L180)
   - Dataset + DataLoader: gluon.data.Dataset + gluon.data.DataLoader
   
   To eliminate this confusion and reduce maintenance effort, the plan is to drop all old iterators and provide a similar Dataset + DataLoader experience in the gluon data API.
   
   ## Things to be removed
   
   ### iterators
   - Base mxnet.io.DataIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L180)
   - mxnet.io.ResizeIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L282)
   - mxnet.io.PrefetchIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L347)
   - mxnet.io.NDArrayIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L491)
   - mxnet.io.MXDataIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/io/io.py#L800)
   - mxnet.image.ImageIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/detection.py#L626)
   - mxnet.image.ImageDetIter(https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/detection.py#L626)
   
   ### Augmenters from mxnet.image and mxnet.image.detection module
   
   Random augmenters (e.g. https://github.com/apache/incubator-mxnet/blob/ac88f1e87aa7da2c33f0cb89524d444ddc3a2ae8/python/mxnet/image/image.py#L615) will be removed.
   
   ### `transform=` argument in gluon.data Datasets
   
   The `transform=` constructor argument is no longer supported. Use `dataset.transform` or `dataset.transform_first` instead.
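
To illustrate the difference between the two replacements, here is a minimal sketch that uses a pure-Python stand-in rather than mxnet itself. `SimpleDataset` and `_LazyTransformDataset` are hypothetical names invented for this example. The assumption is that `transform` applies a function to the whole sample while `transform_first` applies it only to the first element (typically the data) and leaves the rest (typically the label) untouched.

```python
class SimpleDataset:
    """Hypothetical stand-in for a Gluon-style dataset; not MXNet code."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def __getitem__(self, idx):
        return self._items[idx]

    def transform(self, fn):
        # Apply fn to the whole sample; tuple samples are unpacked as arguments.
        return _LazyTransformDataset(self, fn)

    def transform_first(self, fn):
        # Apply fn to the first element only and pass the rest through unchanged.
        def first_only(x, *rest):
            if rest:
                return (fn(x),) + rest
            return fn(x)
        return self.transform(first_only)


class _LazyTransformDataset(SimpleDataset):
    """Applies the function lazily, each time a sample is fetched."""

    def __init__(self, data, fn):
        self._data = data
        self._fn = fn

    def __len__(self):
        return len(self._data)

    def __getitem__(self, idx):
        item = self._data[idx]
        if isinstance(item, tuple):
            return self._fn(*item)
        return self._fn(item)


# Each sample is a (data, label) pair.
samples = SimpleDataset([(1.0, 0), (2.0, 1)])
scaled = samples.transform_first(lambda x: x * 10)   # touches data only
swapped = samples.transform(lambda x, y: (y, x))     # sees data and label
```

Code that previously passed `transform=fn` at construction time would instead call one of these methods on the constructed dataset.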
   
   ## Things to be added
   
   ### Gluon Data Datasets
   
   Dataset + Transform combos that reproduce the behavior of the removed iterators.
   
   For example, NDArrayIter can be reimplemented as an NDArrayDataset plus an empty (identity) transform function.
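
A rough sketch of that idea, written in plain Python so it runs without mxnet. `ArrayPairDataset` and `iterate_batches` are hypothetical names for this example, and plain lists stand in for NDArrays. It is not the proposed implementation, only an illustration of how an array-backed dataset plus an identity transform can yield the batches an iterator would.

```python
class ArrayPairDataset:
    """Hypothetical array-backed dataset returning (data[i], label[i])."""

    def __init__(self, data, label):
        if len(data) != len(label):
            raise ValueError("data and label must have the same length")
        self._data = data
        self._label = label

    def __len__(self):
        return len(self._data)

    def __getitem__(self, idx):
        return self._data[idx], self._label[idx]


def identity(data, label):
    # The "empty" transform: returns the sample unchanged.
    return data, label


def iterate_batches(dataset, batch_size, transform=identity):
    """Yield (data_batch, label_batch) lists; the final batch may be smaller."""
    for start in range(0, len(dataset), batch_size):
        stop = min(start + batch_size, len(dataset))
        samples = [transform(*dataset[i]) for i in range(start, stop)]
        data, label = zip(*samples)
        yield list(data), list(label)


dataset = ArrayPairDataset([[0, 1], [2, 3], [4, 5]], [0, 1, 0])
batches = list(iterate_batches(dataset, batch_size=2))
```

Here the batching loop plays the role a DataLoader would play. How the final partial batch is handled is a design choice this sketch does not try to settle.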
   
   ### Gluon Data Augmenters/Transforms
   
   Data augmenters as mxnet.gluon.Block
   
   Candidates are TBD; useful ones can be taken from GluonCV (https://github.com/dmlc/gluon-cv/tree/master/gluoncv/data/transforms) and GluonNLP (https://github.com/dmlc/gluon-nlp/blob/v0.8.x/src/gluonnlp/data/transforms.py).
   
   ### mxnet.image
   
   Image processing functions will be absorbed from GluonCV (https://github.com/dmlc/gluon-cv/blob/master/gluoncv/data/transforms/image.py).

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] zhreshold commented on issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)

Posted by GitBox <gi...@apache.org>.
zhreshold commented on issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)
URL: https://github.com/apache/incubator-mxnet/issues/17263#issuecomment-573247878
 
 
   The old iterators will get a special gluon dataset wrapper that has no length and forbids random access or sampling from the DataLoader. They keep their original arguments during iteration.
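
One way such a wrapper could behave, sketched in plain Python. `IteratorOnlyDataset` and `legacy_batches` are hypothetical names, not mxnet API, and the actual wrapper may differ. The assumption is that the wrapper only supports sequential iteration and that the wrapped iterator is rebuilt with its original arguments on every pass.

```python
class IteratorOnlyDataset:
    """Hypothetical wrapper exposing a legacy iterator as an iterate-only dataset."""

    def __init__(self, make_iter):
        # make_iter is a zero-argument callable that returns a fresh iterator
        # already configured with the legacy iterator's original arguments.
        self._make_iter = make_iter

    def __iter__(self):
        return iter(self._make_iter())

    # __len__ is deliberately not defined, so len(wrapper) raises TypeError.

    def __getitem__(self, idx):
        raise TypeError("IteratorOnlyDataset does not support random access")


def legacy_batches(batch_size, num_batches):
    # Stand-in for an old-style iterator that produces whole batches.
    for i in range(num_batches):
        start = i * batch_size
        yield list(range(start, start + batch_size))


wrapped = IteratorOnlyDataset(lambda: legacy_batches(batch_size=2, num_batches=2))
first_pass = [batch for batch in wrapped]
second_pass = [batch for batch in wrapped]  # a fresh iterator is built each time
```

Because there is no length and no indexing, a sampler cannot draw random indices from this object; it can only be consumed front to back.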
   


[GitHub] [incubator-mxnet] ptrendx commented on issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)

Posted by GitBox <gi...@apache.org>.
ptrendx commented on issue #17263: [mxnet 2.0][item 4.8][RFC] Gluon Data API Extension and Fixes(Part 1)
URL: https://github.com/apache/incubator-mxnet/issues/17263#issuecomment-573246015
 
 
   What about `mx.io.ImageRecordIter`? Also, what about the return type of those iterators? `mx.io` iterators return `mx.io.DataBatch`; will that be changed too?
   
   @JanuszL FYI since DALI MXNet plugin produces `mx.io.DataBatch` and may be affected.
