Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2020/01/06 12:12:47 UTC

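[GitHub] [incubator-mxnet] gengyanlei opened a new issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

gengyanlei opened a new issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward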
URL: https://github.com/apache/incubator-mxnet/issues/17224
 
 
   ## Description
   (A clear and concise description of what the feature is.)
   - If the proposal is about a new model, provide description of what the model is.
   - If the proposal is about an API, provide mock examples if possible.
   
   ## References
   - list reference and related literature
   - list known implementations
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [incubator-mxnet] sxjscience edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
sxjscience edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571853972
 
 
   @gengyanlei Thank you for the suggestion! I think we could add a default value to `.step(batch_size=1)`, so that batch_size defaults to 1:
   
   
   We can change the default batch_size in `.step` and `.update` to 1.0 to avoid the cumbersome experience.
   
   https://github.com/apache/incubator-mxnet/blob/f17d19ba663df8694c10dfa56863a7e2a19cadcb/python/mxnet/gluon/trainer.py#L320
   
   https://github.com/apache/incubator-mxnet/blob/f17d19ba663df8694c10dfa56863a7e2a19cadcb/python/mxnet/gluon/trainer.py#L397

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571857421
 
 
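   Hello, my main point is whether the data can be concatenated and fed directly into the network, with the loss aggregated into a single variable, so that there is no need to loop as in the example above. I am not hung up on step.
   Since Gluon is already so highly integrated, data distribution and prediction should offer the same kind of operation for ease of use.
   I will stop using MXNet for now and go back to PyTorch, sorry. This kind of operation is really inconvenient. I do not care which prediction came from which GPU, since everything runs in parallel anyway. I could write a dedicated wrapper API, but it would have to be written differently for each main function, and handling accuracy, loss, various evaluation metrics, or even further processing of the predictions before backpropagation all becomes inconvenient.
   This is just my experience as a user. As for those bugs, I ran into them while using mostly common operations.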
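
As a rough illustration of the request above to collapse per-device losses into one variable, the following minimal sketch uses plain Python lists in place of per-GPU loss arrays. The helper name `combine_losses` and the sample numbers are assumptions made for illustration and are not part of MXNet.

```python
def combine_losses(per_device_losses):
    """Combine per-sample losses from every device into one batch-wide mean.

    Returns the mean over all samples and the total sample count, so uneven
    shards (for example in the last batch of an epoch) are weighted correctly.
    """
    total = sum(sum(losses) for losses in per_device_losses)
    count = sum(len(losses) for losses in per_device_losses)
    return total / count, count


# Hypothetical per-sample losses from two devices holding 3 and 2 samples.
per_device = [[0.5, 1.5, 1.0], [2.0, 3.0]]
mean_loss, batch_size = combine_losses(per_device)

# Averaging the per-device means instead would over-weight the smaller shard.
naive_mean = sum(sum(l) / len(l) for l in per_device) / len(per_device)
```

Returning the sample count alongside the loss is one way to obtain the batch size dynamically when the shards are not all the same size.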

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei removed a comment on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei removed a comment on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571904498
 
 
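   ~~Sorry, I have not used gluon-cv. Is it also a high-level API of MXNet, or is its relationship to MXNet like that of detection2 to PyTorch?
   To be honest, I have not used detection either, haha. I generally try to stick to the official, most canonical packages.~~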

----------------------------------------------------------------

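[GitHub] [incubator-mxnet] wkcn commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

Posted by GitBox <gi...@apache.org>.
wkcn commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward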
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571165118
 
 
   Which PyTorch functions do you mean as the counterparts?

----------------------------------------------------------------

[GitHub] [incubator-mxnet] wkcn commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
wkcn commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571897451
 
 
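   @gengyanlei Thank you for the suggestion! Gluon would indeed be better if its usage were simplified further.
   
   @sxjscience I noticed that some functions in gluoncv duplicate functions in MXNet, for example [`split_and_load`](https://github.com/dmlc/gluon-cv/blob/5d54877bcf430ea65e5053ef086084f30500c9d9/gluoncv/utils/sync_loader_helper.py#L59), and that it also has modules that extend Gluon, such as DataParallelModel. Could these be added to the MXNet project?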

----------------------------------------------------------------

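[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward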
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571396357
 
 
   ![1111](https://user-images.githubusercontent.com/22360785/71861912-f95d7980-3132-11ea-9e92-53b65b442467.png)
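   Could these predictions be merged, so that instead of predicting once per x the network predicts on data directly? That would make things much easier for me. Looping over the results every time I operate on them is a real hassle (I could concat the results beforehand, but that process is tedious).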

----------------------------------------------------------------

[GitHub] [incubator-mxnet] sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571853972
 
 
   @gengyanlei Thank you for the suggestion! I think we could add a default value to `.step(batch_size=1)`, so that batch_size defaults to 1 at this line:
   https://github.com/apache/incubator-mxnet/blob/f17d19ba663df8694c10dfa56863a7e2a19cadcb/python/mxnet/gluon/trainer.py#L320

----------------------------------------------------------------

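[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward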
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571394479
 
 
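   When reading data with MXNet Gluon, the data has to be distributed to each GPU, the loss is then computed from each GPU's predictions, and each loss is backpropagated (**this requires a for loop, which is tedious**). In addition, for `.step(batch_size)` the batch_size may not be the same every time, so it has to be obtained dynamically at each step.
   Data distribution corresponds to nn.DataParallel.
   Could the predictions be merged into a single result regardless of the number of GPUs? I did not notice such a parameter when reading the MXNet API.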
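
The scatter, per-device forward, and gather pattern requested above can be sketched in plain Python. This is only a minimal illustration under stated assumptions: `split_batch`, `parallel_forward`, and `toy_model` are hypothetical names, lists stand in for arrays on different GPUs, and nothing here is an existing MXNet API. In real Gluon code the splitting step is what `gluon.utils.split_and_load` performs.

```python
def split_batch(batch, num_shards):
    """Split a list into contiguous shards whose sizes differ by at most one."""
    base, extra = divmod(len(batch), num_shards)
    shards, start = [], 0
    for i in range(num_shards):
        size = base + (1 if i < extra else 0)
        shards.append(batch[start:start + size])
        start += size
    return shards


def parallel_forward(model, batch, num_devices):
    """Run the model on each shard and concatenate outputs in input order.

    The caller sees one prediction list and never loops over devices.
    """
    outputs = []
    for shard in split_batch(batch, num_devices):
        outputs.extend(model(shard))
    return outputs


def toy_model(shard):
    """Stand-in for a network: maps each input x to 2 * x + 1."""
    return [2 * x + 1 for x in shard]


shards = split_batch([0, 1, 2, 3, 4], 2)
preds = parallel_forward(toy_model, [0, 1, 2, 3, 4], num_devices=2)
```

Keeping the loop inside one helper means accuracy and loss code can operate on a single merged result, which is the behavior the comment asks for.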
   
   
   

----------------------------------------------------------------

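[GitHub] [incubator-mxnet] pengzhao-intel commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

Posted by GitBox <gi...@apache.org>.
pengzhao-intel commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward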
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571397502
 
 
   @gengyanlei Could you make the title a bit shorter? :)

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-572328097
 
 
   Thank you.

----------------------------------------------------------------

[GitHub] [incubator-mxnet] sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571861765
 
 
   @gengyanlei As pointed out by @Jerryzcn , there is one data parallel module in GluonCV: https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/parallel.py

----------------------------------------------------------------

[GitHub] [incubator-mxnet] kohillyang commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
kohillyang commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-575523668
 
 
   See https://github.com/apache/incubator-mxnet/pull/14344.

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571453654
 
 
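   @wkcn The main issue is that after the data is distributed, the predictions have to be looped over as in the code above, which gets in the way of later operations such as computing accuracy.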
    
   ![微信截图_20200107120038](https://user-images.githubusercontent.com/22360785/71871716-1014c800-3155-11ea-95b2-436fefa8b839.png)
   
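   What I actually mean is to concat the input data, feed it directly into the network, and get a single prediction result. I do not care which result belongs to which GPU (it is meaningless), and the loss then only needs to be computed once.
   
   Something like `.step(batch_size=1)` could simply be hard-coded; that kind of flexibility is entirely unnecessary.
   I am not sure about data distribution. It may be needed for deployment, but I have never had to think about it in academic work. It should be possible both to distribute data manually and to have a wrapper API. For many people, the chance of distributing data by hand is small. Most of my classmates prefer PyTorch open-source projects. I got into MXNet through insightface, moved from symbol to Gluon, and kept reading the official MXNet API. I feel that symbol's data loading is easier to use than tf's (data loading only), but its loss computation is not as good as tf's. Gluon's data loading is essentially the same as PyTorch's (although one uses PIL and the other NDArray, and the details hide many pitfalls), and some things are missing (setting the probability that a given operation is applied, which I have raised before). However, prediction, loss computation, and learning rate setup in Gluon all feel a bit awkward. For example, the learning rate cannot be scheduled per epoch: I went through all the learning rate APIs and they all work per step rather than per epoch, although the rate can be set with the set_learning_rate method.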
   ![微信截图_20200107141931](https://user-images.githubusercontent.com/22360785/71873077-3e94a200-3159-11ea-9ad3-046b0265207c.png)
   
   
   
   

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571904498
 
 
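   Sorry, I have not used gluon-cv. Is it also a high-level API of MXNet, or is its relationship to MXNet like that of detection2 to PyTorch?
   To be honest, I have not used detection either, haha. I generally try to stick to the official, most canonical packages.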

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571909036
 
 
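   I took a look at gluonCV and gluonNLP. They are reimplementations of new and classic networks and allow quick implementation, but they are updated frequently, so something like tf's one-release-a-month pace with drastically changing function documentation might happen. I have not gotten into them yet. I hear the reviews are decent, so I will give them a try later, haha.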

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571453654
 
 
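   @wkcn The main issue is that after the data is distributed, the predictions have to be looped over as in the code above, which gets in the way of later operations such as computing accuracy.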
    
   ![2](https://user-images.githubusercontent.com/22360785/71873235-aba83780-3159-11ea-971c-555169a67db2.png)
   
   
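   What I actually mean is to concat the input data, feed it directly into the network, and get a single prediction result. I do not care which result belongs to which GPU (it is meaningless), and the loss then only needs to be computed once.
   
   Something like `.step(batch_size=1)` could simply be hard-coded; that kind of flexibility is entirely unnecessary.
   I am not sure about data distribution. It may be needed for deployment, but I have never had to think about it in academic work. It should be possible both to distribute data manually and to have a wrapper API. For many people, the chance of distributing data by hand is small. Most of my classmates prefer PyTorch open-source projects. I got into MXNet through insightface, moved from symbol to Gluon, and kept reading the official MXNet API. I feel that symbol's data loading is easier to use than tf's (data loading only), but its loss computation is not as good as tf's. Gluon's data loading is essentially the same as PyTorch's (although one uses PIL and the other NDArray, and the details hide many pitfalls), and some things are missing (setting the probability that a given operation is applied, which I have raised before). However, prediction, loss computation, and learning rate setup in Gluon all feel a bit awkward. For example, the learning rate cannot be scheduled per epoch: I went through all the learning rate APIs and they all work per step rather than per epoch, although the rate can be set with the set_learning_rate method.
   
   In summary, I hope some simplifying APIs can be added to improve the user experience. I have hit many pitfalls while using Gluon and worked through them one by one. The most inconvenient part is the follow-up work on predictions, loss computation, and so on caused by data distribution; the rest is basically solved.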
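
The per-epoch learning rate workaround mentioned above (computing the rate yourself and passing it to `set_learning_rate`) might look roughly like the sketch below. The function `lr_for_epoch`, its milestone list, and the decay factor are hypothetical choices for illustration; only the `set_learning_rate` method name comes from the comment.

```python
def lr_for_epoch(base_lr, epoch, decay_epochs, factor=0.1):
    """Return base_lr multiplied by factor once per milestone epoch reached."""
    num_decays = sum(1 for milestone in decay_epochs if epoch >= milestone)
    return base_lr * factor ** num_decays


# Learning rate for each of five epochs, decaying at epochs 2 and 4.
schedule = [lr_for_epoch(0.1, epoch, decay_epochs=[2, 4]) for epoch in range(5)]

# In a Gluon training loop, the value for the current epoch would be handed
# to trainer.set_learning_rate(...) before iterating over that epoch's batches.
```

An alternative under the same assumptions is to keep a step-based scheduler and express each milestone as `epoch * batches_per_epoch` steps.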
   

----------------------------------------------------------------

[GitHub] [incubator-mxnet] kohillyang commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
kohillyang commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-575523298
 
 
   https://github.com/dmlc/gluon-cv/blob/master/gluoncv/utils/parallel.py is designed for SyncBN; however, there are some threading problems when running forward and backward in a multi-threaded environment.

----------------------------------------------------------------

[GitHub] [incubator-mxnet] gengyanlei edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
gengyanlei edited a comment on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571904498
 
 
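   ~~Sorry, I have not used gluon-cv. Is it also a high-level API of MXNet, or is its relationship to MXNet like that of detection2 to PyTorch?
   To be honest, I have not used detection either, haha. I generally try to stick to the official, most canonical packages.~~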

----------------------------------------------------------------

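[GitHub] [incubator-mxnet] wkcn commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward

Posted by GitBox <gi...@apache.org>.
wkcn commented on issue #17224: Can Gluon handle data distribution and merging like PyTorch, can .step() work without passing the batch size, and can loss functions get a reduce option? It is not very convenient to use, though much better than symbol, where recording the loss is awkward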
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-571399148
 
 
   @gengyanlei 
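   1. Data distribution
   There does not seem to be an API like `nn.DataParallel` yet, so you need to write your own wrapper.
   2. `.step(batch_size)`
   `.step(1)` in Gluon is equivalent to `.step()` in PyTorch, provided the loss is averaged along the batch axis.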
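
The equivalence in point 2 follows from the fact that Gluon's `Trainer.step(batch_size)` rescales gradients by `1/batch_size`. The sketch below checks the arithmetic in plain Python for a one-parameter squared-error model. The functions `grad_sum_loss`, `grad_mean_loss`, and `sgd_step` are hypothetical stand-ins written for this illustration, not MXNet or PyTorch calls.

```python
def grad_sum_loss(w, xs, ys):
    """Gradient with respect to w of sum_i (w * x_i - y_i) ** 2."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys))


def grad_mean_loss(w, xs, ys):
    """Gradient with respect to w of the batch mean of the same loss."""
    return grad_sum_loss(w, xs, ys) / len(xs)


def sgd_step(w, grad, lr, batch_size):
    """Toy SGD update that divides the gradient by batch_size first."""
    return w - lr * grad / batch_size


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w0, lr = 0.5, 0.01

# Summed loss with step(batch_size) versus mean loss with step(1).
w_sum = sgd_step(w0, grad_sum_loss(w0, xs, ys), lr, batch_size=len(xs))
w_mean = sgd_step(w0, grad_mean_loss(w0, xs, ys), lr, batch_size=1)
```

Both updates land on the same weight, which is why averaging the loss and calling `.step(1)` removes the need to track the batch size separately.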
   

----------------------------------------------------------------

[GitHub] [incubator-mxnet] sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)

Posted by GitBox <gi...@apache.org>.
sxjscience commented on issue #17224: Some suggestions on mxnet.gluon (usability)
URL: https://github.com/apache/incubator-mxnet/issues/17224#issuecomment-572211203
 
 
   @gengyanlei We will try to unify these into MXNet's Gluon.

----------------------------------------------------------------