Posted to dev@mxnet.apache.org by Chaitanya Bapat <ch...@gmail.com> on 2019/04/08 15:48:01 UTC

Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

Greetings!

Great start to a Monday morning: I came across this news on Import AI,
an AI newsletter.

The newsletter talks about Apache MXNet, so I thought I'd share it with
our community. This seems like a great achievement worth paying attention
to.

*75 seconds: How long it takes to train a network against ImageNet:*
*...Fujitsu Research claims state-of-the-art ImageNet training scheme...*
Researchers with Fujitsu Laboratories in Japan have further reduced the
time it takes to train large-scale, supervised learning AI models; their
approach lets them train a residual network to around 75% accuracy on the
ImageNet dataset after 74.7 seconds of training time. This is a big leap
from where we were in 2017 (an hour), and is impressive relative to
late-2018 performance (around 4 minutes: see issue #121
<https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5&id=28edafc07a&e=0b77acb987>
).

*How they did it: *The researchers trained their system across *2,048 Tesla
V100 GPUs* via the Amazon-developed MXNet deep learning framework. They
used a large mini-batch size of 81,920, and also implemented Layer-wise
Adaptive Rate Scaling (LARS) and a 'warm-up' period to increase learning
efficiency.
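
For anyone curious what LARS and the warm-up look like in code, here is a
rough sketch in plain NumPy. This is my own toy illustration, not the
paper's implementation; the layer names, hyperparameters, and loop are
made-up assumptions for the example:

import numpy as np

def warmup_lr(step, base_lr, warmup_steps):
    """Ramp the learning rate linearly from ~0 to base_lr, then hold it."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps
    return base_lr

def lars_step(weight, grad, lr, trust_coeff=0.001, weight_decay=1e-4):
    """One SGD step whose lr is rescaled per layer by a LARS trust ratio:
    local_lr = trust_coeff * ||w|| / (||g|| + weight_decay * ||w||)."""
    w_norm = np.linalg.norm(weight)
    g_norm = np.linalg.norm(grad)
    if w_norm > 0 and g_norm > 0:
        local_lr = trust_coeff * w_norm / (g_norm + weight_decay * w_norm)
    else:
        local_lr = 1.0
    return weight - lr * local_lr * (grad + weight_decay * weight)

# Toy loop over two fake "layers" with random stand-in gradients.
rng = np.random.default_rng(0)
layers = {"conv1": rng.standard_normal((64, 3)),
          "fc": rng.standard_normal((10, 64))}
base_lr, warmup_steps = 0.1, 100
for step in range(200):
    lr = warmup_lr(step, base_lr, warmup_steps)
    for name, w in layers.items():
        grad = rng.standard_normal(w.shape)  # a real run would use backprop
        layers[name] = lars_step(w, grad, lr)

As I understand the paper, it is this kind of per-layer scaling, together
with the warm-up, that keeps accuracy from collapsing at a mini-batch size
as large as 81,920.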

*Why it matters:* Training large models on distributed infrastructure is a
key component of modern AI research, and the reduction in time we've seen
on ImageNet training is striking - I think this is emblematic of the
industrialization of AI, as people seek to create systematic approaches to
efficiently training models across large numbers of computers. This trend
ultimately leads to a speedup in the rate of research reliant on
large-scale experimentation, and can unlock new paths of research.
*  Read more:* Yet Another Accelerated SGD: ResNet-50 Training on ImageNet
in 74.7 seconds (arXiv)
<https://twitter.us13.list-manage.com/track/click?u=67bd06787e84d73db24fb0aa5&id=d2b13c879f&e=0b77acb987>
.

NVIDIA article -
https://news.developer.nvidia.com/fujitsu-breaks-imagenet-record-with-v100-tensor-core-gpus/

Hope that gives further impetus to strive harder!
Have a good week!
Chai

--
*Chaitanya Prakash Bapat*
*+1 (973) 953-6299*

GitHub: https://github.com/ChaiBapchya
Facebook: https://www.facebook.com/chaibapchya
Twitter: https://twitter.com/ChaiBapchya
LinkedIn: https://www.linkedin.com//in/chaibapchya/

Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

Posted by Hagay Lupesko <lu...@gmail.com>.
Agreed!
I will mention this to my colleagues at Amazon who can help with that.

On Mon, Apr 8, 2019 at 1:32 PM Chaitanya Bapat <ch...@gmail.com> wrote:

> Yes. Moreover, we should be pushing it on our Twitter, Reddit, Medium, etc
> social channels.

Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

Posted by Chaitanya Bapat <ch...@gmail.com>.
Yes. Moreover, we should be pushing it on our Twitter, Reddit, Medium, and
other social channels.

On Mon, 8 Apr 2019 at 15:55, Hagay Lupesko <lu...@gmail.com> wrote:

> That's super cool Chai - thanks for sharing!
> I also noticed that, and was seeing how we can reach out to the Fujitsu
> guys so they can contribute back into MXNet...

Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

Posted by Hagay Lupesko <lu...@gmail.com>.
That's super cool Chai - thanks for sharing!
I also noticed that, and was seeing how we can reach out to the Fujitsu
guys so they can contribute back into MXNet...

On Mon, Apr 8, 2019 at 10:14 AM Lin Yuan <ap...@gmail.com> wrote:

> Chai,
>
> Thanks for sharing. This is awesome news!
>
> Lin

Re: Fujitsu Breaks ImageNet Record using MXNet (under 75 sec)

Posted by Lin Yuan <ap...@gmail.com>.
Chai,

Thanks for sharing. This is awesome news!

Lin
