Posted to dev@singa.apache.org by GitBox <gi...@apache.org> on 2020/09/07 16:18:44 UTC
[GitHub] [singa] dcslin edited a comment on pull request #792: half float update
dcslin edited a comment on pull request #792:
URL: https://github.com/apache/singa/pull/792#issuecomment-688418118
Current results:
- [x] training with fp16 works, including with graph, at accuracy comparable to fp32
- [x] tensor CUDA backend: generic fp16 support, including broadcast
- [ ] review operations reusing float32
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m5
Starting Epoch 0:
Training loss = 446.399231, training accuracy = 0.870331
Evaluation accuracy = 0.922676, Elapsed Time = 4.054065s
Starting Epoch 1:
Training loss = 246.745819, training accuracy = 0.926194
Evaluation accuracy = 0.938301, Elapsed Time = 3.921566s
Starting Epoch 2:
Training loss = 201.893021, training accuracy = 0.939384
Evaluation accuracy = 0.944611, Elapsed Time = 3.735095s
Starting Epoch 3:
Training loss = 171.419769, training accuracy = 0.948289
Evaluation accuracy = 0.952524, Elapsed Time = 3.625971s
Starting Epoch 4:
Training loss = 149.009338, training accuracy = 0.955326
Evaluation accuracy = 0.956530, Elapsed Time = 3.582685s
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m5 -pfloat16
Starting Epoch 0:
Training loss = 447.799744, training accuracy = 0.869547
Evaluation accuracy = 0.922075, Elapsed Time = 3.899604s
Starting Epoch 1:
Training loss = 249.704956, training accuracy = 0.925110
Evaluation accuracy = 0.937300, Elapsed Time = 2.524199s
Starting Epoch 2:
Training loss = 206.520721, training accuracy = 0.938334
Evaluation accuracy = 0.942809, Elapsed Time = 2.410751s
Starting Epoch 3:
Training loss = 177.916901, training accuracy = 0.946538
Evaluation accuracy = 0.950120, Elapsed Time = 2.390487s
Starting Epoch 4:
Training loss = 157.046936, training accuracy = 0.952958
Evaluation accuracy = 0.954828, Elapsed Time = 2.396067s
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m5 -pfloat32
Starting Epoch 0:
Training loss = 596.964600, training accuracy = 0.789421
Evaluation accuracy = 0.943209, Elapsed Time = 7.073203s
Starting Epoch 1:
Training loss = 234.664322, training accuracy = 0.920758
Evaluation accuracy = 0.960036, Elapsed Time = 6.908865s
Starting Epoch 2:
Training loss = 165.501694, training accuracy = 0.944454
Evaluation accuracy = 0.971254, Elapsed Time = 6.795328s
Starting Epoch 3:
Training loss = 138.790848, training accuracy = 0.953559
Evaluation accuracy = 0.968950, Elapsed Time = 6.864943s
Starting Epoch 4:
Training loss = 119.547195, training accuracy = 0.959595
Evaluation accuracy = 0.970553, Elapsed Time = 10.432533s
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m5 -pfloat16
Starting Epoch 0:
Training loss = 598.742554, training accuracy = 0.752268
Evaluation accuracy = 0.941506, Elapsed Time = 13.717912s
Starting Epoch 1:
Training loss = 238.977264, training accuracy = 0.875350
Evaluation accuracy = 0.958934, Elapsed Time = 14.170568s
Starting Epoch 2:
Training loss = 169.415573, training accuracy = 0.898046
Evaluation accuracy = 0.969151, Elapsed Time = 13.457300s
Starting Epoch 3:
Training loss = 142.731216, training accuracy = 0.905600
Evaluation accuracy = 0.968850, Elapsed Time = 13.270982s
Starting Epoch 4:
Training loss = 121.980347, training accuracy = 0.911153
Evaluation accuracy = 0.971254, Elapsed Time = 9.463192s
```
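To make the fp16 vs. fp32 comparison easier to read, here is a minimal sketch that summarizes the four runs. The numbers are copied from the logs above; the `runs` dictionary and the `summarize` and `compare` helpers are hypothetical names used only for illustration and are not part of SINGA. The first run passes no `-p` flag and is assumed here to use the float32 default.

```python
# (evaluation accuracy, elapsed seconds) per epoch, copied from the logs above.
# Assumption: the first mlp run (no -p flag) uses float32.
runs = {
    ("mlp", "float32"): [(0.922676, 4.054065), (0.938301, 3.921566),
                         (0.944611, 3.735095), (0.952524, 3.625971),
                         (0.956530, 3.582685)],
    ("mlp", "float16"): [(0.922075, 3.899604), (0.937300, 2.524199),
                         (0.942809, 2.410751), (0.950120, 2.390487),
                         (0.954828, 2.396067)],
    ("cnn", "float32"): [(0.943209, 7.073203), (0.960036, 6.908865),
                         (0.971254, 6.795328), (0.968950, 6.864943),
                         (0.970553, 10.432533)],
    ("cnn", "float16"): [(0.941506, 13.717912), (0.958934, 14.170568),
                         (0.969151, 13.457300), (0.968850, 13.270982),
                         (0.971254, 9.463192)],
}


def summarize(epochs):
    """Return (final evaluation accuracy, total elapsed seconds) for one run."""
    final_acc = epochs[-1][0]
    total_time = sum(seconds for _, seconds in epochs)
    return final_acc, total_time


def compare(model):
    """Return (fp16 - fp32 final eval accuracy, fp32 / fp16 total time)."""
    acc32, time32 = summarize(runs[(model, "float32")])
    acc16, time16 = summarize(runs[(model, "float16")])
    return acc16 - acc32, time32 / time16


for model in ("mlp", "cnn"):
    acc_delta, time_ratio = compare(model)
    print(f"{model}: fp16 - fp32 final eval accuracy = {acc_delta:+.6f}, "
          f"fp32/fp16 total time = {time_ratio:.2f}x")
```

A time ratio above 1 means the float16 run took less total time. On these figures the mlp run is faster in float16, while the cnn run is slower in float16 over the five epochs. The cnn float16 training accuracy at epoch 4 (0.911153) also trails the float32 run (0.959595), even though the evaluation accuracies are close.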