Posted to dev@singa.apache.org by GitBox <gi...@apache.org> on 2020/09/07 16:18:44 UTC

[GitHub] [singa] dcslin edited a comment on pull request #792: half float update

dcslin edited a comment on pull request #792:
URL: https://github.com/apache/singa/pull/792#issuecomment-688418118


   current result:
   - [x] training with fp16 ok, with graph, comparable accuracy
   - [x] tensor cuda backend generic support on fp16 with broadcast
   - [ ] review operations reusing float32
   ```
   root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m5
   Starting Epoch 0:
   Training loss = 446.399231, training accuracy = 0.870331
   Evaluation accuracy = 0.922676, Elapsed Time = 4.054065s
   Starting Epoch 1:
   Training loss = 246.745819, training accuracy = 0.926194
   Evaluation accuracy = 0.938301, Elapsed Time = 3.921566s
   Starting Epoch 2:
   Training loss = 201.893021, training accuracy = 0.939384
   Evaluation accuracy = 0.944611, Elapsed Time = 3.735095s
   Starting Epoch 3:
   Training loss = 171.419769, training accuracy = 0.948289
   Evaluation accuracy = 0.952524, Elapsed Time = 3.625971s
   Starting Epoch 4:
   Training loss = 149.009338, training accuracy = 0.955326
   Evaluation accuracy = 0.956530, Elapsed Time = 3.582685s
   root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m5 -pfloat16
   Starting Epoch 0:
   Training loss = 447.799744, training accuracy = 0.869547
   Evaluation accuracy = 0.922075, Elapsed Time = 3.899604s
   Starting Epoch 1:
   Training loss = 249.704956, training accuracy = 0.925110
   Evaluation accuracy = 0.937300, Elapsed Time = 2.524199s
   Starting Epoch 2:
   Training loss = 206.520721, training accuracy = 0.938334
   Evaluation accuracy = 0.942809, Elapsed Time = 2.410751s
   Starting Epoch 3:
   Training loss = 177.916901, training accuracy = 0.946538
   Evaluation accuracy = 0.950120, Elapsed Time = 2.390487s
   Starting Epoch 4:
   Training loss = 157.046936, training accuracy = 0.952958
   Evaluation accuracy = 0.954828, Elapsed Time = 2.396067s
   root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m5 -pfloat32
   Starting Epoch 0:
   Training loss = 596.964600, training accuracy = 0.789421
   Evaluation accuracy = 0.943209, Elapsed Time = 7.073203s
   Starting Epoch 1:
   Training loss = 234.664322, training accuracy = 0.920758
   Evaluation accuracy = 0.960036, Elapsed Time = 6.908865s
   Starting Epoch 2:
   Training loss = 165.501694, training accuracy = 0.944454
   Evaluation accuracy = 0.971254, Elapsed Time = 6.795328s
   Starting Epoch 3:
   Training loss = 138.790848, training accuracy = 0.953559
   Evaluation accuracy = 0.968950, Elapsed Time = 6.864943s
   Starting Epoch 4:
   Training loss = 119.547195, training accuracy = 0.959595
   Evaluation accuracy = 0.970553, Elapsed Time = 10.432533s
   root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m5 -pfloat16
   Starting Epoch 0:
   Training loss = 598.742554, training accuracy = 0.752268
   Evaluation accuracy = 0.941506, Elapsed Time = 13.717912s
   Starting Epoch 1:
   Training loss = 238.977264, training accuracy = 0.875350
   Evaluation accuracy = 0.958934, Elapsed Time = 14.170568s
   Starting Epoch 2:
   Training loss = 169.415573, training accuracy = 0.898046
   Evaluation accuracy = 0.969151, Elapsed Time = 13.457300s
   Starting Epoch 3:
   Training loss = 142.731216, training accuracy = 0.905600
   Evaluation accuracy = 0.968850, Elapsed Time = 13.270982s
   Starting Epoch 4:
   Training loss = 121.980347, training accuracy = 0.911153
   Evaluation accuracy = 0.971254, Elapsed Time = 9.463192s
   
   ```
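
To make the float32 vs float16 comparison in the logs above easier to check, the evaluation lines can be pulled out with a few lines of Python. This is a minimal sketch, not part of the PR: the helper names `parse_eval_lines` and `compare_runs` are hypothetical, and the two sample strings are just the last two epochs of the two `mlp` runs copied from the log above.

```python
import re

# Matches log lines such as:
#   Evaluation accuracy = 0.956530, Elapsed Time = 3.582685s
EVAL_LINE = re.compile(
    r"Evaluation accuracy = ([0-9.]+), Elapsed Time = ([0-9.]+)s"
)


def parse_eval_lines(log_text):
    """Return one (evaluation_accuracy, elapsed_seconds) tuple per epoch."""
    return [(float(acc), float(sec)) for acc, sec in EVAL_LINE.findall(log_text)]


def compare_runs(fp32_log, fp16_log):
    """Compare two runs (hypothetical helper, for illustration only).

    Returns (accuracy_gap, time_ratio):
      accuracy_gap = final fp16 evaluation accuracy minus final fp32 one
      time_ratio   = total fp16 elapsed time divided by total fp32 elapsed time
    """
    fp32 = parse_eval_lines(fp32_log)
    fp16 = parse_eval_lines(fp16_log)
    accuracy_gap = fp16[-1][0] - fp32[-1][0]
    time_ratio = sum(sec for _, sec in fp16) / sum(sec for _, sec in fp32)
    return accuracy_gap, time_ratio


# Epochs 3 and 4 of the mlp runs, copied from the log above.
MLP_FP32_TAIL = """
Evaluation accuracy = 0.952524, Elapsed Time = 3.625971s
Evaluation accuracy = 0.956530, Elapsed Time = 3.582685s
"""
MLP_FP16_TAIL = """
Evaluation accuracy = 0.950120, Elapsed Time = 2.390487s
Evaluation accuracy = 0.954828, Elapsed Time = 2.396067s
"""

if __name__ == "__main__":
    gap, ratio = compare_runs(MLP_FP32_TAIL, MLP_FP16_TAIL)
    print(f"final eval accuracy gap (fp16 - fp32): {gap:+.6f}")
    print(f"elapsed time ratio (fp16 / fp32): {ratio:.3f}")
```

Feeding in the full output of each run instead of the two-epoch excerpts gives the same comparison over all five epochs, and the same helper works for the `cnn` runs.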


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org