Posted to dev@singa.apache.org by GitBox <gi...@apache.org> on 2020/08/12 02:22:56 UTC
[GitHub] [singa] dcslin opened a new pull request #779: half float update
dcslin opened a new pull request #779:
URL: https://github.com/apache/singa/pull/779
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [singa] lgtm-com[bot] commented on pull request #779: half float update
lgtm-com[bot] commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-672494487
This pull request **introduces 5 alerts** and **fixes 13** when merging 5b18677291672a206820b8d19897b79e84b070fe into c5769f1cc8a53c1e849c97641d71221bd4fdb5d4 - [view on LGTM.com](https://lgtm.com/projects/g/apache/singa/rev/pr-a3d44a0c8fe96deb14e18efeec185a892d43c3d8)
**new alerts:**
* 2 for Testing equality to None
* 1 for Duplication in regular expression character class
* 1 for Unused local variable
* 1 for Unused import
**fixed alerts:**
* 9 for Missing call to `__init__` during object initialization
* 2 for Unreachable code
* 1 for Unnecessary pass
* 1 for Unused local variable
----------------------------------------------------------------
[GitHub] [singa] lgtm-com[bot] commented on pull request #779: half float update
lgtm-com[bot] commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-675610454
This pull request **introduces 1 alert** when merging 75d8ec1eac270b3d99a889e8722a99181ca457d8 into 3f0997db042a5f0fa91732e25073f1e5afd7c6c8 - [view on LGTM.com](https://lgtm.com/projects/g/apache/singa/rev/pr-691e3f28e68964cca2c6e4eca9ffb0c6adbb6066)
**new alerts:**
* 1 for Unused local variable
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [singa] dcslin commented on pull request #779: half float update
dcslin commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-680390565
Tested the examples below as a checkpoint:
native.py with fp16
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/mlp/native.py -pfloat16
train_data_shape: (400, 2)
train_label_shape: (400, 2)
training loss = 0.6914
training loss = 0.585
training loss = 0.5596
training loss = 0.539
training loss = 0.4944
training loss = 0.4238
training loss = 0.319
training loss = 0.2502
training loss = 0.2102
training loss = 0.1869
training loss = 0.1671
```
native.py with fp32
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/mlp/native.py
train_data_shape: (400, 2)
train_label_shape: (400, 2)
training loss = 0.6908379
training loss = 0.5781224
training loss = 0.5531873
training loss = 0.5157491
training loss = 0.45046344
training loss = 0.3674125
training loss = 0.2854403
training loss = 0.23216258
training loss = 0.19450127
training loss = 0.16646467
training loss = 0.13695152
```
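The fp16 run prints losses to about three significant digits while the fp32 run prints seven; that matches float16's 11-bit significand (~3.3 decimal digits). A quick SINGA-independent check using the stdlib `struct` module's binary16 (`'e'`) format:

```python
import struct

def round_to_f16(x: float) -> float:
    """Round a Python float to the nearest IEEE 754 binary16 value."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

# First loss printed by the fp32 run above
fp32_loss = 0.6908379
f16_loss = round_to_f16(fp32_loss)
print(f16_loss)  # the nearest half-precision value, good to ~3 digits

# The relative rounding error is bounded by float16's unit roundoff 2**-11
assert abs(f16_loss - fp32_loss) / fp32_loss <= 2 ** -11
```

(The fp16 run's losses are not simply roundings of the fp32 ones, since the whole forward/backward pass, not just the printout, runs in half precision.)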
module.py on fp16 with graph on
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/mlp/module.py -pfloat16
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0826 00:48:40.063864 34058 tensor.cc:223] Check failed: block() && block()->initialized() == true the data of the tensor needs be initialized before casting to another type
*** Check failure stack trace: ***
Aborted (core dumped)
```
module.py on fp16 with graph off
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/mlp/module.py -pfloat16 -g
training loss = 0.6094
training loss = 0.5225
training loss = 0.467
training loss = 0.404
training loss = 0.3582
training loss = 0.328
training loss = 0.3164
training loss = 0.3086
training loss = 0.3108
training loss = 0.3142
training loss = 0.3198
```
module.py on fp32 with graph on
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/mlp/module.py
training loss = 0.61159235
training loss = 0.5169311
training loss = 0.43573818
training loss = 0.34147996
training loss = 0.26603624
training loss = 0.21422084
training loss = 0.17843087
training loss = 0.15283388
training loss = 0.13402645
training loss = 0.11964666
training loss = 0.10839656
```
train cnn with mlp on fp16 with graph on
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m2 -pfloat16
Starting Epoch 0:
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0826 00:49:13.757282 34338 tensor.cc:223] Check failed: block() && block()->initialized() == true the data of the tensor needs be initialized before casting to another type
*** Check failure stack trace: ***
Aborted (core dumped)
```
train cnn with mlp on fp16 with graph off
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m2 -pfloat16 -g
Starting Epoch 0:
Training loss = 449.630493, training accuracy = 0.869180
Evaluation accuracy = 0.921675, Elapsed Time = 3.134102s
Starting Epoch 1:
Training loss = 250.288086, training accuracy = 0.925110
Evaluation accuracy = 0.937200, Elapsed Time = 3.186108s
root@1c6aaef3db53:~/singa-hp2#
```
train cnn with mlp on fp32 with graph off
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py mlp mnist -m2 -pfloat32 -g
Starting Epoch 0:
Training loss = 446.399231, training accuracy = 0.870331
Evaluation accuracy = 0.922676, Elapsed Time = 2.745227s
Starting Epoch 1:
Training loss = 246.745819, training accuracy = 0.926194
Evaluation accuracy = 0.938301, Elapsed Time = 2.591690s
```
train cnn with cnn on fp16 with graph on
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m2 -pfloat16
Starting Epoch 0:
WARNING: Logging before InitGoogleLogging() is written to STDERR
F0826 00:49:58.988692 34502 tensor.cc:223] Check failed: block() && block()->initialized() == true the data of the tensor needs be initialized before casting to another type
*** Check failure stack trace: ***
Aborted (core dumped)
```
train cnn with cnn on fp16 with graph off
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m2 -pfloat16 -g
Starting Epoch 0:
Training loss = 599.249878, training accuracy = 0.788737
Evaluation accuracy = 0.940104, Elapsed Time = 9.316158s
Starting Epoch 1:
Training loss = 236.738007, training accuracy = 0.920641
Evaluation accuracy = 0.959335, Elapsed Time = 9.277672s
```
train cnn with cnn on fp32 with graph off
```
root@1c6aaef3db53:~/singa-hp2# PYTHONPATH=build/python/ python3 examples/cnn/train_cnn.py cnn mnist -m2 -pfloat32 -g
Starting Epoch 0:
Training loss = 596.964600, training accuracy = 0.789421
Evaluation accuracy = 0.943209, Elapsed Time = 8.189669s
Starting Epoch 1:
Training loss = 234.664322, training accuracy = 0.920758
Evaluation accuracy = 0.960036, Elapsed Time = 8.101694s
```
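For reference, the `-pfloat16` / `-pfloat32` options used throughout map a precision name onto the element type. A hedged, stdlib-only sketch (not SINGA's actual argument parser) of that flag handling and the resulting per-element storage:

```python
import argparse
import struct

# Hypothetical parser mirroring the '-pfloat16' style flags used above
parser = argparse.ArgumentParser()
parser.add_argument('-p', '--precision', choices=['float16', 'float32'],
                    default='float32')
args = parser.parse_args(['-pfloat16'])

# struct format codes: 'e' = IEEE 754 binary16, 'f' = binary32
itemsize = {'float16': struct.calcsize('e'), 'float32': struct.calcsize('f')}
print(args.precision, itemsize[args.precision], 'bytes/element')
```

Halving per-element storage is the main payoff; the elapsed times above show the fp16 path is not yet faster here.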
----------------------------------------------------------------
[GitHub] [singa] dcslin closed pull request #779: half float update
dcslin closed pull request #779:
URL: https://github.com/apache/singa/pull/779
----------------------------------------------------------------
[GitHub] [singa] dcslin edited a comment on pull request #779: half float update
dcslin edited a comment on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-672713946
Refactored from https://github.com/apache/singa/pull/775
#### half type:
- `c++` add `half.hpp`
- `c++` `f16<->f32` `singa::TypeCast` (scalar)
- `cmake` add nvcc compilation flag arch 7.0 for fp16
#### f16 tensor:
- creation
- `py` f16 `tensor.from_numpy()`
- IO
- set value
- `c++` `f16` to `tensor::set_value()`
- `c++` cast DType to SType in `Tensor::SetValue`
- copy from numpy
- `py` f16 `tensor.copy_from_numpy()`
- `py` raise an exception in `tensor.copy_from_numpy()` for unsupported types
- `SWIG` API `CopyHalfFloatDataFromHostPtr`
- get value
- `c++` `f16` `Transform`: to make contiguous
- `c++` explicit instantiation for `CopyDataFromHostPtr<f16>`
- `SWIG` f16 typemaps
- `SWIG` API `GetHalfFloatValue`
- conversion
- `py` f16 `tensor.as_type()`
- `c++` `f16<->f32` `cpu/cuda` CastCopy
- arithmetic
- `c++` `f16<->f32` `cpu/cuda` `TYPE_LANG_SWITCH`
- `c++` `f16<->f32` `cpu/cuda` `TYPE_TYPE_LANG_SWITCH`
- `c++` `f16` `TYPE_SWITCH`
- `c++` cast DType to SType in `EltwiseTensorScalarFn`
- `c++` cast DType to SType in `Div` (scalar/tensor)
- math
- `c++` `f16` `cpu/cuda` Gaussian
- `c++` use `GetCudnnDataType` in `generate_tensor_nd_desc` for generic type support
#### layer
- `layer.dtype_check` to check that inputs and params are in the same dtype
#### relu
- `c++` f16 cuda math `ReLU`
- `cuda` f16 `KernelRelu`
- `cuda` f16 `cuda::relu` kernel wrapper
#### linear
- `py` f16 init params
- `c++` f16 cuda math `GEMM`
#### softmax cross entropy
- `c++` f16 cuda math `EltwiseMult`
- `cuda` f16 `KernelMult`
- `cuda` f16 `cuda::mult` kernel wrapper
- `c++` f16 cuda math `ComputeCrossEntropy`
- `cuda` f16 `KernelComputeCrossEntropy`
- `cuda` f16 `cuda::ComputeCrossEntropy` kernel wrapper
- `c++` f16 cuda math `SoftMax`
#### train cnn
- add f16 option
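The scalar `f16<->f32` `singa::TypeCast` above is, at bottom, an IEEE 754 binary16 <-> binary32 conversion. A SINGA-independent sketch of the same round trip with the stdlib `struct` module (which handles the rounding and bit packing internally):

```python
import struct

def f32_to_f16_bits(x: float) -> int:
    """Narrow a float to binary16 and return its 16 raw bits."""
    return struct.unpack('<H', struct.pack('<e', x))[0]

def f16_bits_to_f32(bits: int) -> float:
    """Widen 16 raw binary16 bits back to a Python float."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]

bits = f32_to_f16_bits(1.0)
print(hex(bits))  # 0x3c00: sign 0, biased exponent 15, significand 0
assert f16_bits_to_f32(bits) == 1.0  # 1.0 is exactly representable
```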
----------------------------------------------------------------
[GitHub] [singa] dcslin commented on pull request #779: half float update
dcslin commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-680395143
Problems:
- Could not train with the graph enabled, though `compile()` is OK.
- Some fp16 operations reuse the fp32 implementation for now; this can be fixed separately at the `tensor_math_cuda.h` level.
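The fp32 reuse mentioned above typically takes a cast-up/compute/cast-down shape: widen the fp16 inputs, run the existing fp32 kernel, and narrow the result. A hedged sketch of that pattern with plain Python floats standing in for tensors (stdlib `struct` supplies the fp16 rounding; this is not SINGA's code):

```python
import struct

def to_f16(x: float) -> float:
    """Round to the nearest binary16 value (stand-in for an fp16 store)."""
    return struct.unpack('<e', struct.pack('<e', x))[0]

def div_f16_via_f32(a: float, b: float) -> float:
    # fp16 operands, computed via the wider path, result narrowed back
    return to_f16(a / b)

a, b = to_f16(1.0), to_f16(3.0)
print(div_f16_via_f32(a, b))  # 1/3 rounded to half precision
```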
----------------------------------------------------------------
[GitHub] [singa] dcslin commented on pull request #779: half float update
dcslin commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-680461629
Updating the design per @chrishkchris's advice
----------------------------------------------------------------
[GitHub] [singa] dcslin commented on pull request #779: half float update
dcslin commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-672713946
Refactored from https://github.com/apache/singa/pull/775
changes:
- cpp
- tensor impl across device
- add `half.hpp`
- IO
- add `f16` to `TYPE_SWITCH`
- add `f16<->f32` to `singa::TypeCast` (scalar)
- add explicit instantiation for `CopyDataFromHostPtr<f16>`
- add cast DType to SType in `Tensor::SetValue`
- add `f16` to `tensor::set_value()`
- math
- add `f16<->f32` for `cpu/cuda` to `TYPE_LANG_SWITCH`
- add `f16<->f32` for `cpu/cuda` to `TYPE_TYPE_LANG_SWITCH`
- add cast DType to SType in `EltwiseTensorScalarFn`
- add cast DType to SType in `Div` (scalar/tensor)
- tensor ops impl on device
- add `GetCudnnDataType` to `generate_tensor_nd_desc`
- add `f16` to `Transform`
- add `f16` on `cpu/cuda` Gaussian
- add `f16<->f32` for `cpu/cuda` to CastCopy
- SWIG
- add f16 typemaps
- add API `GetHalfFloatValue`
- add API `CopyHalfFloatDataFromHostPtr`
- py
- add f16 to `tensor.as_type()`
- add f16 to `tensor.copy_from_numpy()`
- add raise an exception in `tensor.copy_from_numpy()` for unsupported types
- add f16 to `tensor.from_numpy()`
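The "raise exception ... for unsupported type" item can be sketched as a dtype guard in front of the copy (hypothetical helper and whitelist, not SINGA's actual code):

```python
# Illustrative whitelist; SINGA's real supported set may differ
SUPPORTED_DTYPES = ('float16', 'float32', 'int32')

def check_copy_dtype(dtype_name: str) -> None:
    """Reject dtypes the backend has no copy path for."""
    if dtype_name not in SUPPORTED_DTYPES:
        raise TypeError(f'copy_from_numpy: unsupported dtype {dtype_name}')

check_copy_dtype('float16')  # fp16 is now accepted
try:
    check_copy_dtype('complex64')
except TypeError as e:
    print(e)  # copy_from_numpy: unsupported dtype complex64
```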
----------------------------------------------------------------
[GitHub] [singa] dcslin edited a comment on pull request #779: half float update
dcslin edited a comment on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-680395143
Problems:
- Could not train with the graph enabled, though `compile()` is OK.
- Some fp16 operations reuse the fp32 implementation for now; this can be fixed separately at the `tensor_math_cuda.h` level as a follow-up.
----------------------------------------------------------------
[GitHub] [singa] lgtm-com[bot] commented on pull request #779: half float update
lgtm-com[bot] commented on pull request #779:
URL: https://github.com/apache/singa/pull/779#issuecomment-675828752
This pull request **introduces 1 alert** when merging 29c8b44a177c4ac6f59ea188f3678861b4814444 into 3f0997db042a5f0fa91732e25073f1e5afd7c6c8 - [view on LGTM.com](https://lgtm.com/projects/g/apache/singa/rev/pr-35a837611bbe999ebed50c66b032099b71dc8d6c)
**new alerts:**
* 1 for Unused local variable
----------------------------------------------------------------