Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/06/14 15:28:20 UTC
[GitHub] ZaidQureshi opened a new issue #11284: Issues with Gluon example/gluon/image_classification.py example
URL: https://github.com/apache/incubator-mxnet/issues/11284
## Description
When I train a network (AlexNet or ResNet-50) with `--dtype float16`, training fails partway through with an error saying that a data type of float32 was expected but float16 was given. In both logs below, the error is raised right after epoch 0 finishes, in the call to `test()`. Also, there doesn't seem to be a way to train on synthetic data with this example (i.e. the script doesn't accept `--benchmark 1`), although the README for example/image_classification claims it's possible.
## Environment info (Required)
```
----------Python Info----------
Version : 3.5.2
Compiler : GCC 5.4.0 20160609
Build : ('default', 'Nov 23 2017 16:37:01')
Arch : ('64bit', 'ELF')
------------Pip Info-----------
Version : 10.0.1
Directory : /usr/local/lib/python3.5/dist-packages/pip
----------MXNet Info-----------
Version : 1.3.0
Directory : /usr/local/lib/python3.5/dist-packages/mxnet
Commit Hash : b434b8ec18f774c99b0830bd3ca66859212b4911
----------System Info----------
Platform : Linux-4.13.0-45-generic-x86_64-with-Ubuntu-16.04-xenial
system : Linux
node : css-host-8
release : 4.13.0-45-generic
version : #50~16.04.1-Ubuntu SMP Wed May 30 11:18:27 UTC 2018
----------Hardware Info----------
machine : x86_64
processor : x86_64
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 40
On-line CPU(s) list: 0-39
Thread(s) per core: 2
Core(s) per socket: 10
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
Stepping: 1
CPU MHz: 1200.189
CPU max MHz: 3400.0000
CPU min MHz: 1200.0000
BogoMIPS: 4799.72
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-9,20-29
NUMA node1 CPU(s): 10-19,30-39
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti retpoline intel_ppin intel_pt spec_ctrl tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
----------Network Test----------
Setting timeout: 10
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0255 sec, LOAD: 0.1334 sec.
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0100 sec, LOAD: 0.5690 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.3449 sec, LOAD: 1.6452 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0099 sec, LOAD: 0.3464 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.2326 sec, LOAD: 0.6122 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.3419 sec, LOAD: 2.3122 sec.
```
Package used (Python/R/Scala/Julia):
I'm using Python3 package mxnet-cu91.
## Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio):
gcc
MXNet commit hash:
b434b8ec18f774c99b0830bd3ca66859212b4911
Build config:
I am using the python3 mxnet-cu91 package.
## Error Message:
Resnet50:
```
INFO:root:Starting new image-classification task:, Namespace(batch_norm=False, batch_size=128, builtin_profiler=0, data_dir='', dataset='dummy', dtype='float16', epochs=10, gpus='2', kvstore='device', log_interval=1, lr=0.1, lr_factor=0.1, lr_steps='30,60,90', mode='imperative', model='resnet50_v1', momentum=0.9, num_workers=4, prefix='', profile=False, resume='', save_frequency=10, seed=123, start_epoch=0, use_pretrained=False, use_thumbnail=False, wd=0.0001)
[11:23:39] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
INFO:root:Epoch[0] Batch [0] Speed: 38.761533 samples/sec accuracy=1.000000, top_k_accuracy_5=0.000000
INFO:root:Epoch[0] Batch [1] Speed: 430.549337 samples/sec accuracy=1.000000, top_k_accuracy_5=0.500000
INFO:root:Epoch[0] Batch [2] Speed: 629.449014 samples/sec accuracy=1.000000, top_k_accuracy_5=0.666667
INFO:root:Epoch[0] Batch [3] Speed: 632.102159 samples/sec accuracy=1.000000, top_k_accuracy_5=0.750000
INFO:root:Epoch[0] Batch [4] Speed: 636.633253 samples/sec accuracy=1.000000, top_k_accuracy_5=0.800000
INFO:root:Epoch[0] Batch [5] Speed: 644.789297 samples/sec accuracy=1.000000, top_k_accuracy_5=0.833333
INFO:root:Epoch[0] Batch [6] Speed: 632.178824 samples/sec accuracy=1.000000, top_k_accuracy_5=0.857143
INFO:root:Epoch[0] Batch [7] Speed: 634.513685 samples/sec accuracy=1.000000, top_k_accuracy_5=0.875000
INFO:root:Epoch[0] Batch [8] Speed: 638.223330 samples/sec accuracy=1.000000, top_k_accuracy_5=0.888889
INFO:root:Epoch[0] Batch [9] Speed: 642.008849 samples/sec accuracy=1.000000, top_k_accuracy_5=0.900000
INFO:root:Epoch[0] Batch [10] Speed: 642.620550 samples/sec accuracy=1.000000, top_k_accuracy_5=0.909091
INFO:root:Epoch[0] Batch [11] Speed: 635.800896 samples/sec accuracy=1.000000, top_k_accuracy_5=0.916667
INFO:root:Epoch[0] Batch [12] Speed: 641.673524 samples/sec accuracy=1.000000, top_k_accuracy_5=0.923077
INFO:root:Epoch[0] Batch [13] Speed: 628.907794 samples/sec accuracy=1.000000, top_k_accuracy_5=0.928571
INFO:root:Epoch[0] Batch [14] Speed: 631.297189 samples/sec accuracy=1.000000, top_k_accuracy_5=0.933333
INFO:root:Epoch[0] Batch [15] Speed: 635.057088 samples/sec accuracy=1.000000, top_k_accuracy_5=0.937500
INFO:root:Epoch[0] Batch [16] Speed: 629.546444 samples/sec accuracy=1.000000, top_k_accuracy_5=0.941176
INFO:root:Epoch[0] Batch [17] Speed: 633.887375 samples/sec accuracy=1.000000, top_k_accuracy_5=0.944444
INFO:root:Epoch[0] Batch [18] Speed: 637.994282 samples/sec accuracy=1.000000, top_k_accuracy_5=0.947368
INFO:root:Epoch[0] Batch [19] Speed: 625.316271 samples/sec accuracy=1.000000, top_k_accuracy_5=0.950000
INFO:root:Epoch[0] Batch [20] Speed: 633.214498 samples/sec accuracy=1.000000, top_k_accuracy_5=0.952381
INFO:root:Epoch[0] Batch [21] Speed: 632.507277 samples/sec accuracy=1.000000, top_k_accuracy_5=0.954545
INFO:root:Epoch[0] Batch [22] Speed: 637.411028 samples/sec accuracy=1.000000, top_k_accuracy_5=0.956522
INFO:root:Epoch[0] Batch [23] Speed: 634.509185 samples/sec accuracy=1.000000, top_k_accuracy_5=0.958333
INFO:root:Epoch[0] Batch [24] Speed: 630.860258 samples/sec accuracy=1.000000, top_k_accuracy_5=0.960000
INFO:root:Epoch[0] Batch [25] Speed: 632.370939 samples/sec accuracy=1.000000, top_k_accuracy_5=0.961538
INFO:root:Epoch[0] Batch [26] Speed: 631.902770 samples/sec accuracy=1.000000, top_k_accuracy_5=0.962963
INFO:root:Epoch[0] Batch [27] Speed: 628.879800 samples/sec accuracy=1.000000, top_k_accuracy_5=0.964286
INFO:root:Epoch[0] Batch [28] Speed: 637.116021 samples/sec accuracy=1.000000, top_k_accuracy_5=0.965517
INFO:root:Epoch[0] Batch [29] Speed: 623.945647 samples/sec accuracy=1.000000, top_k_accuracy_5=0.966667
INFO:root:Epoch[0] Batch [30] Speed: 614.313059 samples/sec accuracy=1.000000, top_k_accuracy_5=0.967742
INFO:root:Epoch[0] Batch [31] Speed: 598.348864 samples/sec accuracy=1.000000, top_k_accuracy_5=0.968750
INFO:root:Epoch[0] Batch [32] Speed: 630.654243 samples/sec accuracy=1.000000, top_k_accuracy_5=0.969697
INFO:root:Epoch[0] Batch [33] Speed: 626.203179 samples/sec accuracy=1.000000, top_k_accuracy_5=0.970588
INFO:root:Epoch[0] Batch [34] Speed: 621.371898 samples/sec accuracy=1.000000, top_k_accuracy_5=0.971429
INFO:root:Epoch[0] Batch [35] Speed: 625.962240 samples/sec accuracy=1.000000, top_k_accuracy_5=0.972222
INFO:root:Epoch[0] Batch [36] Speed: 638.778429 samples/sec accuracy=1.000000, top_k_accuracy_5=0.972973
INFO:root:Epoch[0] Batch [37] Speed: 638.853680 samples/sec accuracy=1.000000, top_k_accuracy_5=0.973684
INFO:root:Epoch[0] Batch [38] Speed: 635.063098 samples/sec accuracy=1.000000, top_k_accuracy_5=0.974359
INFO:root:Epoch[0] Batch [39] Speed: 641.104962 samples/sec accuracy=1.000000, top_k_accuracy_5=0.975000
INFO:root:Epoch[0] Batch [40] Speed: 628.048478 samples/sec accuracy=1.000000, top_k_accuracy_5=0.975610
INFO:root:Epoch[0] Batch [41] Speed: 624.718154 samples/sec accuracy=1.000000, top_k_accuracy_5=0.976190
INFO:root:Epoch[0] Batch [42] Speed: 630.666097 samples/sec accuracy=1.000000, top_k_accuracy_5=0.976744
INFO:root:Epoch[0] Batch [43] Speed: 629.468940 samples/sec accuracy=1.000000, top_k_accuracy_5=0.977273
INFO:root:Epoch[0] Batch [44] Speed: 634.288790 samples/sec accuracy=1.000000, top_k_accuracy_5=0.977778
INFO:root:Epoch[0] Batch [45] Speed: 620.335145 samples/sec accuracy=1.000000, top_k_accuracy_5=0.978261
INFO:root:Epoch[0] Batch [46] Speed: 630.456506 samples/sec accuracy=1.000000, top_k_accuracy_5=0.978723
INFO:root:Epoch[0] Batch [47] Speed: 631.097565 samples/sec accuracy=1.000000, top_k_accuracy_5=0.979167
INFO:root:Epoch[0] Batch [48] Speed: 625.456142 samples/sec accuracy=1.000000, top_k_accuracy_5=0.979592
INFO:root:Epoch[0] Batch [49] Speed: 597.499816 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980000
INFO:root:Epoch[0] Batch [50] Speed: 631.840300 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980392
INFO:root:Epoch[0] Batch [51] Speed: 635.177303 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980769
INFO:root:Epoch[0] Batch [52] Speed: 629.745088 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981132
INFO:root:Epoch[0] Batch [53] Speed: 633.966719 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981481
INFO:root:Epoch[0] Batch [54] Speed: 633.290685 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981818
INFO:root:Epoch[0] Batch [55] Speed: 629.980078 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982143
INFO:root:Epoch[0] Batch [56] Speed: 627.281642 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982456
INFO:root:Epoch[0] Batch [57] Speed: 636.660431 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982759
INFO:root:Epoch[0] Batch [58] Speed: 638.425210 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983051
INFO:root:Epoch[0] Batch [59] Speed: 628.249118 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983333
INFO:root:Epoch[0] Batch [60] Speed: 634.437953 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983607
INFO:root:Epoch[0] Batch [61] Speed: 629.313991 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983871
INFO:root:Epoch[0] Batch [62] Speed: 629.671966 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984127
INFO:root:Epoch[0] Batch [63] Speed: 636.321618 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984375
INFO:root:Epoch[0] Batch [64] Speed: 636.397046 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984615
INFO:root:Epoch[0] Batch [65] Speed: 630.730557 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984848
INFO:root:Epoch[0] Batch [66] Speed: 635.813696 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985075
INFO:root:Epoch[0] Batch [67] Speed: 639.850346 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985294
INFO:root:Epoch[0] Batch [68] Speed: 630.247795 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985507
INFO:root:Epoch[0] Batch [69] Speed: 642.182405 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985714
INFO:root:Epoch[0] Batch [70] Speed: 629.762078 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985915
INFO:root:Epoch[0] Batch [71] Speed: 635.816708 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986111
INFO:root:Epoch[0] Batch [72] Speed: 631.053798 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986301
INFO:root:Epoch[0] Batch [73] Speed: 635.151002 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986486
INFO:root:Epoch[0] Batch [74] Speed: 631.234839 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986667
INFO:root:Epoch[0] Batch [75] Speed: 637.794947 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986842
INFO:root:Epoch[0] Batch [76] Speed: 635.627762 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987013
INFO:root:Epoch[0] Batch [77] Speed: 634.704971 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987179
INFO:root:Epoch[0] Batch [78] Speed: 626.428223 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987342
INFO:root:Epoch[0] Batch [79] Speed: 631.866328 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987500
INFO:root:Epoch[0] Batch [80] Speed: 633.852949 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987654
INFO:root:Epoch[0] Batch [81] Speed: 628.484464 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987805
INFO:root:Epoch[0] Batch [82] Speed: 639.696342 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987952
INFO:root:Epoch[0] Batch [83] Speed: 628.357943 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988095
INFO:root:Epoch[0] Batch [84] Speed: 632.414144 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988235
INFO:root:Epoch[0] Batch [85] Speed: 635.123201 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988372
INFO:root:Epoch[0] Batch [86] Speed: 637.747216 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988506
INFO:root:Epoch[0] Batch [87] Speed: 630.293670 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988636
INFO:root:Epoch[0] Batch [88] Speed: 634.680210 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988764
INFO:root:Epoch[0] Batch [89] Speed: 630.405426 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988889
INFO:root:Epoch[0] Batch [90] Speed: 630.142012 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989011
INFO:root:Epoch[0] Batch [91] Speed: 635.331395 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989130
INFO:root:Epoch[0] Batch [92] Speed: 637.355788 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989247
INFO:root:Epoch[0] Batch [93] Speed: 630.492786 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989362
INFO:root:Epoch[0] Batch [94] Speed: 627.613096 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989474
INFO:root:Epoch[0] Batch [95] Speed: 636.513241 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989583
INFO:root:Epoch[0] Batch [96] Speed: 639.884664 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989691
INFO:root:Epoch[0] Batch [97] Speed: 635.479544 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989796
INFO:root:Epoch[0] Batch [98] Speed: 637.682071 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989899
INFO:root:Epoch[0] Batch [99] Speed: 631.076052 samples/sec accuracy=1.000000, top_k_accuracy_5=0.990000
INFO:root:[Epoch 0] training: accuracy=1.000000, top_k_accuracy_5=0.990000
INFO:root:[Epoch 0] time cost: 23.480652
Traceback (most recent call last):
File "../gluon/image_classification.py", line 290, in <module>
main()
File "../gluon/image_classification.py", line 274, in main
train(opt, context)
File "../gluon/image_classification.py", line 242, in train
name, val_acc = test(ctx, val_data)
File "../gluon/image_classification.py", line 166, in test
outputs.append(net(x))
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/model_zoo/vision/resnet.py", line 279, in hybrid_forward
x = self.features(x)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/basic_layers.py", line 117, in hybrid_forward
x = block(x)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/conv_layers.py", line 133, in hybrid_forward
act = getattr(F, self._op_name)(x, weight, name='fwd', **self._kwargs)
File "<string>", line 167, in Convolution
File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
ctypes.byref(out_stypes)))
File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 210, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [11:24:02] src/operator/nn/convolution.cc:283: Check failed: (*in_type)[i] == dtype (2 vs. 0) This layer requires uniform type. Expected 'float32' v.s. given 'float16' at 'weight'
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d0ea) [0x7fcc47e1a0ea]
[bt] (1) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d711) [0x7fcc47e1a711]
[bt] (2) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x59284c) [0x7fcc4805f84c]
[bt] (3) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26d963f) [0x7fcc4a1a663f]
[bt] (4) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26e2cad) [0x7fcc4a1afcad]
[bt] (5) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x25ffe99) [0x7fcc4a0cce99]
[bt] (6) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(MXImperativeInvokeEx+0x6f) [0x7fcc4a0cd48f]
[bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7fccd70f7e20]
[bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7fccd70f788b]
[bt] (9) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7fccd70f201a]
```
Alexnet:
```
INFO:root:Starting new image-classification task:, Namespace(batch_norm=False, batch_size=128, builtin_profiler=0, data_dir='', dataset='dummy', dtype='float16', epochs=10, gpus='2', kvstore='device', log_interval=1, lr=0.1, lr_factor=0.1, lr_steps='30,60,90', mode='imperative', model='alexnet', momentum=0.9, num_workers=4, prefix='', profile=False, resume='', save_frequency=10, seed=123, start_epoch=0, use_pretrained=False, use_thumbnail=False, wd=0.0001)
[11:25:47] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
INFO:root:Epoch[0] Batch [0] Speed: 129.885558 samples/sec accuracy=1.000000, top_k_accuracy_5=0.000000
INFO:root:Epoch[0] Batch [1] Speed: 1461.012374 samples/sec accuracy=1.000000, top_k_accuracy_5=0.500000
INFO:root:Epoch[0] Batch [2] Speed: 1709.083278 samples/sec accuracy=1.000000, top_k_accuracy_5=0.666667
INFO:root:Epoch[0] Batch [3] Speed: 2039.070355 samples/sec accuracy=1.000000, top_k_accuracy_5=0.750000
INFO:root:Epoch[0] Batch [4] Speed: 1956.020534 samples/sec accuracy=1.000000, top_k_accuracy_5=0.800000
INFO:root:Epoch[0] Batch [5] Speed: 2144.635572 samples/sec accuracy=1.000000, top_k_accuracy_5=0.833333
INFO:root:Epoch[0] Batch [6] Speed: 2110.291864 samples/sec accuracy=1.000000, top_k_accuracy_5=0.857143
INFO:root:Epoch[0] Batch [7] Speed: 2191.077322 samples/sec accuracy=1.000000, top_k_accuracy_5=0.875000
INFO:root:Epoch[0] Batch [8] Speed: 2126.322487 samples/sec accuracy=1.000000, top_k_accuracy_5=0.888889
INFO:root:Epoch[0] Batch [9] Speed: 2168.509516 samples/sec accuracy=1.000000, top_k_accuracy_5=0.900000
INFO:root:Epoch[0] Batch [10] Speed: 2211.210742 samples/sec accuracy=1.000000, top_k_accuracy_5=0.909091
INFO:root:Epoch[0] Batch [11] Speed: 2181.843317 samples/sec accuracy=1.000000, top_k_accuracy_5=0.916667
INFO:root:Epoch[0] Batch [12] Speed: 1934.356274 samples/sec accuracy=1.000000, top_k_accuracy_5=0.923077
INFO:root:Epoch[0] Batch [13] Speed: 2177.674933 samples/sec accuracy=1.000000, top_k_accuracy_5=0.928571
INFO:root:Epoch[0] Batch [14] Speed: 2229.789643 samples/sec accuracy=1.000000, top_k_accuracy_5=0.933333
INFO:root:Epoch[0] Batch [15] Speed: 2218.034902 samples/sec accuracy=1.000000, top_k_accuracy_5=0.937500
INFO:root:Epoch[0] Batch [16] Speed: 2183.964593 samples/sec accuracy=1.000000, top_k_accuracy_5=0.941176
INFO:root:Epoch[0] Batch [17] Speed: 2190.022648 samples/sec accuracy=1.000000, top_k_accuracy_5=0.944444
INFO:root:Epoch[0] Batch [18] Speed: 2142.803765 samples/sec accuracy=1.000000, top_k_accuracy_5=0.947368
INFO:root:Epoch[0] Batch [19] Speed: 2128.269630 samples/sec accuracy=1.000000, top_k_accuracy_5=0.950000
INFO:root:Epoch[0] Batch [20] Speed: 2138.817161 samples/sec accuracy=1.000000, top_k_accuracy_5=0.952381
INFO:root:Epoch[0] Batch [21] Speed: 2122.069741 samples/sec accuracy=1.000000, top_k_accuracy_5=0.954545
INFO:root:Epoch[0] Batch [22] Speed: 2187.087427 samples/sec accuracy=1.000000, top_k_accuracy_5=0.956522
INFO:root:Epoch[0] Batch [23] Speed: 2153.997336 samples/sec accuracy=1.000000, top_k_accuracy_5=0.958333
INFO:root:Epoch[0] Batch [24] Speed: 2151.511277 samples/sec accuracy=1.000000, top_k_accuracy_5=0.960000
INFO:root:Epoch[0] Batch [25] Speed: 1927.757813 samples/sec accuracy=1.000000, top_k_accuracy_5=0.961538
INFO:root:Epoch[0] Batch [26] Speed: 2316.855017 samples/sec accuracy=1.000000, top_k_accuracy_5=0.962963
INFO:root:Epoch[0] Batch [27] Speed: 2260.004765 samples/sec accuracy=1.000000, top_k_accuracy_5=0.964286
INFO:root:Epoch[0] Batch [28] Speed: 2165.666585 samples/sec accuracy=1.000000, top_k_accuracy_5=0.965517
INFO:root:Epoch[0] Batch [29] Speed: 2325.495692 samples/sec accuracy=1.000000, top_k_accuracy_5=0.966667
INFO:root:Epoch[0] Batch [30] Speed: 2376.797025 samples/sec accuracy=1.000000, top_k_accuracy_5=0.967742
INFO:root:Epoch[0] Batch [31] Speed: 2369.652818 samples/sec accuracy=1.000000, top_k_accuracy_5=0.968750
INFO:root:Epoch[0] Batch [32] Speed: 2460.171438 samples/sec accuracy=1.000000, top_k_accuracy_5=0.969697
INFO:root:Epoch[0] Batch [33] Speed: 2185.253571 samples/sec accuracy=1.000000, top_k_accuracy_5=0.970588
INFO:root:Epoch[0] Batch [34] Speed: 2427.390954 samples/sec accuracy=1.000000, top_k_accuracy_5=0.971429
INFO:root:Epoch[0] Batch [35] Speed: 2398.479758 samples/sec accuracy=1.000000, top_k_accuracy_5=0.972222
INFO:root:Epoch[0] Batch [36] Speed: 2362.977769 samples/sec accuracy=1.000000, top_k_accuracy_5=0.972973
INFO:root:Epoch[0] Batch [37] Speed: 2300.701141 samples/sec accuracy=1.000000, top_k_accuracy_5=0.973684
INFO:root:Epoch[0] Batch [38] Speed: 2419.459984 samples/sec accuracy=1.000000, top_k_accuracy_5=0.974359
INFO:root:Epoch[0] Batch [39] Speed: 2052.376520 samples/sec accuracy=1.000000, top_k_accuracy_5=0.975000
INFO:root:Epoch[0] Batch [40] Speed: 2383.476415 samples/sec accuracy=1.000000, top_k_accuracy_5=0.975610
INFO:root:Epoch[0] Batch [41] Speed: 2334.881214 samples/sec accuracy=1.000000, top_k_accuracy_5=0.976190
INFO:root:Epoch[0] Batch [42] Speed: 2382.482158 samples/sec accuracy=1.000000, top_k_accuracy_5=0.976744
INFO:root:Epoch[0] Batch [43] Speed: 2269.846535 samples/sec accuracy=1.000000, top_k_accuracy_5=0.977273
INFO:root:Epoch[0] Batch [44] Speed: 2288.666934 samples/sec accuracy=1.000000, top_k_accuracy_5=0.977778
INFO:root:Epoch[0] Batch [45] Speed: 2362.738584 samples/sec accuracy=1.000000, top_k_accuracy_5=0.978261
INFO:root:Epoch[0] Batch [46] Speed: 2070.470430 samples/sec accuracy=1.000000, top_k_accuracy_5=0.978723
INFO:root:Epoch[0] Batch [47] Speed: 2272.854291 samples/sec accuracy=1.000000, top_k_accuracy_5=0.979167
INFO:root:Epoch[0] Batch [48] Speed: 2232.395025 samples/sec accuracy=1.000000, top_k_accuracy_5=0.979592
INFO:root:Epoch[0] Batch [49] Speed: 2197.246896 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980000
INFO:root:Epoch[0] Batch [50] Speed: 2404.936959 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980392
INFO:root:Epoch[0] Batch [51] Speed: 2411.732337 samples/sec accuracy=1.000000, top_k_accuracy_5=0.980769
INFO:root:Epoch[0] Batch [52] Speed: 2290.795835 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981132
INFO:root:Epoch[0] Batch [53] Speed: 1863.391049 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981481
INFO:root:Epoch[0] Batch [54] Speed: 1388.702278 samples/sec accuracy=1.000000, top_k_accuracy_5=0.981818
INFO:root:Epoch[0] Batch [55] Speed: 1917.601572 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982143
INFO:root:Epoch[0] Batch [56] Speed: 1562.342599 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982456
INFO:root:Epoch[0] Batch [57] Speed: 1610.213403 samples/sec accuracy=1.000000, top_k_accuracy_5=0.982759
INFO:root:Epoch[0] Batch [58] Speed: 1903.716551 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983051
INFO:root:Epoch[0] Batch [59] Speed: 2190.898493 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983333
INFO:root:Epoch[0] Batch [60] Speed: 1648.527213 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983607
INFO:root:Epoch[0] Batch [61] Speed: 1567.492583 samples/sec accuracy=1.000000, top_k_accuracy_5=0.983871
INFO:root:Epoch[0] Batch [62] Speed: 1439.671858 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984127
INFO:root:Epoch[0] Batch [63] Speed: 1838.045082 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984375
INFO:root:Epoch[0] Batch [64] Speed: 1925.883759 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984615
INFO:root:Epoch[0] Batch [65] Speed: 1418.372237 samples/sec accuracy=1.000000, top_k_accuracy_5=0.984848
INFO:root:Epoch[0] Batch [66] Speed: 1535.197685 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985075
INFO:root:Epoch[0] Batch [67] Speed: 1737.688131 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985294
INFO:root:Epoch[0] Batch [68] Speed: 1927.737047 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985507
INFO:root:Epoch[0] Batch [69] Speed: 1889.632021 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985714
INFO:root:Epoch[0] Batch [70] Speed: 1624.872618 samples/sec accuracy=1.000000, top_k_accuracy_5=0.985915
INFO:root:Epoch[0] Batch [71] Speed: 1791.229579 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986111
INFO:root:Epoch[0] Batch [72] Speed: 2030.901763 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986301
INFO:root:Epoch[0] Batch [73] Speed: 1657.909581 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986486
INFO:root:Epoch[0] Batch [74] Speed: 1754.525975 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986667
INFO:root:Epoch[0] Batch [75] Speed: 2114.081166 samples/sec accuracy=1.000000, top_k_accuracy_5=0.986842
INFO:root:Epoch[0] Batch [76] Speed: 2005.547071 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987013
INFO:root:Epoch[0] Batch [77] Speed: 1223.288891 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987179
INFO:root:Epoch[0] Batch [78] Speed: 1632.888602 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987342
INFO:root:Epoch[0] Batch [79] Speed: 1857.749099 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987500
INFO:root:Epoch[0] Batch [80] Speed: 1369.160001 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987654
INFO:root:Epoch[0] Batch [81] Speed: 1581.092164 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987805
INFO:root:Epoch[0] Batch [82] Speed: 1803.207970 samples/sec accuracy=1.000000, top_k_accuracy_5=0.987952
INFO:root:Epoch[0] Batch [83] Speed: 2025.232513 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988095
INFO:root:Epoch[0] Batch [84] Speed: 2021.739536 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988235
INFO:root:Epoch[0] Batch [85] Speed: 2089.666748 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988372
INFO:root:Epoch[0] Batch [86] Speed: 1606.815830 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988506
INFO:root:Epoch[0] Batch [87] Speed: 1784.661886 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988636
INFO:root:Epoch[0] Batch [88] Speed: 1571.737384 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988764
INFO:root:Epoch[0] Batch [89] Speed: 1813.305881 samples/sec accuracy=1.000000, top_k_accuracy_5=0.988889
INFO:root:Epoch[0] Batch [90] Speed: 2018.721515 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989011
INFO:root:Epoch[0] Batch [91] Speed: 1937.973237 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989130
INFO:root:Epoch[0] Batch [92] Speed: 2058.198976 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989247
INFO:root:Epoch[0] Batch [93] Speed: 2084.692704 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989362
INFO:root:Epoch[0] Batch [94] Speed: 1540.413033 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989474
INFO:root:Epoch[0] Batch [95] Speed: 1467.987477 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989583
INFO:root:Epoch[0] Batch [96] Speed: 1834.095430 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989691
INFO:root:Epoch[0] Batch [97] Speed: 1582.350376 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989796
INFO:root:Epoch[0] Batch [98] Speed: 1732.048380 samples/sec accuracy=1.000000, top_k_accuracy_5=0.989899
INFO:root:Epoch[0] Batch [99] Speed: 1598.191591 samples/sec accuracy=1.000000, top_k_accuracy_5=0.990000
INFO:root:[Epoch 0] training: accuracy=1.000000, top_k_accuracy_5=0.990000
INFO:root:[Epoch 0] time cost: 7.557455
Traceback (most recent call last):
File "../gluon/image_classification.py", line 290, in <module>
main()
File "../gluon/image_classification.py", line 274, in main
train(opt, context)
File "../gluon/image_classification.py", line 242, in train
name, val_acc = test(ctx, val_data)
File "../gluon/image_classification.py", line 166, in test
outputs.append(net(x))
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/model_zoo/vision/alexnet.py", line 65, in hybrid_forward
x = self.features(x)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/basic_layers.py", line 117, in hybrid_forward
x = block(x)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
out = self.forward(*args)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
return self.hybrid_forward(ndarray, x, *args, **params)
File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/conv_layers.py", line 135, in hybrid_forward
act = getattr(F, self._op_name)(x, weight, bias, name='fwd', **self._kwargs)
File "<string>", line 167, in Convolution
File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
ctypes.byref(out_stypes)))
File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 210, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [11:25:54] src/operator/nn/convolution.cc:283: Check failed: (*in_type)[i] == dtype (2 vs. 0) This layer requires uniform type. Expected 'float32' v.s. given 'float16' at 'weight'
Stack trace returned 10 entries:
[bt] (0) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d0ea) [0x7f28b86da0ea]
[bt] (1) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d711) [0x7f28b86da711]
[bt] (2) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x59284c) [0x7f28b891f84c]
[bt] (3) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26d963f) [0x7f28baa6663f]
[bt] (4) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26e2cad) [0x7f28baa6fcad]
[bt] (5) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x25ffe99) [0x7f28ba98ce99]
[bt] (6) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(MXImperativeInvokeEx+0x6f) [0x7f28ba98d48f]
[bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f29479b7e20]
[bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f29479b788b]
[bt] (9) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7f29479b201a]
```
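For anyone reading the check itself: the `(2 vs. 0)` in both messages appears to be a pair of MXNet's internal type flags. Below is a minimal sketch for decoding it, assuming the flag mapping 0 = float32, 1 = float64, 2 = float16. `decode_type_check` and `TYPE_FLAGS` are hypothetical names made up for this illustration and are not part of MXNet.

```python
import re

# Assumed mapping of MXNet's internal type flags for the dtypes relevant here.
TYPE_FLAGS = {0: "float32", 1: "float64", 2: "float16"}


def decode_type_check(message):
    """Hypothetical helper: name the two flags in a '(A vs. B)' check message.

    Returns (given, expected) as dtype names, or None if no flag pair is found.
    """
    match = re.search(r"\((\d+) vs\. (\d+)\)", message)
    if match is None:
        return None
    given, expected = (int(g) for g in match.groups())
    return TYPE_FLAGS.get(given, "unknown"), TYPE_FLAGS.get(expected, "unknown")


# The check line copied from the tracebacks above.
error_line = (
    "Check failed: (*in_type)[i] == dtype (2 vs. 0) This layer requires "
    "uniform type. Expected 'float32' v.s. given 'float16' at 'weight'"
)
print(decode_type_check(error_line))  # ('float16', 'float32')
```

Read this way, the convolution weight is float16 while the layer expects float32. That would be consistent with the input batch still being float32 when `test()` runs, though I have not confirmed this.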
## Minimum reproducible example
The existing example script example/gluon/image_classification.py, run with the commands listed under Steps to reproduce.
## Steps to reproduce
1. python3 ../gluon/image_classification.py --dataset dummy --gpus 2 --epochs 10 --mode imperative --model resnet50_v2 --batch-size 128 --log-interval 1 --dtype float16
2. python3 ../gluon/image_classification.py --dataset dummy --gpus 2 --epochs 10 --mode imperative --model alexnet --batch-size 128 --log-interval 1 --dtype float16
## What have you tried to solve it?
I have no idea how to solve this.