Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/06/14 15:28:20 UTC

[GitHub] ZaidQureshi opened a new issue #11284: Issues with Gluon example/gluon/image_classification.py example

URL: https://github.com/apache/incubator-mxnet/issues/11284
 
 
   
   ## Description
   When training a network (AlexNet or ResNet-50) with `--dtype float16`, the run fails once the first epoch's training finishes, with an error saying that a data type of float32 was expected but float16 was given. Also, there doesn't seem to be a way to train on synthetic data with this example (i.e., the script doesn't accept `--benchmark 1`), although the README for example/image_classification claims it's possible.
   
   ## Environment info (Required)
   
   ```
   ----------Python Info----------
   Version      : 3.5.2
   Compiler     : GCC 5.4.0 20160609
   Build        : ('default', 'Nov 23 2017 16:37:01')
   Arch         : ('64bit', 'ELF')
   ------------Pip Info-----------
   Version      : 10.0.1
   Directory    : /usr/local/lib/python3.5/dist-packages/pip
   ----------MXNet Info-----------
   Version      : 1.3.0
   Directory    : /usr/local/lib/python3.5/dist-packages/mxnet
   Commit Hash   : b434b8ec18f774c99b0830bd3ca66859212b4911
   ----------System Info----------
   Platform     : Linux-4.13.0-45-generic-x86_64-with-Ubuntu-16.04-xenial
   system       : Linux
   node         : css-host-8
   release      : 4.13.0-45-generic
   version      : #50~16.04.1-Ubuntu SMP Wed May 30 11:18:27 UTC 2018
   ----------Hardware Info----------
   machine      : x86_64
   processor    : x86_64
   Architecture:          x86_64
   CPU op-mode(s):        32-bit, 64-bit
   Byte Order:            Little Endian
   CPU(s):                40
   On-line CPU(s) list:   0-39
   Thread(s) per core:    2
   Core(s) per socket:    10
   Socket(s):             2
   NUMA node(s):          2
   Vendor ID:             GenuineIntel
   CPU family:            6
   Model:                 79
   Model name:            Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz
   Stepping:              1
   CPU MHz:               1200.189
   CPU max MHz:           3400.0000
   CPU min MHz:           1200.0000
   BogoMIPS:              4799.72
   Virtualization:        VT-x
   L1d cache:             32K
   L1i cache:             32K
   L2 cache:              256K
   L3 cache:              25600K
   NUMA node0 CPU(s):     0-9,20-29
   NUMA node1 CPU(s):     10-19,30-39
   Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti retpoline intel_ppin intel_pt spec_ctrl tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts
   ----------Network Test----------
   Setting timeout: 10
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0255 sec, LOAD: 0.1334 sec.
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0100 sec, LOAD: 0.5690 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.3449 sec, LOAD: 1.6452 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0099 sec, LOAD: 0.3464 sec.
   Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.2326 sec, LOAD: 0.6122 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.3419 sec, LOAD: 2.3122 sec.
   
   ```
   
   Package used (Python/R/Scala/Julia):
   I'm using Python3 package mxnet-cu91.
   
   ## Build info (Required if built from source)
   
   Compiler (gcc/clang/mingw/visual studio):
   gcc
   
   MXNet commit hash:
   b434b8ec18f774c99b0830bd3ca66859212b4911
   
   Build config:
   I am using the python3 mxnet-cu91 package.
   
   ## Error Message:
   
   Resnet50:
   ```
   
   INFO:root:Starting new image-classification task:, Namespace(batch_norm=False, batch_size=128, builtin_profiler=0, data_dir='', dataset='dummy', dtype='float16', epochs=10, gpus='2', kvstore='device', log_interval=1, lr=0.1, lr_factor=0.1, lr_steps='30,60,90', mode='imperative', model='resnet50_v1', momentum=0.9, num_workers=4, prefix='', profile=False, resume='', save_frequency=10, seed=123, start_epoch=0, use_pretrained=False, use_thumbnail=False, wd=0.0001)
   [11:23:39] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
   INFO:root:Epoch[0] Batch [0]	Speed: 38.761533 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.000000
   INFO:root:Epoch[0] Batch [1]	Speed: 430.549337 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.500000
   INFO:root:Epoch[0] Batch [2]	Speed: 629.449014 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.666667
   INFO:root:Epoch[0] Batch [3]	Speed: 632.102159 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.750000
   INFO:root:Epoch[0] Batch [4]	Speed: 636.633253 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.800000
   INFO:root:Epoch[0] Batch [5]	Speed: 644.789297 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.833333
   INFO:root:Epoch[0] Batch [6]	Speed: 632.178824 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.857143
   INFO:root:Epoch[0] Batch [7]	Speed: 634.513685 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.875000
   INFO:root:Epoch[0] Batch [8]	Speed: 638.223330 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.888889
   INFO:root:Epoch[0] Batch [9]	Speed: 642.008849 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.900000
   INFO:root:Epoch[0] Batch [10]	Speed: 642.620550 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.909091
   INFO:root:Epoch[0] Batch [11]	Speed: 635.800896 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.916667
   INFO:root:Epoch[0] Batch [12]	Speed: 641.673524 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.923077
   INFO:root:Epoch[0] Batch [13]	Speed: 628.907794 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.928571
   INFO:root:Epoch[0] Batch [14]	Speed: 631.297189 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.933333
   INFO:root:Epoch[0] Batch [15]	Speed: 635.057088 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.937500
   INFO:root:Epoch[0] Batch [16]	Speed: 629.546444 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.941176
   INFO:root:Epoch[0] Batch [17]	Speed: 633.887375 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.944444
   INFO:root:Epoch[0] Batch [18]	Speed: 637.994282 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.947368
   INFO:root:Epoch[0] Batch [19]	Speed: 625.316271 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.950000
   INFO:root:Epoch[0] Batch [20]	Speed: 633.214498 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.952381
   INFO:root:Epoch[0] Batch [21]	Speed: 632.507277 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.954545
   INFO:root:Epoch[0] Batch [22]	Speed: 637.411028 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.956522
   INFO:root:Epoch[0] Batch [23]	Speed: 634.509185 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.958333
   INFO:root:Epoch[0] Batch [24]	Speed: 630.860258 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.960000
   INFO:root:Epoch[0] Batch [25]	Speed: 632.370939 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.961538
   INFO:root:Epoch[0] Batch [26]	Speed: 631.902770 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.962963
   INFO:root:Epoch[0] Batch [27]	Speed: 628.879800 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.964286
   INFO:root:Epoch[0] Batch [28]	Speed: 637.116021 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.965517
   INFO:root:Epoch[0] Batch [29]	Speed: 623.945647 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.966667
   INFO:root:Epoch[0] Batch [30]	Speed: 614.313059 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.967742
   INFO:root:Epoch[0] Batch [31]	Speed: 598.348864 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.968750
   INFO:root:Epoch[0] Batch [32]	Speed: 630.654243 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.969697
   INFO:root:Epoch[0] Batch [33]	Speed: 626.203179 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.970588
   INFO:root:Epoch[0] Batch [34]	Speed: 621.371898 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.971429
   INFO:root:Epoch[0] Batch [35]	Speed: 625.962240 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.972222
   INFO:root:Epoch[0] Batch [36]	Speed: 638.778429 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.972973
   INFO:root:Epoch[0] Batch [37]	Speed: 638.853680 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.973684
   INFO:root:Epoch[0] Batch [38]	Speed: 635.063098 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.974359
   INFO:root:Epoch[0] Batch [39]	Speed: 641.104962 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.975000
   INFO:root:Epoch[0] Batch [40]	Speed: 628.048478 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.975610
   INFO:root:Epoch[0] Batch [41]	Speed: 624.718154 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.976190
   INFO:root:Epoch[0] Batch [42]	Speed: 630.666097 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.976744
   INFO:root:Epoch[0] Batch [43]	Speed: 629.468940 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.977273
   INFO:root:Epoch[0] Batch [44]	Speed: 634.288790 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.977778
   INFO:root:Epoch[0] Batch [45]	Speed: 620.335145 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.978261
   INFO:root:Epoch[0] Batch [46]	Speed: 630.456506 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.978723
   INFO:root:Epoch[0] Batch [47]	Speed: 631.097565 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.979167
   INFO:root:Epoch[0] Batch [48]	Speed: 625.456142 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.979592
   INFO:root:Epoch[0] Batch [49]	Speed: 597.499816 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980000
   INFO:root:Epoch[0] Batch [50]	Speed: 631.840300 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980392
   INFO:root:Epoch[0] Batch [51]	Speed: 635.177303 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980769
   INFO:root:Epoch[0] Batch [52]	Speed: 629.745088 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981132
   INFO:root:Epoch[0] Batch [53]	Speed: 633.966719 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981481
   INFO:root:Epoch[0] Batch [54]	Speed: 633.290685 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981818
   INFO:root:Epoch[0] Batch [55]	Speed: 629.980078 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982143
   INFO:root:Epoch[0] Batch [56]	Speed: 627.281642 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982456
   INFO:root:Epoch[0] Batch [57]	Speed: 636.660431 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982759
   INFO:root:Epoch[0] Batch [58]	Speed: 638.425210 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983051
   INFO:root:Epoch[0] Batch [59]	Speed: 628.249118 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983333
   INFO:root:Epoch[0] Batch [60]	Speed: 634.437953 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983607
   INFO:root:Epoch[0] Batch [61]	Speed: 629.313991 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983871
   INFO:root:Epoch[0] Batch [62]	Speed: 629.671966 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984127
   INFO:root:Epoch[0] Batch [63]	Speed: 636.321618 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984375
   INFO:root:Epoch[0] Batch [64]	Speed: 636.397046 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984615
   INFO:root:Epoch[0] Batch [65]	Speed: 630.730557 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984848
   INFO:root:Epoch[0] Batch [66]	Speed: 635.813696 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985075
   INFO:root:Epoch[0] Batch [67]	Speed: 639.850346 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985294
   INFO:root:Epoch[0] Batch [68]	Speed: 630.247795 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985507
   INFO:root:Epoch[0] Batch [69]	Speed: 642.182405 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985714
   INFO:root:Epoch[0] Batch [70]	Speed: 629.762078 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985915
   INFO:root:Epoch[0] Batch [71]	Speed: 635.816708 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986111
   INFO:root:Epoch[0] Batch [72]	Speed: 631.053798 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986301
   INFO:root:Epoch[0] Batch [73]	Speed: 635.151002 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986486
   INFO:root:Epoch[0] Batch [74]	Speed: 631.234839 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986667
   INFO:root:Epoch[0] Batch [75]	Speed: 637.794947 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986842
   INFO:root:Epoch[0] Batch [76]	Speed: 635.627762 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987013
   INFO:root:Epoch[0] Batch [77]	Speed: 634.704971 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987179
   INFO:root:Epoch[0] Batch [78]	Speed: 626.428223 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987342
   INFO:root:Epoch[0] Batch [79]	Speed: 631.866328 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987500
   INFO:root:Epoch[0] Batch [80]	Speed: 633.852949 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987654
   INFO:root:Epoch[0] Batch [81]	Speed: 628.484464 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987805
   INFO:root:Epoch[0] Batch [82]	Speed: 639.696342 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987952
   INFO:root:Epoch[0] Batch [83]	Speed: 628.357943 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988095
   INFO:root:Epoch[0] Batch [84]	Speed: 632.414144 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988235
   INFO:root:Epoch[0] Batch [85]	Speed: 635.123201 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988372
   INFO:root:Epoch[0] Batch [86]	Speed: 637.747216 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988506
   INFO:root:Epoch[0] Batch [87]	Speed: 630.293670 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988636
   INFO:root:Epoch[0] Batch [88]	Speed: 634.680210 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988764
   INFO:root:Epoch[0] Batch [89]	Speed: 630.405426 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988889
   INFO:root:Epoch[0] Batch [90]	Speed: 630.142012 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989011
   INFO:root:Epoch[0] Batch [91]	Speed: 635.331395 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989130
   INFO:root:Epoch[0] Batch [92]	Speed: 637.355788 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989247
   INFO:root:Epoch[0] Batch [93]	Speed: 630.492786 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989362
   INFO:root:Epoch[0] Batch [94]	Speed: 627.613096 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989474
   INFO:root:Epoch[0] Batch [95]	Speed: 636.513241 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989583
   INFO:root:Epoch[0] Batch [96]	Speed: 639.884664 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989691
   INFO:root:Epoch[0] Batch [97]	Speed: 635.479544 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989796
   INFO:root:Epoch[0] Batch [98]	Speed: 637.682071 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989899
   INFO:root:Epoch[0] Batch [99]	Speed: 631.076052 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.990000
   INFO:root:[Epoch 0] training: accuracy=1.000000, top_k_accuracy_5=0.990000
   INFO:root:[Epoch 0] time cost: 23.480652
   Traceback (most recent call last):
     File "../gluon/image_classification.py", line 290, in <module>
       main()
     File "../gluon/image_classification.py", line 274, in main
       train(opt, context)
     File "../gluon/image_classification.py", line 242, in train
       name, val_acc = test(ctx, val_data)
     File "../gluon/image_classification.py", line 166, in test
       outputs.append(net(x))
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/model_zoo/vision/resnet.py", line 279, in hybrid_forward
       x = self.features(x)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/basic_layers.py", line 117, in hybrid_forward
       x = block(x)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/conv_layers.py", line 133, in hybrid_forward
       act = getattr(F, self._op_name)(x, weight, name='fwd', **self._kwargs)
     File "<string>", line 167, in Convolution
     File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
       ctypes.byref(out_stypes)))
     File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 210, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [11:24:02] src/operator/nn/convolution.cc:283: Check failed: (*in_type)[i] == dtype (2 vs. 0) This layer requires uniform type. Expected 'float32' v.s. given 'float16' at 'weight'
   
   Stack trace returned 10 entries:
   [bt] (0) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d0ea) [0x7fcc47e1a0ea]
   [bt] (1) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d711) [0x7fcc47e1a711]
   [bt] (2) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x59284c) [0x7fcc4805f84c]
   [bt] (3) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26d963f) [0x7fcc4a1a663f]
   [bt] (4) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26e2cad) [0x7fcc4a1afcad]
   [bt] (5) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x25ffe99) [0x7fcc4a0cce99]
   [bt] (6) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(MXImperativeInvokeEx+0x6f) [0x7fcc4a0cd48f]
   [bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7fccd70f7e20]
   [bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7fccd70f788b]
   [bt] (9) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7fccd70f201a]
   
   ```
   Alexnet:
   ```
   
   INFO:root:Starting new image-classification task:, Namespace(batch_norm=False, batch_size=128, builtin_profiler=0, data_dir='', dataset='dummy', dtype='float16', epochs=10, gpus='2', kvstore='device', log_interval=1, lr=0.1, lr_factor=0.1, lr_steps='30,60,90', mode='imperative', model='alexnet', momentum=0.9, num_workers=4, prefix='', profile=False, resume='', save_frequency=10, seed=123, start_epoch=0, use_pretrained=False, use_thumbnail=False, wd=0.0001)
   [11:25:47] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)
   INFO:root:Epoch[0] Batch [0]	Speed: 129.885558 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.000000
   INFO:root:Epoch[0] Batch [1]	Speed: 1461.012374 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.500000
   INFO:root:Epoch[0] Batch [2]	Speed: 1709.083278 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.666667
   INFO:root:Epoch[0] Batch [3]	Speed: 2039.070355 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.750000
   INFO:root:Epoch[0] Batch [4]	Speed: 1956.020534 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.800000
   INFO:root:Epoch[0] Batch [5]	Speed: 2144.635572 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.833333
   INFO:root:Epoch[0] Batch [6]	Speed: 2110.291864 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.857143
   INFO:root:Epoch[0] Batch [7]	Speed: 2191.077322 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.875000
   INFO:root:Epoch[0] Batch [8]	Speed: 2126.322487 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.888889
   INFO:root:Epoch[0] Batch [9]	Speed: 2168.509516 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.900000
   INFO:root:Epoch[0] Batch [10]	Speed: 2211.210742 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.909091
   INFO:root:Epoch[0] Batch [11]	Speed: 2181.843317 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.916667
   INFO:root:Epoch[0] Batch [12]	Speed: 1934.356274 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.923077
   INFO:root:Epoch[0] Batch [13]	Speed: 2177.674933 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.928571
   INFO:root:Epoch[0] Batch [14]	Speed: 2229.789643 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.933333
   INFO:root:Epoch[0] Batch [15]	Speed: 2218.034902 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.937500
   INFO:root:Epoch[0] Batch [16]	Speed: 2183.964593 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.941176
   INFO:root:Epoch[0] Batch [17]	Speed: 2190.022648 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.944444
   INFO:root:Epoch[0] Batch [18]	Speed: 2142.803765 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.947368
   INFO:root:Epoch[0] Batch [19]	Speed: 2128.269630 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.950000
   INFO:root:Epoch[0] Batch [20]	Speed: 2138.817161 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.952381
   INFO:root:Epoch[0] Batch [21]	Speed: 2122.069741 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.954545
   INFO:root:Epoch[0] Batch [22]	Speed: 2187.087427 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.956522
   INFO:root:Epoch[0] Batch [23]	Speed: 2153.997336 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.958333
   INFO:root:Epoch[0] Batch [24]	Speed: 2151.511277 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.960000
   INFO:root:Epoch[0] Batch [25]	Speed: 1927.757813 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.961538
   INFO:root:Epoch[0] Batch [26]	Speed: 2316.855017 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.962963
   INFO:root:Epoch[0] Batch [27]	Speed: 2260.004765 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.964286
   INFO:root:Epoch[0] Batch [28]	Speed: 2165.666585 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.965517
   INFO:root:Epoch[0] Batch [29]	Speed: 2325.495692 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.966667
   INFO:root:Epoch[0] Batch [30]	Speed: 2376.797025 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.967742
   INFO:root:Epoch[0] Batch [31]	Speed: 2369.652818 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.968750
   INFO:root:Epoch[0] Batch [32]	Speed: 2460.171438 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.969697
   INFO:root:Epoch[0] Batch [33]	Speed: 2185.253571 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.970588
   INFO:root:Epoch[0] Batch [34]	Speed: 2427.390954 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.971429
   INFO:root:Epoch[0] Batch [35]	Speed: 2398.479758 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.972222
   INFO:root:Epoch[0] Batch [36]	Speed: 2362.977769 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.972973
   INFO:root:Epoch[0] Batch [37]	Speed: 2300.701141 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.973684
   INFO:root:Epoch[0] Batch [38]	Speed: 2419.459984 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.974359
   INFO:root:Epoch[0] Batch [39]	Speed: 2052.376520 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.975000
   INFO:root:Epoch[0] Batch [40]	Speed: 2383.476415 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.975610
   INFO:root:Epoch[0] Batch [41]	Speed: 2334.881214 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.976190
   INFO:root:Epoch[0] Batch [42]	Speed: 2382.482158 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.976744
   INFO:root:Epoch[0] Batch [43]	Speed: 2269.846535 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.977273
   INFO:root:Epoch[0] Batch [44]	Speed: 2288.666934 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.977778
   INFO:root:Epoch[0] Batch [45]	Speed: 2362.738584 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.978261
   INFO:root:Epoch[0] Batch [46]	Speed: 2070.470430 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.978723
   INFO:root:Epoch[0] Batch [47]	Speed: 2272.854291 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.979167
   INFO:root:Epoch[0] Batch [48]	Speed: 2232.395025 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.979592
   INFO:root:Epoch[0] Batch [49]	Speed: 2197.246896 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980000
   INFO:root:Epoch[0] Batch [50]	Speed: 2404.936959 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980392
   INFO:root:Epoch[0] Batch [51]	Speed: 2411.732337 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.980769
   INFO:root:Epoch[0] Batch [52]	Speed: 2290.795835 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981132
   INFO:root:Epoch[0] Batch [53]	Speed: 1863.391049 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981481
   INFO:root:Epoch[0] Batch [54]	Speed: 1388.702278 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.981818
   INFO:root:Epoch[0] Batch [55]	Speed: 1917.601572 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982143
   INFO:root:Epoch[0] Batch [56]	Speed: 1562.342599 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982456
   INFO:root:Epoch[0] Batch [57]	Speed: 1610.213403 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.982759
   INFO:root:Epoch[0] Batch [58]	Speed: 1903.716551 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983051
   INFO:root:Epoch[0] Batch [59]	Speed: 2190.898493 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983333
   INFO:root:Epoch[0] Batch [60]	Speed: 1648.527213 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983607
   INFO:root:Epoch[0] Batch [61]	Speed: 1567.492583 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.983871
   INFO:root:Epoch[0] Batch [62]	Speed: 1439.671858 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984127
   INFO:root:Epoch[0] Batch [63]	Speed: 1838.045082 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984375
   INFO:root:Epoch[0] Batch [64]	Speed: 1925.883759 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984615
   INFO:root:Epoch[0] Batch [65]	Speed: 1418.372237 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.984848
   INFO:root:Epoch[0] Batch [66]	Speed: 1535.197685 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985075
   INFO:root:Epoch[0] Batch [67]	Speed: 1737.688131 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985294
   INFO:root:Epoch[0] Batch [68]	Speed: 1927.737047 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985507
   INFO:root:Epoch[0] Batch [69]	Speed: 1889.632021 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985714
   INFO:root:Epoch[0] Batch [70]	Speed: 1624.872618 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.985915
   INFO:root:Epoch[0] Batch [71]	Speed: 1791.229579 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986111
   INFO:root:Epoch[0] Batch [72]	Speed: 2030.901763 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986301
   INFO:root:Epoch[0] Batch [73]	Speed: 1657.909581 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986486
   INFO:root:Epoch[0] Batch [74]	Speed: 1754.525975 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986667
   INFO:root:Epoch[0] Batch [75]	Speed: 2114.081166 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.986842
   INFO:root:Epoch[0] Batch [76]	Speed: 2005.547071 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987013
   INFO:root:Epoch[0] Batch [77]	Speed: 1223.288891 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987179
   INFO:root:Epoch[0] Batch [78]	Speed: 1632.888602 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987342
   INFO:root:Epoch[0] Batch [79]	Speed: 1857.749099 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987500
   INFO:root:Epoch[0] Batch [80]	Speed: 1369.160001 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987654
   INFO:root:Epoch[0] Batch [81]	Speed: 1581.092164 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987805
   INFO:root:Epoch[0] Batch [82]	Speed: 1803.207970 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.987952
   INFO:root:Epoch[0] Batch [83]	Speed: 2025.232513 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988095
   INFO:root:Epoch[0] Batch [84]	Speed: 2021.739536 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988235
   INFO:root:Epoch[0] Batch [85]	Speed: 2089.666748 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988372
   INFO:root:Epoch[0] Batch [86]	Speed: 1606.815830 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988506
   INFO:root:Epoch[0] Batch [87]	Speed: 1784.661886 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988636
   INFO:root:Epoch[0] Batch [88]	Speed: 1571.737384 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988764
   INFO:root:Epoch[0] Batch [89]	Speed: 1813.305881 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.988889
   INFO:root:Epoch[0] Batch [90]	Speed: 2018.721515 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989011
   INFO:root:Epoch[0] Batch [91]	Speed: 1937.973237 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989130
   INFO:root:Epoch[0] Batch [92]	Speed: 2058.198976 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989247
   INFO:root:Epoch[0] Batch [93]	Speed: 2084.692704 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989362
   INFO:root:Epoch[0] Batch [94]	Speed: 1540.413033 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989474
   INFO:root:Epoch[0] Batch [95]	Speed: 1467.987477 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989583
   INFO:root:Epoch[0] Batch [96]	Speed: 1834.095430 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989691
   INFO:root:Epoch[0] Batch [97]	Speed: 1582.350376 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989796
   INFO:root:Epoch[0] Batch [98]	Speed: 1732.048380 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.989899
   INFO:root:Epoch[0] Batch [99]	Speed: 1598.191591 samples/sec	accuracy=1.000000, top_k_accuracy_5=0.990000
   INFO:root:[Epoch 0] training: accuracy=1.000000, top_k_accuracy_5=0.990000
   INFO:root:[Epoch 0] time cost: 7.557455
   Traceback (most recent call last):
     File "../gluon/image_classification.py", line 290, in <module>
       main()
     File "../gluon/image_classification.py", line 274, in main
       train(opt, context)
     File "../gluon/image_classification.py", line 242, in train
       name, val_acc = test(ctx, val_data)
     File "../gluon/image_classification.py", line 166, in test
       outputs.append(net(x))
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/model_zoo/vision/alexnet.py", line 65, in hybrid_forward
       x = self.features(x)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/basic_layers.py", line 117, in hybrid_forward
       x = block(x)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 481, in __call__
       out = self.forward(*args)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/block.py", line 821, in forward
       return self.hybrid_forward(ndarray, x, *args, **params)
     File "/usr/local/lib/python3.5/dist-packages/mxnet/gluon/nn/conv_layers.py", line 135, in hybrid_forward
       act = getattr(F, self._op_name)(x, weight, bias, name='fwd', **self._kwargs)
     File "<string>", line 167, in Convolution
     File "/usr/local/lib/python3.5/dist-packages/mxnet/_ctypes/ndarray.py", line 92, in _imperative_invoke
       ctypes.byref(out_stypes)))
     File "/usr/local/lib/python3.5/dist-packages/mxnet/base.py", line 210, in check_call
       raise MXNetError(py_str(_LIB.MXGetLastError()))
   mxnet.base.MXNetError: [11:25:54] src/operator/nn/convolution.cc:283: Check failed: (*in_type)[i] == dtype (2 vs. 0) This layer requires uniform type. Expected 'float32' v.s. given 'float16' at 'weight'
   
   Stack trace returned 10 entries:
   [bt] (0) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d0ea) [0x7f28b86da0ea]
   [bt] (1) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x34d711) [0x7f28b86da711]
   [bt] (2) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x59284c) [0x7f28b891f84c]
   [bt] (3) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26d963f) [0x7f28baa6663f]
   [bt] (4) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x26e2cad) [0x7f28baa6fcad]
   [bt] (5) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(+0x25ffe99) [0x7f28ba98ce99]
   [bt] (6) /usr/local/lib/python3.5/dist-packages/mxnet/libmxnet.so(MXImperativeInvokeEx+0x6f) [0x7f28ba98d48f]
   [bt] (7) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call_unix64+0x4c) [0x7f29479b7e20]
   [bt] (8) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(ffi_call+0x2eb) [0x7f29479b788b]
   [bt] (9) /usr/lib/python3.5/lib-dynload/_ctypes.cpython-35m-x86_64-linux-gnu.so(_ctypes_callproc+0x49a) [0x7f29479b201a]
   
   ```
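
Both traces end in the same check failure, raised from `test()` at `net(x)` right after the epoch 0 training summary is logged. For triage, the mismatch can be pulled out of the message mechanically. This is a minimal sketch that assumes only the message format seen in the two traces above; `parse_dtype_mismatch` is a hypothetical helper for illustration, not part of MXNet.

```python
import re

# Matches the dtype-mismatch text quoted in the traces above.
# Assumption: the message format is exactly what those two traces show.
MISMATCH = re.compile(
    r"Expected '(?P<expected>\w+)' v\.s\. given '(?P<given>\w+)' at '(?P<arg>\w+)'"
)


def parse_dtype_mismatch(message):
    """Return (expected, given, argument), or None if the pattern is absent."""
    match = MISMATCH.search(message)
    if match is None:
        return None
    return match.group("expected"), match.group("given"), match.group("arg")


# Message text copied from the traces above (timestamp omitted).
error = (
    "src/operator/nn/convolution.cc:283: Check failed: (*in_type)[i] == dtype "
    "(2 vs. 0) This layer requires uniform type. "
    "Expected 'float32' v.s. given 'float16' at 'weight'"
)

print(parse_dtype_mismatch(error))  # ('float32', 'float16', 'weight')
```

For both models the result is the same: the convolution expects float32 but its `weight` argument is float16.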
   ## Minimum reproducible example
   (If you are using your own code, please provide a short script that reproduces the error. Otherwise, please provide link to the existing example.)
   
   ## Steps to reproduce
   (Paste the commands you ran that produced the error.)
   
   1. python3 ../gluon/image_classification.py --dataset dummy --gpus 2 --epochs 10  --mode imperative  --model resnet50_v2  --batch-size 128 --log-interval 1 --dtype float16
   2. python3 ../gluon/image_classification.py --dataset dummy --gpus 2 --epochs 10  --mode imperative  --model alexnet  --batch-size 128 --log-interval 1 --dtype float16
   
   ## What have you tried to solve it?
   
   I have no idea how to solve this.
   
