You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/01/23 06:04:31 UTC

[GitHub] stephenrawls opened a new issue #13967: Hybrid Variable Pass Through [Bug]

stephenrawls opened a new issue #13967: Hybrid Variable Pass Through [Bug]
URL: https://github.com/apache/incubator-mxnet/issues/13967
 
 
   ## Description
   The following code shows correct gradient for the foo parameter prior to hybridization. But if you uncomment the call to `net.hybridize()` then the code does not find the correct gradient (it crashes, error message below).
   
   ```
   import mxnet as mx
   
   class Layer(mx.gluon.HybridBlock):
       def __init__(self, **kwargs):
           super(Layer,self).__init__(**kwargs)
   
           with self.name_scope():
               self.foo = self.params.get(name="foo", shape=(1), init=mx.init.Uniform(1))
   
       def hybrid_forward(self, F, x, foo):
           return foo
   
   
   net = Layer()
   net.initialize()
   #net.hybridize()                                                                                                                                                                        
   
   with mx.autograd.record():
       y = net(mx.nd.ones(1))
       y = y * 2
       y.backward()
   
   print(net.collect_params()['layer0_foo'].grad())
   ```
   
   The error message I get after uncommenting the call to hybridize is:
   ```
   MXNetError: [21:53:22] src/imperative/imperative.cc:285: Check failed: !AGInfo::IsNone(*i) Cannot differentiate node because it is not in a computational graph. You need to set is_recording to true or use autograd.record() to save computational graphs for backward. If you want to differentiate the same graph twice, you need to pass retain_graph=True to backward.
   ```
   
   This is a contrived example, but we have a use case where we have a legitimate use to simply "pass through" the HybridBlock's parameter as one of the blocks outputs (our loss function needs it). The behavior in our actual code is not a crash, but to simply have zero gradients for the parameter. I'm not sure why this code crashes and our actual code does not, but this was my best attempt at creating a minimal reproducible example.
   
   Because the code works when it is a `mx.gluon.Block` object, and it works when it is a `mx.gluon.HybridBlock` object *but* we do not hybridize it, we suspect that the problem is a bug in the way pass-through variables are handled in hybridization.
   
   ## Environment info (Required)
   
   ```
   % python3 diagnose.py 
   ----------Python Info----------
   Version      : 3.7.2
   Compiler     : Clang 9.0.0 (clang-900.0.39.2)
   Build        : ('default', 'Dec 27 2018 07:35:45')
   Arch         : ('64bit', '')
   ------------Pip Info-----------
   Version      : 18.1
   Directory    : /usr/local/lib/python3.7/site-packages/pip
   ----------MXNet Info-----------
   Version      : 1.3.0
   Directory    : /usr/local/lib/python3.7/site-packages/mxnet
   Commit Hash   : b3be92f4a48bce62a5a8424271871c2f81c8f7f1
   ----------System Info----------
   Platform     : Darwin-16.7.0-x86_64-i386-64bit
   system       : Darwin
   node         : dca90487206c.ant.amazon.com
   release      : 16.7.0
   version      : Darwin Kernel Version 16.7.0: Sun Oct 28 22:30:19 PDT 2018; root:xnu-3789.73.27~1/RELEASE_X86_64
   ----------Hardware Info----------
   machine      : x86_64
   processor    : i386
   b'machdep.cpu.extfeatures: SYSCALL XD 1GBPAGE EM64T LAHF LZCNT PREFETCHW RDTSCP TSCI'
   b'machdep.cpu.leaf7_features: SMEP ERMS RDWRFSGS TSC_THREAD_OFFSET BMI1 HLE AVX2 BMI2 INVPCID RTM SMAP RDSEED ADX IPT SGX FPU_CSDS MPX CLFSOPT'
   b'machdep.cpu.features: FPU VME DE PSE TSC MSR PAE MCE CX8 APIC SEP MTRR PGE MCA CMOV PAT PSE36 CLFSH DS ACPI MMX FXSR SSE SSE2 SS HTT TM PBE SSE3 PCLMULQDQ DTES64 MON DSCPL VMX SMX EST TM2 SSSE3 FMA CX16 TPR PDCM SSE4.1 SSE4.2 x2APIC MOVBE POPCNT AES PCID XSAVE OSXSAVE SEGLIM64 TSCTMR AVX1.0 RDRAND F16C'
   b'machdep.cpu.brand_string: Intel(R) Core(TM) i7-7660U CPU @ 2.50GHz'
   ----------Network Test----------
   Setting timeout: 10
   Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0355 sec, LOAD: 0.6027 sec.
   Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.0962 sec, LOAD: 0.4145 sec.
   Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0371 sec, LOAD: 0.3045 sec.
   Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0411 sec, LOAD: 0.3711 sec.
   Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0315 sec, LOAD: 0.7996 sec.
   Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0332 sec, LOAD: 0.1842 sec.
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services