Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2019/02/06 23:48:13 UTC
[GitHub] satyakrishnagorti opened a new issue #14080: Adam Optimizer Memory Leak in Scala
URL: https://github.com/apache/incubator-mxnet/issues/14080
## Description
There is a memory leak when using the Adam optimizer with the MXNet Scala bindings. Running the code below consumes more and more memory until the process runs out.
## Steps to Reproduce
```scala
// Simple MLP network
def mlpNetwork(): Symbol = {
  val input = Symbol.Variable("data")
  val label = Symbol.Variable("label")
  val fc1 = Symbol.FullyConnected(name = "fc1")()(Map("data" -> input, "num_hidden" -> 128))
  val act1 = Symbol.Activation(name = "relu")()(Map("data" -> fc1, "act_type" -> "relu"))
  val fc2 = Symbol.FullyConnected(name = "fc2")()(Map("data" -> act1, "num_hidden" -> 1))
  val loss = Symbol.LinearRegressionOutput(name = "loss")()(Map("data" -> fc2, "label" -> label))
  loss
}

def getNDArrayIter(): NDArrayIter = {
  val f = NDArray.zeros(100, 20, 20)
  val l = NDArray.zeros(100, 1)
  val data = Array(f)
  val labels = Array(l)
  val batchSize = 10
  new NDArrayIter(data, labels, batchSize)
}

val net = mlpNetwork()
val iter = getNDArrayIter()
val optimizer = new Adam(0.001f, 0.9f, 0.999f, 1e-8f, 1 - 1e-8f, 0f, 10f, null)
val init = new Normal(0.01f)
val model = FeedForward.newBuilder(net)
  .setContext(Array(Context.gpu(0)))
  .setInitializer(init)
  .setNumEpoch(100000)
  .setOptimizer(optimizer)
  .setTrainData(iter)
  .setEvalData(iter)
  .build()
```
## Issue
The issue is (I think) that some temporary NDArrays are not getting disposed by the Adam optimizer when it relies on `disposeDepsExcept`. The leak happens at exactly the three places where `disposeDepsExcept` is called in Adam's `update` method.
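To make the pattern concrete: every operator on `NDArray` allocates a fresh native handle, so a chained expression such as `beta1t * mean + (1.0 - beta1t) * resdGrad` produces unnamed intermediate arrays that only `disposeDepsExcept` can reach by walking the result's dependency chain. The sketch below is a toy model of that intended behavior (the `Tracked` class and its `deps` list are hypothetical stand-ins, not the MXNet API):

```scala
// Toy stand-in for NDArray: every operator result is a new native handle.
object DisposeDemo {
  var live = 0 // number of undisposed handles

  class Tracked(val deps: List[Tracked] = Nil) {
    live += 1
    private var disposed = false
    def dispose(): Unit = if (!disposed) { disposed = true; live -= 1 }
    // Each op allocates a new Tracked that remembers its inputs.
    def +(other: Tracked): Tracked = new Tracked(List(this, other))
    def *(other: Tracked): Tracked = new Tracked(List(this, other))
    // Walk the dependency chain, disposing everything not listed in `keep`.
    def disposeDepsExcept(keep: Tracked*): Tracked = {
      def walk(t: Tracked): Unit = t.deps.foreach { d =>
        if (!keep.exists(_ eq d)) { d.dispose(); walk(d) }
      }
      walk(this)
      this
    }
  }

  def main(args: Array[String]): Unit = {
    val mean = new Tracked()
    val grad = new Tracked()
    // The chained expression creates two unnamed intermediates plus the result.
    val meanT = (mean * grad + mean * grad).disposeDepsExcept(mean, grad)
    println(s"live handles: $live") // 3: mean, grad, meanT
  }
}
```

When the dependency walk works as modeled here, only the kept inputs and the result stay live; the leak reported above suggests the real walk is not reclaiming those intermediates.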
## Temporary Fix
The workaround is to replace the three statements in the `update` method of `Adam.scala` that call `disposeDepsExcept`, disposing each temporary NDArray explicitly instead. That is, instead of the following three statements in `Adam.scala`:
```scala
val meanT = (beta1t * mean + (1.0 - beta1t) * resdGrad)
  .disposeDepsExcept(mean, resdGrad)
val varianceT = (beta2 * variance + (1.0f - beta2) * resdGrad * resdGrad)
  .disposeDepsExcept(variance, resdGrad)
val step = (learningRate * meanT / (NDArray.sqrt(varianceT) + epsilon))
  .disposeDepsExcept(meanT, varianceT)
```
use the following, which names and disposes each temporary explicitly:
```scala
val beta1Mean = beta1t * mean
val beta1ResGrad = (1.0 - beta1t) * resdGrad
val meanT = beta1Mean + beta1ResGrad
// dispose temp NDArrays
beta1Mean.dispose()
beta1ResGrad.dispose()

val beta2Variance = beta2 * variance
val beta2ResGrad = (1.0f - beta2) * resdGrad
val beta2ResGradSquare = beta2ResGrad * resdGrad
val varianceT = beta2Variance + beta2ResGradSquare
// dispose temp NDArrays
beta2Variance.dispose()
beta2ResGrad.dispose()
beta2ResGradSquare.dispose()

val lrMeanT = learningRate * meanT
val sqrtVar = NDArray.sqrt(varianceT)
val sqrtVarPlusEpsilon = sqrtVar + epsilon
val step = lrMeanT / sqrtVarPlusEpsilon
// dispose temp NDArrays
lrMeanT.dispose()
sqrtVar.dispose()
sqrtVarPlusEpsilon.dispose()
```
The above changes fix things for now, but for some reason `disposeDepsExcept` is not doing its job in this case.
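A less error-prone variant of the manual workaround is a small scope helper that tracks temporaries and disposes them all in one place. The sketch below is a hypothetical pattern, not part of the MXNet API; `Disposable` stands in for anything with a `dispose()` method, such as `NDArray`:

```scala
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for anything with dispose(), e.g. NDArray.
trait Disposable { def dispose(): Unit }

final class TempScope {
  private val temps = ArrayBuffer.empty[Disposable]
  // Register a temporary; it is returned unchanged so calls can be inlined.
  def track[D <: Disposable](d: D): D = { temps += d; d }
  def disposeAll(): Unit = temps.foreach(_.dispose())
}

object TempScope {
  // Run `body` with a fresh scope; every tracked value is disposed on exit,
  // even if the body throws.
  def withTemps[A](body: TempScope => A): A = {
    val scope = new TempScope
    try body(scope)
    finally scope.disposeAll()
  }
}
```

With such a helper, the first line of the Adam update could read `val meanT = s.track(beta1t * mean) + s.track((1.0 - beta1t) * resdGrad)` inside `TempScope.withTemps { s => ... }`, keeping only `meanT`, `varianceT`, and `step` alive. Later MXNet Scala releases added a `ResourceScope` mechanism along similar lines.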
## Environment info (Required)
```
----------Python Info----------
Version : 3.7.1
Compiler : GCC 7.3.0
Build : ('default', 'Dec 14 2018 19:28:38')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 18.1
Directory : /home/satya/anaconda3/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version : 1.3.1
Directory : /home/satya/Documents/workspace/mxnet_1.3.x/python/mxnet
Hashtag not found. Not installed from pre-built package.
----------System Info----------
Platform : Linux-4.4.0-141-generic-x86_64-with-debian-stretch-sid
system : Linux
node : DS5
release : 4.4.0-141-generic
version : #167-Ubuntu SMP Wed Dec 5 10:40:15 UTC 2018
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0405 sec, LOAD: 0.6186 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.1403 sec, LOAD: 0.4726 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.2418 sec, LOAD: 0.4049 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0445 sec, LOAD: 0.1894 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0779 sec, LOAD: 0.2447 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0409 sec, LOAD: 0.0746 sec.
```
Package used (Python/R/Scala/Julia): Scala
For Scala user, please provide:
1. Java version: 1.8.0_201
2. Maven version: 3.6.0
3. Scala runtime if applicable: 2.11.6
## Build info (Required if built from source)
Compiler (gcc/clang/mingw/visual studio): gcc
MXNet commit hash: 96b4b6ef3c60c63644a7c4d672109b97561b839d