Posted to issues@spark.apache.org by "yangping wu (JIRA)" <ji...@apache.org> on 2015/05/05 04:10:10 UTC

[jira] [Updated] (SPARK-7353) Driver memory leak?

     [ https://issues.apache.org/jira/browse/SPARK-7353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

yangping wu updated SPARK-7353:
-------------------------------
    Description: 
Hi all, I am using Spark Streaming to read data from Kafka. My Spark version is 1.3.1, and the code is as follows:
{code}
// Imports needed for the Kafka direct stream API in Spark 1.3.x
import kafka.serializer.StringDecoder

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka.KafkaUtils

object Test {
  def main(args: Array[String]) {
    val brokerAddress = "192.168.246.66:9092,192.168.246.67:9092,192.168.246.68:9092"
    val kafkaParams = Map[String, String](
      "metadata.broker.list" -> brokerAddress,
      "group.id" -> args(0))

    val sparkConf = new SparkConf().setAppName("Test")
    sparkConf.set("spark.kryo.registrator", "utils.MyKryoSerializer")
    val sc = new SparkContext(sparkConf)

    val ssc = new StreamingContext(sc, Seconds(2))
    val topicsSet = Set("sparktopic")

    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
    messages.foreachRDD(rdd =>{
      if(!rdd.isEmpty()){
        rdd.count()
      }
    })

    ssc.start()
    ssc.awaitTermination()
    ssc.stop()
  }
}
{code}
The program has been running for about 120 hours; below is the *jmap -histo:live* result for the program:
{code}
num     #instances         #bytes  class name
----------------------------------------------
   1:         30148      139357920  [B
   2:       2102205       67270560  java.util.HashMap$Entry
   3:       2143056       51433344  java.lang.Long
   4:        520430       26570456  [C
   5:        119224       15271104  <methodKlass>
   6:        119224       14747984  <constMethodKlass>
   7:          3449       13476384  [Ljava.util.HashMap$Entry;
   8:        519132       12459168  java.lang.String
   9:          9680       10855744  <constantPoolKlass>
  10:          9680        9358856  <instanceKlassKlass>
  11:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
  12:          8137        5778112  <constantPoolCacheKlass>
  13:           120        3934080  [Lscala.concurrent.forkjoin.ForkJoinTask;
  14:         71166        2846640  java.util.TreeMap$Entry
  15:          6425        2545712  <methodDataKlass>
  16:         10308        1224792  java.lang.Class
  17:           640        1140736  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
  18:         22087        1060176  java.util.TreeMap
  19:         19337        1014288  [[I
  20:         16327         916376  [S
  21:         17481         559392  java.util.concurrent.ConcurrentHashMap$HashEntry
  22:          2235         548480  [I
  23:         22000         528000  javax.management.openmbean.CompositeDataSupport
{code}
([The jmap result screenshot|https://cloud.githubusercontent.com/assets/5170878/7465993/c9fc5b24-f30d-11e4-9276-ae635f850833.jpg]) Note the *java.util.HashMap$Entry* and *java.lang.Long* objects: they are already using about 120MB! I found that, as time goes by, the *java.util.HashMap$Entry* and *java.lang.Long* objects occupy more and more memory, and this will eventually cause an OOM on the driver side. But I don't know which component causes this problem.
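To watch that growth without waiting for an OOM, here is a minimal sketch (not part of the original job; the logging placement is my own assumption) that prints driver heap usage once per batch. The *foreachRDD* body executes on the driver, so a standard JMX call there is enough:
{code}
import java.lang.management.ManagementFactory

// Runs on the driver for every batch: log current heap usage alongside the existing work.
messages.foreachRDD { rdd =>
  val heap = ManagementFactory.getMemoryMXBean.getHeapMemoryUsage
  println(s"driver heap: ${heap.getUsed / 1024 / 1024} MB used of ${heap.getMax / 1024 / 1024} MB max")
  if (!rdd.isEmpty()) {
    rdd.count()
  }
}
{code}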

Another program runs many jobs in one batch interval; the *jmap -histo:live* result for that program is:
{code}
num     #instances         #bytes  class name
----------------------------------------------
   1:       5256952      168222464  java.util.HashMap$Entry
   2:         53317      144304808  [B
   3:       5185499      124451976  java.lang.Long
   4:          2456       39707888  [Ljava.util.HashMap$Entry;
   5:        127343       16310384  <methodKlass>
   6:        127343       15745680  <constMethodKlass>
   7:         10403       11696960  <constantPoolKlass>
   8:         10403       10103520  <instanceKlassKlass>
   9:         69955       10046040  [Ljava.lang.Object;
  10:        122628        7963480  [C
  11:         72208        7458496  [Lscala.collection.mutable.HashEntry;
  12:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
  13:          8824        6326208  <constantPoolCacheKlass>
  14:        138037        5521480  org.apache.spark.storage.BlockStatus
  15:        209132        5019168  scala.collection.mutable.DefaultEntry
  16:           140        4589760  [Lscala.concurrent.forkjoin.ForkJoinTask;
  17:          6591        3509992  [I
  18:          8454        3275104  <methodDataKlass>
  19:        135468        3251232  org.apache.spark.storage.RDDBlockId
  20:        121176        2908224  java.lang.String
  21:         72207        2888280  scala.collection.mutable.HashMap
  22:         65714        2628560  scala.collection.mutable.HashSet
  23:         17935        2008720  org.apache.spark.ui.jobs.UIData$ExecutorSummary
  24:        121144        1938304  java.lang.Integer
  25:         17987        1870648  org.apache.spark.executor.TaskMetrics
  26:         65268        1566432  org.apache.spark.rdd.ShuffledRDDPartition
  27:         11094        1319248  java.lang.Class
  28:         17996        1295712  org.apache.spark.scheduler.TaskInfo
  29:         37067        1186144  scala.Tuple4
  30:           640        1140736  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
  31:         21033        1090832  [[I
  32:         43517        1044408  scala.Tuple2
  33:         17645         992808  [S
  34:         40676         976224  scala.collection.immutable.$colon$colon
  35:         60064         961024  scala.Some
  36:         16118         902608  org.apache.spark.storage.RDDInfo
{code}
The *java.util.HashMap$Entry* and *java.lang.Long* objects are using about 280MB of memory!
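This second histogram also contains many driver-side UI/listener bookkeeping objects (*org.apache.spark.ui.jobs.UIData$ExecutorSummary*, *org.apache.spark.executor.TaskMetrics*, *org.apache.spark.scheduler.TaskInfo*, *org.apache.spark.storage.BlockStatus*, *org.apache.spark.storage.RDDInfo*). Assuming (and it is only an assumption) that the growing HashMap entries belong to that retained history, a minimal sketch of the configuration that caps it would be (values are illustrative; exact keys and defaults should be checked against the 1.3.1 docs):
{code}
val sparkConf = new SparkConf()
  .setAppName("Test")
  // Retain fewer completed jobs/stages in the web UI (default is 1000 each).
  .set("spark.ui.retainedJobs", "200")
  .set("spark.ui.retainedStages", "200")
  // Retain fewer completed batches in the Streaming UI tab.
  .set("spark.streaming.ui.retainedBatches", "200")
{code}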


  was:
Hi all, I am using Spark Streaming to read data from Kafka. My Spark version is 1.3.1, and the code is as follows:
{code}
object Test {
  def main(args: Array[String]) {
    val brokerAddress = "192.168.246.66:9092,192.168.246.67:9092,192.168.246.68:9092"
    val kafkaParams = Map[String, String](
      "metadata.broker.list" -> brokerAddress,
      "group.id" -> args(0))

    val sparkConf = new SparkConf().setAppName("Test")
    sparkConf.set("spark.kryo.registrator", "utils.MyKryoSerializer")
    val sc = new SparkContext(sparkConf)

    val ssc = new StreamingContext(sc, Seconds(2))
    val topicsSet = Set("sparktopic")

    val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
    messages.foreachRDD(rdd =>{
      if(!rdd.isEmpty()){
        rdd.count()
      }
    })

    ssc.start()
    ssc.awaitTermination()
    ssc.stop()
  }
}
{code}
The program has been running for about 120 hours; below is the *jmap -histo:live* result for the program:
{code}
num     #instances         #bytes  class name
----------------------------------------------
   1:         30148      139357920  [B
   2:       2102205       67270560  java.util.HashMap$Entry
   3:       2143056       51433344  java.lang.Long
   4:        520430       26570456  [C
   5:        119224       15271104  <methodKlass>
   6:        119224       14747984  <constMethodKlass>
   7:          3449       13476384  [Ljava.util.HashMap$Entry;
   8:        519132       12459168  java.lang.String
   9:          9680       10855744  <constantPoolKlass>
  10:          9680        9358856  <instanceKlassKlass>
  11:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
  12:          8137        5778112  <constantPoolCacheKlass>
  13:           120        3934080  [Lscala.concurrent.forkjoin.ForkJoinTask;
  14:         71166        2846640  java.util.TreeMap$Entry
  15:          6425        2545712  <methodDataKlass>
  16:         10308        1224792  java.lang.Class
  17:           640        1140736  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
  18:         22087        1060176  java.util.TreeMap
  19:         19337        1014288  [[I
  20:         16327         916376  [S
  21:         17481         559392  java.util.concurrent.ConcurrentHashMap$HashEntry
  22:          2235         548480  [I
  23:         22000         528000  javax.management.openmbean.CompositeDataSupport
{code}
([The jmap result screenshot|https://cloud.githubusercontent.com/assets/5170878/7465993/c9fc5b24-f30d-11e4-9276-ae635f850833.jpg]) Note the *java.util.HashMap$Entry* and *java.lang.Long* objects: they are already using about 120MB! I found that, as time goes by, the *java.util.HashMap$Entry* and *java.lang.Long* objects occupy more and more memory, and this will eventually cause an OOM on the driver side. But I don't know which component causes this problem.




> Driver memory leak?
> -------------------
>
>                 Key: SPARK-7353
>                 URL: https://issues.apache.org/jira/browse/SPARK-7353
>             Project: Spark
>          Issue Type: Bug
>          Components: Streaming
>    Affects Versions: 1.3.1
>            Reporter: yangping wu
>
> Hi all, I am using Spark Streaming to read data from Kafka. My Spark version is 1.3.1, and the code is as follows:
> {code}
> object Test {
>   def main(args: Array[String]) {
>     val brokerAddress = "192.168.246.66:9092,192.168.246.67:9092,192.168.246.68:9092"
>     val kafkaParams = Map[String, String](
>       "metadata.broker.list" -> brokerAddress,
>       "group.id" -> args(0))
>     val sparkConf = new SparkConf().setAppName("Test")
>     sparkConf.set("spark.kryo.registrator", "utils.MyKryoSerializer")
>     val sc = new SparkContext(sparkConf)
>     val ssc = new StreamingContext(sc, Seconds(2))
>     val topicsSet = Set("sparktopic")
>     val messages = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](ssc, kafkaParams, topicsSet)
>     messages.foreachRDD(rdd =>{
>       if(!rdd.isEmpty()){
>         rdd.count()
>       }
>     })
>     ssc.start()
>     ssc.awaitTermination()
>     ssc.stop()
>   }
> }
> {code}
> The program has been running for about 120 hours; below is the *jmap -histo:live* result for the program:
> {code}
> num     #instances         #bytes  class name
> ----------------------------------------------
>    1:         30148      139357920  [B
>    2:       2102205       67270560  java.util.HashMap$Entry
>    3:       2143056       51433344  java.lang.Long
>    4:        520430       26570456  [C
>    5:        119224       15271104  <methodKlass>
>    6:        119224       14747984  <constMethodKlass>
>    7:          3449       13476384  [Ljava.util.HashMap$Entry;
>    8:        519132       12459168  java.lang.String
>    9:          9680       10855744  <constantPoolKlass>
>   10:          9680        9358856  <instanceKlassKlass>
>   11:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
>   12:          8137        5778112  <constantPoolCacheKlass>
>   13:           120        3934080  [Lscala.concurrent.forkjoin.ForkJoinTask;
>   14:         71166        2846640  java.util.TreeMap$Entry
>   15:          6425        2545712  <methodDataKlass>
>   16:         10308        1224792  java.lang.Class
>   17:           640        1140736  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
>   18:         22087        1060176  java.util.TreeMap
>   19:         19337        1014288  [[I
>   20:         16327         916376  [S
>   21:         17481         559392  java.util.concurrent.ConcurrentHashMap$HashEntry
>   22:          2235         548480  [I
>   23:         22000         528000  javax.management.openmbean.CompositeDataSupport
> {code}
> ([The jmap result screenshot|https://cloud.githubusercontent.com/assets/5170878/7465993/c9fc5b24-f30d-11e4-9276-ae635f850833.jpg]) Note the *java.util.HashMap$Entry* and *java.lang.Long* objects: they are already using about 120MB! I found that, as time goes by, the *java.util.HashMap$Entry* and *java.lang.Long* objects occupy more and more memory, and this will eventually cause an OOM on the driver side. But I don't know which component causes this problem.
> Another program runs many jobs in one batch interval; the *jmap -histo:live* result for that program is:
> {code}
> num     #instances         #bytes  class name
> ----------------------------------------------
>    1:       5256952      168222464  java.util.HashMap$Entry
>    2:         53317      144304808  [B
>    3:       5185499      124451976  java.lang.Long
>    4:          2456       39707888  [Ljava.util.HashMap$Entry;
>    5:        127343       16310384  <methodKlass>
>    6:        127343       15745680  <constMethodKlass>
>    7:         10403       11696960  <constantPoolKlass>
>    8:         10403       10103520  <instanceKlassKlass>
>    9:         69955       10046040  [Ljava.lang.Object;
>   10:        122628        7963480  [C
>   11:         72208        7458496  [Lscala.collection.mutable.HashEntry;
>   12:        282624        6782976  io.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry
>   13:          8824        6326208  <constantPoolCacheKlass>
>   14:        138037        5521480  org.apache.spark.storage.BlockStatus
>   15:        209132        5019168  scala.collection.mutable.DefaultEntry
>   16:           140        4589760  [Lscala.concurrent.forkjoin.ForkJoinTask;
>   17:          6591        3509992  [I
>   18:          8454        3275104  <methodDataKlass>
>   19:        135468        3251232  org.apache.spark.storage.RDDBlockId
>   20:        121176        2908224  java.lang.String
>   21:         72207        2888280  scala.collection.mutable.HashMap
>   22:         65714        2628560  scala.collection.mutable.HashSet
>   23:         17935        2008720  org.apache.spark.ui.jobs.UIData$ExecutorSummary
>   24:        121144        1938304  java.lang.Integer
>   25:         17987        1870648  org.apache.spark.executor.TaskMetrics
>   26:         65268        1566432  org.apache.spark.rdd.ShuffledRDDPartition
>   27:         11094        1319248  java.lang.Class
>   28:         17996        1295712  org.apache.spark.scheduler.TaskInfo
>   29:         37067        1186144  scala.Tuple4
>   30:           640        1140736  [Lio.netty.buffer.PoolThreadCache$MemoryRegionCache$Entry;
>   31:         21033        1090832  [[I
>   32:         43517        1044408  scala.Tuple2
>   33:         17645         992808  [S
>   34:         40676         976224  scala.collection.immutable.$colon$colon
>   35:         60064         961024  scala.Some
>   36:         16118         902608  org.apache.spark.storage.RDDInfo
> {code}
> The *java.util.HashMap$Entry* and *java.lang.Long* objects are using about 280MB of memory!



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org