Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/24 08:25:42 UTC

[GitHub] [hudi] cnDenis opened a new issue #2019: Leak in DiskBasedMap

cnDenis opened a new issue #2019:
URL: https://github.com/apache/hudi/issues/2019


   **Describe the problem you faced**
   
   `java.lang.ApplicationShutdownHooks` holds a huge number of `org.apache.hudi.common.util.collection.DiskBasedMap` and `org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator` objects, and the process eventually runs out of memory.
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Run Hudi in local mode with `local[4]` and 4G memory, setting `spark.memory.fraction=0.2` and `spark.memory.storageFraction=0.2` (to allow it to spill rather than OOM, according to the tuning guide).
   2. Receive messages from Kafka and write them to HDFS using Hudi for a few days.
   3. The process then runs out of memory.
   
   `lsof` shows that the process has 25000+ temp files open in /tmp.
   
   A heap dump taken with `jmap -dump` shows a large number of DiskBasedMap and LazyFileIterable objects held in ApplicationShutdownHooks (see below).
   
   How can these temp files and hooks be closed?
   
   **Environment Description**
   
   * Hudi version : 0.5.3
   
   * Spark version : 2.4.6
   
   * Hive version : 1.2.1
   
   * Hadoop version : 2.7.3
   
   * Storage (HDFS/S3/GCS..) : HDFS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   ```
   Class Name                                                                                                                       | Shallow Heap | Retained Heap
   ----------------------------------------------------------------------------------------------------------------------------------------------------------------
   class java.lang.ApplicationShutdownHooks @ 0x6c45b3888 System Class                                                              |            8 | 3,233,852,456
   |- <class> class java.lang.Class @ 0x6c3d26108 System Class                                                                      |           40 |         1,152
   |- <classloader> java.lang.ClassLoader @ 0x0  <system class loader>                                                              |           64 |            64
   |- <super> class java.lang.Object @ 0x6c3d4e498 System Class                                                                     |            8 |            40
   |- <resolved_references> java.lang.Object[3] @ 0x6c449b708                                                                       |           32 |           208
   |- hooks java.util.IdentityHashMap @ 0x6c4658950                                                                                 |           40 | 3,233,852,240
   |  |- <class> class java.util.IdentityHashMap @ 0x6c45b36a0 System Class                                                         |           32 |           264
   |  |- table java.lang.Object[262144] @ 0x794000000                                                                               |    1,048,592 | 3,233,852,200
   |  |  |- <class> class java.lang.Object[] @ 0x6c3cc9610                                                                          |            0 |             0
   |  |  |- [221646], [221647] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0009d60  Thread-112 |          128 |           536
   |  |  |- [88924], [88925] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019338  Thread-1549  |          128 |           536
   |  |  |- [203058], [203059] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019650  Thread-1611|          128 |           536
   |  |  |- [256104], [256105] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0160080  Thread-1460|          128 |           536
   |  |  |- [239698], [239699] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160500  Thread-1432                     |          128 |       131,808
   |  |  |- [72214], [72215] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c01609d0  Thread-1451                       |          128 |       131,808
   |  |  |- [31762], [31763] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160df0  Thread-1435                       |          128 |       131,808
   ----------------------------------------------------------------------------------------------------------------------------------------------------------------
   
   ```
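
The dump above shows anonymous `Thread` subclasses (`DiskBasedMap$1`, `LazyFileIterable$LazyFileIterator$1`) sitting in the JVM's shutdown-hook table. A hook registered with `Runtime.addShutdownHook` stays strongly reachable until the JVM exits unless it is explicitly removed, and so does everything the hook references. The following is a minimal sketch of that pattern and one way to avoid the leak. It is not Hudi's code: `SpillFile`, `release()` and `file()` are hypothetical names used only for illustration, and the actual fix tracked in the Jira may differ.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;

/** Hypothetical stand-in for a disk-backed structure; NOT Hudi's DiskBasedMap. */
final class SpillFile implements AutoCloseable {
    private final File file;
    private final Thread cleanupHook;
    private boolean released;

    SpillFile() {
        try {
            this.file = Files.createTempFile("spill-", ".tmp").toFile();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Safety net: delete the temp file if the JVM exits before close().
        // The hook references this object, so while the hook stays registered
        // this object (and anything it holds) cannot be garbage collected.
        this.cleanupHook = new Thread(this::deleteFile);
        Runtime.getRuntime().addShutdownHook(cleanupHook);
    }

    File file() {
        return file;
    }

    private void deleteFile() {
        file.delete();
    }

    /**
     * Deletes the temp file and deregisters the hook.
     * Returns true only if the hook was still registered and has been removed.
     */
    synchronized boolean release() {
        if (released) {
            return false;
        }
        released = true;
        deleteFile();
        try {
            return Runtime.getRuntime().removeShutdownHook(cleanupHook);
        } catch (IllegalStateException e) {
            // The JVM is already shutting down; hooks can no longer be removed.
            return false;
        }
    }

    @Override
    public void close() {
        release();
    }
}
```

With this shape, a caller can write `try (SpillFile s = new SpillFile()) { ... }` and the hook entry is gone as soon as the block exits. Without the `removeShutdownHook` call, every instance created over days of ingestion would remain in the hook table, which matches the growth seen in the dump.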
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [hudi] bvaradar closed issue #2019: Leak in DiskBasedMap

Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2019:
URL: https://github.com/apache/hudi/issues/2019


   





[GitHub] [hudi] bvaradar commented on issue #2019: Leak in DiskBasedMap

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2019:
URL: https://github.com/apache/hudi/issues/2019#issuecomment-679261953


   This is currently tracked in https://issues.apache.org/jira/browse/HUDI-945
   
   This should be a simple fix, as mentioned in the Jira. Are you interested in submitting a PR? We will have this fixed in the next release.





[GitHub] [hudi] bvaradar commented on issue #2019: Leak in DiskBasedMap

Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2019:
URL: https://github.com/apache/hudi/issues/2019#issuecomment-682063177


   Closing this issue as we have a Jira to track it.

