Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/08/24 08:25:42 UTC
[GitHub] [hudi] cnDenis opened a new issue #2019: Leak in DiskBasedMap
cnDenis opened a new issue #2019:
URL: https://github.com/apache/hudi/issues/2019
**Describe the problem you faced**
java.lang.ApplicationShutdownHooks holds a huge number of org.apache.hudi.common.util.collection.DiskBasedMap and org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator objects, and the process eventually runs out of memory.
**To Reproduce**
Steps to reproduce the behavior:
1. Run Hudi in Spark local mode with local[4] and 4 GB of memory, setting spark.memory.fraction=0.2 and spark.memory.storageFraction=0.2 (so that it spills rather than running out of memory, per the tuning guide).
2. Receive messages from Kafka and write them to HDFS using Hudi for a few days.
3. The process eventually runs out of memory.
lsof shows that the process has more than 25,000 temp files open in /tmp.
A heap dump taken with jmap -dump shows a large number of DiskBasedMap and LazyFileIterable objects held by ApplicationShutdownHooks (see below).
How can these temp files and hooks be closed?
**Environment Description**
* Hudi version : 0.5.3
* Spark version : 2.4.6
* Hive version : 1.2.1
* Hadoop version : 2.7.3
* Storage (HDFS/S3/GCS..) : HDFS
* Running on Docker? (yes/no) : no
**Additional context**
```
Class Name | Shallow Heap | Retained Heap
----------------------------------------------------------------------------------------------------------------------------------------------------------------
class java.lang.ApplicationShutdownHooks @ 0x6c45b3888 System Class | 8 | 3,233,852,456
|- <class> class java.lang.Class @ 0x6c3d26108 System Class | 40 | 1,152
|- <classloader> java.lang.ClassLoader @ 0x0 <system class loader> | 64 | 64
|- <super> class java.lang.Object @ 0x6c3d4e498 System Class | 8 | 40
|- <resolved_references> java.lang.Object[3] @ 0x6c449b708 | 32 | 208
|- hooks java.util.IdentityHashMap @ 0x6c4658950 | 40 | 3,233,852,240
| |- <class> class java.util.IdentityHashMap @ 0x6c45b36a0 System Class | 32 | 264
| |- table java.lang.Object[262144] @ 0x794000000 | 1,048,592 | 3,233,852,200
| | |- <class> class java.lang.Object[] @ 0x6c3cc9610 | 0 | 0
| | |- [221646], [221647] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0009d60 Thread-112 | 128 | 536
| | |- [88924], [88925] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019338 Thread-1549 | 128 | 536
| | |- [203058], [203059] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0019650 Thread-1611 | 128 | 536
| | |- [256104], [256105] org.apache.hudi.common.util.collection.LazyFileIterable$LazyFileIterator$1 @ 0x6c0160080 Thread-1460 | 128 | 536
| | |- [239698], [239699] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160500 Thread-1432 | 128 | 131,808
| | |- [72214], [72215] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c01609d0 Thread-1451 | 128 | 131,808
| | |- [31762], [31763] org.apache.hudi.common.util.collection.DiskBasedMap$1 @ 0x6c0160df0 Thread-1435 | 128 | 131,808
----------------------------------------------------------------------------------------------------------------------------------------------------------------
```
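One plausible reading of the dump above, not confirmed in this thread, is that each DiskBasedMap and LazyFileIterator registers its own JVM shutdown hook and never deregisters it, so the hooks map keeps every instance reachable. Below is a minimal sketch of that pattern and of one way to avoid it. `SpillFile` and its methods are hypothetical stand-ins for illustration, not Hudi's actual classes; only `Runtime.addShutdownHook` and `Runtime.removeShutdownHook` are real JDK calls.

```java
import java.io.Closeable;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for a spill-to-disk structure; NOT Hudi's DiskBasedMap.
final class SpillFile implements Closeable {
    private final File file;
    private final Thread cleanupHook;
    private boolean closed = false;
    private boolean hookRemoved = false;

    SpillFile() {
        try {
            this.file = File.createTempFile("spill-", ".tmp");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The JVM holds a strong reference to every registered hook thread
        // (and to whatever the thread captures) until the hook is removed
        // or the JVM exits.
        this.cleanupHook = new Thread(this::deleteFile);
        Runtime.getRuntime().addShutdownHook(cleanupHook);
    }

    private void deleteFile() {
        file.delete();
    }

    File getFile() {
        return file;
    }

    // True once close() has successfully deregistered the hook.
    synchronized boolean wasHookRemoved() {
        return hookRemoved;
    }

    @Override
    public synchronized void close() {
        if (closed) {
            return;
        }
        closed = true;
        deleteFile();
        try {
            // Deregistering here is what lets this instance be garbage collected.
            hookRemoved = Runtime.getRuntime().removeShutdownHook(cleanupHook);
        } catch (IllegalStateException e) {
            // The JVM is already shutting down; the hook runs on its own.
        }
    }
}
```

In this sketch the shutdown hook remains only as a safety net for instances that are still open when the JVM exits. An alternative design, also an assumption here, would be a single shared hook that iterates over a set of open instances.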
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [hudi] bvaradar closed issue #2019: Leak in DiskBasedMap
Posted by GitBox <gi...@apache.org>.
bvaradar closed issue #2019:
URL: https://github.com/apache/hudi/issues/2019
[GitHub] [hudi] bvaradar commented on issue #2019: Leak in DiskBasedMap
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2019:
URL: https://github.com/apache/hudi/issues/2019#issuecomment-679261953
This is currently tracked in https://issues.apache.org/jira/browse/HUDI-945
This should be a simple fix, as mentioned in the Jira. Are you interested in submitting a PR? We will have this fixed in the next release.
[GitHub] [hudi] bvaradar commented on issue #2019: Leak in DiskBasedMap
Posted by GitBox <gi...@apache.org>.
bvaradar commented on issue #2019:
URL: https://github.com/apache/hudi/issues/2019#issuecomment-682063177
Closing this issue as we have a Jira to track it.