Posted to issues@impala.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2018/04/24 17:15:00 UTC
[jira] [Created] (IMPALA-6915) Reduce working memory when processing metadata cache updates
Juan Yu created IMPALA-6915:
-------------------------------
Summary: Reduce working memory when processing metadata cache updates
Key: IMPALA-6915
URL: https://issues.apache.org/jira/browse/IMPALA-6915
Project: IMPALA
Issue Type: Sub-task
Components: Catalog
Reporter: Juan Yu
When processing a catalog metadata cache update, working memory usage can be 5x more than the final metadata object's memory footprint. If GC doesn't recycle memory fast enough, Impala can crash with a JVM out-of-memory error.
Most of the allocation comes from Path construction in the HDFS client:
[https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/Path.java#L147]
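For illustration only: the hot spot in the stack trace below is Path.initialize(), which builds a new java.net.URI (via URI.resolve() and StringBuilder temporaries) for every listed file. A minimal sketch of that per-file allocation pattern, using plain java.net.URI rather than the actual Hadoop internals (the qualify() helper and the example paths here are hypothetical):

```java
import java.net.URI;

public class PathAllocDemo {
    // Hypothetical stand-in for what Path.<init>(Path, String) does per
    // child file: resolve the file name against the parent directory URI.
    static URI qualify(URI parent, String child) {
        // Each call allocates a fresh URI plus StringBuilder temporaries,
        // which is the short-lived garbage visible in the TLAB profile.
        return parent.resolve(child);
    }

    public static void main(String[] args) {
        URI dir = URI.create("hdfs://nn:8020/warehouse/db/table/");
        // Listing N files creates N short-lived URI objects; for a table
        // with millions of files this adds up to a large working set.
        for (int i = 0; i < 3; i++) {
            System.out.println(qualify(dir, "part-" + i));
        }
    }
}
```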
Columns: Stack Trace, Average Object Size (bytes), Total TLAB Size (bytes), Pressure (%)
{code}
java.lang.Thread.run() 152.486 6,586,166,960 78.246
java.util.concurrent.ThreadPoolExecutor$Worker.run() 152.959 6,583,034,136 78.208
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor$Worker) 152.959 6,583,034,136 78.208
java.util.concurrent.FutureTask.run() 154.425 6,575,955,192 78.124
org.apache.impala.catalog.HdfsTable$FileMetadataLoadRequest.call() 155.678 6,561,367,568 77.951
org.apache.impala.catalog.HdfsTable$FileMetadataLoadRequest.call() 155.678 6,561,367,568 77.951
org.apache.impala.catalog.HdfsTable.access$000(HdfsTable, Path, List) 155.678 6,561,367,568 77.951
org.apache.impala.catalog.HdfsTable.refreshFileMetadata(Path, List) 155.678 6,561,367,568 77.951
org.apache.impala.common.FileSystemUtil.listStatus(FileSystem, Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystem, Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.DistributedFileSystem$16.doCall(Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem, Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(Path) 164.294 5,958,270,360 70.786
org.apache.hadoop.hdfs.protocol.HdfsFileStatus.makeQualified(URI, Path) 188.964 4,715,516,408 56.022
org.apache.hadoop.hdfs.protocol.HdfsFileStatus.getFullPath(Path) 190.731 4,649,582,248 55.238
org.apache.hadoop.fs.Path.<init>(Path, String) 193.378 4,543,189,320 53.974
org.apache.hadoop.fs.Path.<init>(Path, Path) 202.23 4,204,506,424 49.951
org.apache.hadoop.fs.Path.initialize(String, String, String, String) 231.389 1,623,793,272 19.291
java.net.URI.<init>(String, String, String, String, String) 162.808 1,219,880,472 14.493
java.net.URI.resolve(URI) 226.126 596,637,792 7.088
java.lang.StringBuilder.append(String) 253.489 404,781,104 4.809
java.lang.StringBuilder.toString() 132.941 180,183,984 2.141
java.lang.StringBuilder.<init>() 48 72,680,008 0.863
{code}
A different GC strategy may relieve some of the memory pressure, but it would be better to reduce the working memory itself.
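One possible direction, sketched here as an assumption rather than a proposed patch: keep a single shared parent prefix per partition and store only relative file names, so no fully qualified Path (and its URI) is materialized per file. The qualifyAll() helper and paths below are hypothetical:

```java
public class CompactListing {
    // Hypothetical sketch: qualify file names lazily by string
    // concatenation against one cached parent prefix, instead of
    // constructing a Path/URI pair for every file up front.
    static String[] qualifyAll(String parentPrefix, String[] names) {
        String[] out = new String[names.length];
        for (int i = 0; i < names.length; i++) {
            // One String per file; no URI.resolve(), no Path temporaries.
            out[i] = parentPrefix + "/" + names[i];
        }
        return out;
    }

    public static void main(String[] args) {
        String parent = "hdfs://nn:8020/warehouse/db/table";
        for (String p : qualifyAll(parent, new String[]{"part-0", "part-1"})) {
            System.out.println(p);
        }
    }
}
```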
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)