Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/09/27 20:44:00 UTC

[jira] [Commented] (IMPALA-6857) Add JVM Pause Monitor to Impala Processes

    [ https://issues.apache.org/jira/browse/IMPALA-6857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631014#comment-16631014 ] 

ASF subversion and git services commented on IMPALA-6857:
---------------------------------------------------------

Commit abd230647fa92db29ac3719096eb4ebc7c151069 in impala's branch refs/heads/master from [~philip]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=abd2306 ]

IMPALA-7596. Adding JvmPauseMonitor (and other GC) metrics to Impala metrics.

Following up to IMPALA-6857, it's useful for monitoring tools to see if
the pause monitor is getting triggered, and to see other GC metrics.

The Java side here, and the Thrift side, were easy enough.

However, the Impala metric implementation here caused us to call into
the frontend to read through the JMX memory beans 72 times, because each
call to GetValue() was getting all the data for the pool. This structure
made it hard to add additional, non-pool, metrics, and it felt wasteful.
To combat this, I added a 10-second cache for the metrics fetched from
the Frontend, so the counters will typically reuse the same data.
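
As a rough illustration of the caching idea (not the code in this
commit), a time-based cache might look like the sketch below. The names
JvmMetricsSnapshot and CachedJvmMetrics, the struct fields, and the
injected fetch function are all hypothetical stand-ins for the real
Thrift object and the call into the Frontend.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <functional>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the Thrift struct returned by the frontend.
struct JvmMetricsSnapshot {
  int64_t gc_count = 0;
  int64_t gc_time_millis = 0;
};

// Caches the result of an expensive fetch for a fixed time-to-live, so that
// many metrics read within the same window share a single fetch.
class CachedJvmMetrics {
 public:
  using Clock = std::chrono::steady_clock;
  using Fetcher = std::function<JvmMetricsSnapshot()>;

  CachedJvmMetrics(Fetcher fetch, std::chrono::milliseconds ttl)
      : fetch_(std::move(fetch)), ttl_(ttl) {
    assert(fetch_ != nullptr);
  }

  // Returns the cached snapshot, refreshing it first if it is missing or at
  // least 'ttl' old. 'now' is a parameter so callers can control time in tests.
  JvmMetricsSnapshot Get(Clock::time_point now = Clock::now()) {
    std::lock_guard<std::mutex> l(lock_);
    if (!valid_ || now - last_fetch_ >= ttl_) {
      cached_ = fetch_();
      last_fetch_ = now;
      valid_ = true;
    }
    return cached_;
  }

 private:
  Fetcher fetch_;
  std::chrono::milliseconds ttl_;
  std::mutex lock_;
  JvmMetricsSnapshot cached_;
  Clock::time_point last_fetch_;
  bool valid_ = false;
};
```

With a 10-second TTL, every GetValue() call that lands inside the same
window reads the cached snapshot instead of calling into the frontend
again.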

There are five metrics here, and to avoid yet another enum class, I used
C++ lambdas to capture which field of the Thrift object I care about. If
folks like the approach, I think it can simplify away the enums for the
pool metrics as well.
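
To make the lambda approach concrete, here is a minimal sketch under
assumed names. GcStats, GcFieldMetric, MakeGcMetrics, the field names,
and the metric keys are illustrative only and are not taken from the
Impala source. Each metric carries a small function that selects its
field, so no enum or switch statement is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the Thrift object; field names are illustrative.
struct GcStats {
  int64_t gc_count = 0;
  int64_t gc_time_millis = 0;
  int64_t pause_detected_count = 0;
};

// A metric that knows its key and how to pull its value out of a GcStats.
struct GcFieldMetric {
  std::string key;
  std::function<int64_t(const GcStats&)> extract;

  int64_t GetValue(const GcStats& stats) const {
    assert(extract != nullptr);
    return extract(stats);
  }
};

// Each lambda picks out one field; adding a metric is one more line here
// rather than a new enum value plus a new switch case.
inline std::vector<GcFieldMetric> MakeGcMetrics() {
  return {
      {"jvm.gc_count", [](const GcStats& s) { return s.gc_count; }},
      {"jvm.gc_time_millis", [](const GcStats& s) { return s.gc_time_millis; }},
      {"jvm.pause_detected_count",
       [](const GcStats& s) { return s.pause_detected_count; }},
  };
}
```

Calling GetValue() on each entry with the same GcStats snapshot yields
one value per metric.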

I measured the cost of calling into the metrics code by
looping the metrics-gathering 100 times and looking at CPU
time for the process using this script:

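  # Fields 14 and 15 of /proc/PID/stat are utime and stime, in clock ticks.
  # The PID is that of the process listening on port 25000.
  START_CPU=$(cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  for i in $(seq 100); do
    # Quote the URL so the shell does not treat '?' as a glob character.
    curl 'http://localhost:25000/jsonmetrics?json' > /dev/null 2> /dev/null
  done
  END_CPU=$(cat /proc/$(fuser 25000/tcp 2> /dev/null | tr -d ' ')/stat | awk '{ print $14 + $15 }')
  echo $START_CPU $END_CPU $(($END_CPU - $START_CPU))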

On a release build on my development machine, gathering metrics 100
times took 0.16 CPU seconds without this change and 0.07 CPU seconds
with this change. The measurement accuracy here is 0.01 CPU seconds (I
spot-checked this using the cpuacct cgroup infrastructure, which gives
you nanoseconds, but it was more painful to script), but this convinces
me that this is a net improvement.

Change-Id: Ia707393962ad94ef715ec015b3fe3bb1769104a2
Reviewed-on: http://gerrit.cloudera.org:8080/11468
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add JVM Pause Monitor to Impala Processes
> -----------------------------------------
>
>                 Key: IMPALA-6857
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6857
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Catalog, Frontend
>            Reporter: Philip Zeyliger
>            Priority: Major
>              Labels: ramp-up, supportability
>             Fix For: Impala 3.1.0
>
>
> In IMPALA-3114, we added a pause monitor for Impala. In addition to that, we should port/borrow Hadoop's JvmPauseMonitor [https://github.com/apache/hadoop/blob/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/JvmPauseMonitor.java]. I believe that when the JVM is aggressively GCing, the C++ threads will continue to get scheduled (and won't log), but the Java ones will log. (I've definitely seen JvmPauseMonitor be accurate many times.)
> [~bharathv], when you were testing this, were you able to reproduce it triggering when the JVM half was in "GC hell"?



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)