You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-issues@hadoop.apache.org by "Hadoop QA (JIRA)" <ji...@apache.org> on 2015/10/24 02:21:27 UTC

[jira] [Commented] (HADOOP-12482) Race condition in JMX cache update

    [ https://issues.apache.org/jira/browse/HADOOP-12482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14972170#comment-14972170 ] 

Hadoop QA commented on HADOOP-12482:
------------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 7s {color} | {color:blue} docker + precommit patch detected. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 19s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 52s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 46s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 15s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 58s {color} | {color:green} trunk passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 10s {color} | {color:green} trunk passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 30s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 49s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 49s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 4m 52s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 4m 52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 8s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_60 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 19s {color} | {color:green} the patch passed with JDK v1.7.0_79 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 8m 2s {color} | {color:red} hadoop-common in the patch failed with JDK v1.8.0_60. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 7m 39s {color} | {color:red} hadoop-common in the patch failed with JDK v1.7.0_79. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 24s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 51m 11s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.7.0_79 Failed junit tests | hadoop.ipc.TestDecayRpcScheduler |
|   | hadoop.fs.shell.TestCopyPreserveFlag |
|   | hadoop.security.ssl.TestReloadingX509TrustManager |
|   | hadoop.metrics2.impl.TestGangliaMetrics |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-10-23 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12766823/HADOOP-12482.001.patch |
| JIRA Issue | HADOOP-12482 |
| Optional Tests |  asflicense  javac  javadoc  mvninstall  unit  findbugs  checkstyle  compile  |
| uname | Linux 815b6bc23c30 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HADOOP-Build/patchprocess/apache-yetus-28a3a3d/dev-support/personality/hadoop.sh |
| git revision | trunk / 15eb84b |
| Default Java | 1.7.0_79 |
| Multi-JDK versions |  /usr/lib/jvm/java-8-oracle:1.8.0_60 /usr/lib/jvm/java-7-openjdk-amd64:1.7.0_79 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_60.txt |
| unit | https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_79.txt |
| unit test logs |  https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.8.0_60.txt https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/artifact/patchprocess/patch-unit-hadoop-common-project_hadoop-common-jdk1.7.0_79.txt |
| JDK v1.7.0_79  Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/testReport/ |
| Max memory used | 225MB |
| Powered by | Apache Yetus   http://yetus.apache.org |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/7921/console |


This message was automatically generated.



> Race condition in JMX cache update
> ----------------------------------
>
>                 Key: HADOOP-12482
>                 URL: https://issues.apache.org/jira/browse/HADOOP-12482
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.1
>            Reporter: Tony Wu
>            Assignee: Tony Wu
>         Attachments: HADOOP-12482.001.patch
>
>
> updateJmxCache() was updated in HADOOP-11301. However the patch introduced a race condition. In updateJmxCache() function in MetricsSourceAdapter.java:
> {code:java}
>   private void updateJmxCache() {
>     boolean getAllMetrics = false;
>     synchronized (this) {
>       if (Time.now() - jmxCacheTS >= jmxCacheTTL) {
>         // temporarilly advance the expiry while updating the cache
>         jmxCacheTS = Time.now() + jmxCacheTTL;
>         if (lastRecs == null) {
>           getAllMetrics = true;
>         }
>       } else {
>         return;
>       }
>       if (getAllMetrics) {
>         MetricsCollectorImpl builder = new MetricsCollectorImpl();
>         getMetrics(builder, true);
>       }
>       updateAttrCache();
>       if (getAllMetrics) {
>         updateInfoCache();
>       }
>       jmxCacheTS = Time.now();
>       lastRecs = null; // in case regular interval update is not running
>     }
>   }
> {code}
> Notice that getAllMetrics is set to true when:
> # jmxCacheTTL has passed
> # lastRecs == null
> lastRecs is set to null in the same function, but gets reassigned by getMetrics().
> However getMetrics() can be called from a different thread:
> # MetricsSystemImpl.onTimerEvent()
> # MetricsSystemImpl.publishMetricsNow()
> Consider the following sequence:
> # updateJmxCache() is called by getMBeanInfo() from a thread getting cached info. 
> ** lastRecs is set to null.
> # metrics sources is updated with new value/field.
> # getMetrics() is called by publishMetricsNow() or onTimerEvent() from a different thread getting the latest metrics. 
> ** lastRecs is updated (!= null).
> # jmxCacheTTL passed.
> # updateJmxCache() is called again via getMBeanInfo().
> ** However because lastRecs is already updated (!= null), getAllMetrics will not be set to true. So updateInfoCache() is not called and getMBeanInfo() returns the old cached info.
> We ran into this issue on a cluster where a new metric did not get published until much later.
> The case can be made worse by a periodic call to getMetrics() (driven by an external program or script). In such case getMBeanInfo() may never be able to retrieve the new record.
> The desired behavior should be that updateJmxCache() will guarantee to call updateInfoCache() once after jmxCacheTTL, if lastRecs has been set to null by updateJmxCache() itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)