Posted to mapreduce-dev@hadoop.apache.org by "Prabhu Joseph (JIRA)" <ji...@apache.org> on 2016/10/19 11:39:58 UTC
[jira] [Created] (MAPREDUCE-6797) Improvement in the fix of Mapreduce-6684
Prabhu Joseph created MAPREDUCE-6797:
----------------------------------------
Summary: Improvement in the fix of Mapreduce-6684
Key: MAPREDUCE-6797
URL: https://issues.apache.org/jira/browse/MAPREDUCE-6797
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: jobhistoryserver
Affects Versions: 2.4.0, 2.8.0
Reporter: Prabhu Joseph
Priority: Critical
Description:
HistoryFileManager contains one more piece of code where the synchronized block on HistoryFileInfo needs to be removed. We hit the JobHistoryServer contention issue in our environment: the attached stack trace shows HistoryFileManager$JobListCache.addIfAbsent waiting unnecessarily to lock a HistoryFileInfo.
MAPREDUCE-6684 already removed the synchronized keyword from isMovePending and didMoveFail.
{code}
HistoryFileInfo firstValue = cache.get(key);
synchronized (firstValue) { // <-- synchronized is not needed here
  if (firstValue.isMovePending()) {
    if (firstValue.didMoveFail() &&
        firstValue.jobIndexInfo.getFinishTime() <= cutoff) {
      cache.remove(key);
      //Now lets try to delete it
      try {
        firstValue.delete();
      } catch (IOException e) {
        LOG.error("Error while trying to delete history files" +
            " that could not be moved to done.", e);
      }
    } else {
      LOG.warn("Waiting to remove " + key
          + " from JobListCache because it is not in done yet.");
    }
  } else {
    cache.remove(key);
  }
}
{code}
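For illustration, here is a minimal, self-contained sketch of the same eviction logic with the synchronized block dropped. It is not the actual patch. EvictionSketch, FileInfo, State and evictOldest are hypothetical stand-ins for JobListCache and HistoryFileInfo, and the sketch assumes the state field is volatile so that isMovePending and didMoveFail can be read without holding a monitor.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

class EvictionSketch {
    // Hypothetical stand-in for the state tracked by HistoryFileInfo.
    enum State { IN_INTERMEDIATE, IN_DONE, MOVE_FAILED }

    // Hypothetical stand-in for HistoryFileInfo.
    static final class FileInfo {
        // Assumption: the state is volatile, so readers need no lock.
        volatile State state;
        final long finishTime;
        volatile boolean deleted = false;

        FileInfo(State state, long finishTime) {
            this.state = state;
            this.finishTime = finishTime;
        }

        boolean isMovePending() {
            State s = state; // single volatile read
            return s == State.IN_INTERMEDIATE || s == State.MOVE_FAILED;
        }

        boolean didMoveFail() {
            return state == State.MOVE_FAILED;
        }

        void delete() {
            deleted = true; // the real code deletes files and may throw IOException
        }
    }

    // Tries to evict the oldest cache entry without locking the value.
    // Returns true if an entry was removed.
    static boolean evictOldest(ConcurrentSkipListMap<String, FileInfo> cache,
                               long cutoff) {
        Map.Entry<String, FileInfo> first = cache.firstEntry();
        if (first == null) {
            return false; // nothing to evict
        }
        String key = first.getKey();
        FileInfo firstValue = first.getValue();
        if (firstValue.isMovePending()) {
            if (firstValue.didMoveFail() && firstValue.finishTime <= cutoff) {
                cache.remove(key);
                firstValue.delete();
                return true;
            }
            // Still waiting for the move to done; keep the entry for now.
            return false;
        }
        cache.remove(key);
        return true;
    }

    public static void main(String[] args) {
        ConcurrentSkipListMap<String, FileInfo> cache = new ConcurrentSkipListMap<>();
        cache.put("job_1", new FileInfo(State.MOVE_FAILED, 50L));
        cache.put("job_2", new FileInfo(State.IN_INTERMEDIATE, 60L));
        System.out.println(evictOldest(cache, 100L)); // failed move, older than cutoff
        System.out.println(evictOldest(cache, 100L)); // move still pending
    }
}
```

Without the lock, the state can change between the isMovePending and didMoveFail calls. In this sketch the worst case is that an entry stays in the cache until a later eviction attempt. Whether that holds for the real HistoryFileInfo would need to be checked against the actual code.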
Note: the stack trace below is from hadoop-2.4.0; the problem exists in the latest Hadoop as well.
{code}
"2144820863@qtp-313351300-38156" daemon prio=10 tid=0x0000000001e13800 nid=0xf133 waiting for monitor entry [0x00007f7c1d8dd000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$JobListCache.addIfAbsent(HistoryFileManager.java:226)
        - waiting to lock <0x000000040145c4d8> (a org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$HistoryFileInfo)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanIntermediateDirectory(HistoryFileManager.java:825)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.access$200(HistoryFileManager.java:82)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$UserLogDir.scanIfNeeded(HistoryFileManager.java:280)
        - locked <0x0000000400375388> (a org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager$UserLogDir)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.scanIntermediateDirectory(HistoryFileManager.java:792)
        at org.apache.hadoop.mapreduce.v2.hs.HistoryFileManager.getAllFileInfo(HistoryFileManager.java:920)
        at org.apache.hadoop.mapreduce.v2.hs.CachedHistoryStorage.getAllPartialJobs(CachedHistoryStorage.java:156)
        at org.apache.hadoop.mapreduce.v2.hs.JobHistory.getAllJobs(JobHistory.java:235)
{code}