Posted to dev@oozie.apache.org by "Purshotam Shah (JIRA)" <ji...@apache.org> on 2015/08/31 19:27:45 UTC
[jira] [Updated] (OOZIE-2347) Remove unnecessary new Configuration()/new jobConf() calls from oozie
[ https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Purshotam Shah updated OOZIE-2347:
----------------------------------
Description:
We noticed that setting up the job sharelib was slow. One prime reason was that many threads were blocked on "java.util.zip.ZipFile.getEntry":
<0x00000005c0afda68> (a java.util.jar.JarFile): 0 Thread(s) sleeping, 178 Thread(s) waiting, 1 Thread(s) locking
There are many places where we call new Configuration()/new JobConf() unnecessarily. These calls can easily be removed to improve performance.
1. Configuration defaultConf = new Configuration(); is called for every file we add to the classpath.
{code}
public static void addFileToClassPath(Path file, Configuration conf, FileSystem fs) throws IOException {
    Configuration defaultConf = new Configuration();
    XConfiguration.copy(conf, defaultConf);
    if (fs == null) {
        // it fails with conf, therefore we pass defaultConf instead
        fs = file.getFileSystem(defaultConf);
    }
    // Hadoop 0.20/1.x.
    if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
        // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
        // Refer OOZIE-1806.
        String filepath = file.toUri().getPath();
        String classpath = conf.get("mapred.job.classpath.files");
        conf.set("mapred.job.classpath.files", classpath == null
                ? filepath
                : classpath + System.getProperty("path.separator") + filepath);
        URI uri = fs.makeQualified(file).toUri();
        DistributedCache.addCacheFile(uri, conf);
    }
    else { // Hadoop 0.23/2.x
        DistributedCache.addFileToClassPath(file, conf, fs);
    }
}
{code}
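One possible direction for the per-file case, sketched without Hadoop on the classpath: build the expensive object once and hand it to every call. CostlyConf, register() and the file names below are hypothetical stand-ins, not Oozie or Hadoop APIs; the counter only makes the number of constructions visible. This is not the actual patch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class HoistConfDemo {

    /** Hypothetical stand-in for a configuration object that is expensive to build. */
    static final class CostlyConf {
        static final AtomicInteger CONSTRUCTED = new AtomicInteger();

        CostlyConf() {
            // The real object would load its default resources around here.
            CONSTRUCTED.incrementAndGet();
        }
    }

    /** Mirrors the reported pattern: one fresh object for every file. */
    public static int addAllPerFile(List<String> files) {
        CostlyConf.CONSTRUCTED.set(0);
        for (String file : files) {
            CostlyConf defaultConf = new CostlyConf();
            register(file, defaultConf);
        }
        return CostlyConf.CONSTRUCTED.get();
    }

    /** Alternative: build the object once and reuse it for all files. */
    public static int addAllShared(List<String> files) {
        CostlyConf.CONSTRUCTED.set(0);
        CostlyConf shared = new CostlyConf();
        for (String file : files) {
            register(file, shared);
        }
        return CostlyConf.CONSTRUCTED.get();
    }

    /** Hypothetical placeholder for the per-file classpath work. */
    static void register(String file, CostlyConf conf) {
        // Intentionally empty: only the construction count matters in this sketch.
    }

    public static void main(String[] args) {
        List<String> files = Arrays.asList("a.jar", "b.jar", "c.jar");
        System.out.println("per file: " + addAllPerFile(files)); // prints "per file: 3"
        System.out.println("shared:   " + addAllShared(files));  // prints "shared:   1"
    }
}
```

Whether a shared instance is safe in addFileToClassPath depends on whether any caller mutates it, which this sketch does not model.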
2. Sharelib setup also calls new Configuration(), which is not needed.
{code}
public Configuration getShareLibConf(String inputKey, Path path) {
    Configuration conf = new Configuration();
    if (shareLibConfigMap.containsKey(inputKey)) {
        conf = shareLibConfigMap.get(inputKey).get(path);
    }
    return conf;
}
{code}
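A rough sketch of a lookup that allocates nothing on the hit path. It uses plain maps as stand-ins: String replaces Path, Map<String, String> replaces Configuration, and the class name, put() helper and sample path are hypothetical. Note that the snippet above returns null when the key exists but the path does not, while this sketch returns a shared empty map instead. That is an assumption about the desired behaviour, and it would not suit callers that expect a mutable result.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ShareLibLookupDemo {

    // Shared immutable fallback, so a miss does not allocate anything.
    private static final Map<String, String> EMPTY_CONF = Collections.emptyMap();

    // key -> (path -> conf), the shape implied by shareLibConfigMap.get(inputKey).get(path)
    private final Map<String, Map<String, Map<String, String>>> shareLibConfigMap = new HashMap<>();

    /** Hypothetical helper to populate the map for this demo. */
    public void put(String inputKey, String path, Map<String, String> conf) {
        shareLibConfigMap.computeIfAbsent(inputKey, k -> new HashMap<>()).put(path, conf);
    }

    /** Looks up first and falls back only on a miss; no throwaway object is created. */
    public Map<String, String> getShareLibConf(String inputKey, String path) {
        Map<String, Map<String, String>> byPath = shareLibConfigMap.get(inputKey);
        if (byPath == null) {
            return EMPTY_CONF;
        }
        Map<String, String> conf = byPath.get(path);
        return conf != null ? conf : EMPTY_CONF;
    }

    public static void main(String[] args) {
        ShareLibLookupDemo demo = new ShareLibLookupDemo();
        Map<String, String> pigConf = new HashMap<>();
        pigConf.put("example.key", "example.value");
        demo.put("pig", "/share/lib/pig", pigConf);

        System.out.println(demo.getShareLibConf("pig", "/share/lib/pig"));
        System.out.println(demo.getShareLibConf("hive", "/share/lib/hive").isEmpty());
    }
}
```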
3. CoordActionInputCheckXCommand.checkPath also creates a JobConf every time.
> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
> Key: OOZIE-2347
> URL: https://issues.apache.org/jira/browse/OOZIE-2347
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)