Posted to dev@oozie.apache.org by "Purshotam Shah (JIRA)" <ji...@apache.org> on 2015/09/14 23:08:46 UTC
[jira] [Reopened] (OOZIE-2347) Remove unnecessary new Configuration()/new jobConf() calls from oozie
[ https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Purshotam Shah reopened OOZIE-2347:
-----------------------------------
> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
> Key: OOZIE-2347
> URL: https://issues.apache.org/jira/browse/OOZIE-2347
> Project: Oozie
> Issue Type: Bug
> Reporter: Purshotam Shah
> Assignee: Purshotam Shah
> Fix For: trunk
>
> Attachments: OOZIE-2347-V1.patch, OOZIE-2347-V2.patch
>
>
> We noticed that setting up the job sharelib was slow, and one prime reason was that a lot of threads were blocked on "java.util.zip.ZipFile.getEntry":
> <0x00000005c0afda68> (a java.util.jar.JarFile): 0 Thread(s) sleeping, 178 Thread(s) waiting, 1 Thread(s) locking
> There are a lot of places where we call new Configuration()/new JobConf() unnecessarily. These calls can easily be removed to improve performance.
> 1. Configuration defaultConf = new Configuration(); is called for every file we add to the classpath.
> {code}
> public static void addFileToClassPath(Path file, Configuration conf, FileSystem fs) throws IOException {
>     Configuration defaultConf = new Configuration();
>     XConfiguration.copy(conf, defaultConf);
>     if (fs == null) {
>         // it fails with conf, therefore we pass defaultConf instead
>         fs = file.getFileSystem(defaultConf);
>     }
>     // Hadoop 0.20/1.x.
>     if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
>         // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
>         // Refer OOZIE-1806.
>         String filepath = file.toUri().getPath();
>         String classpath = conf.get("mapred.job.classpath.files");
>         conf.set("mapred.job.classpath.files", classpath == null
>                 ? filepath
>                 : classpath + System.getProperty("path.separator") + filepath);
>         URI uri = fs.makeQualified(file).toUri();
>         DistributedCache.addCacheFile(uri, conf);
>     }
>     else { // Hadoop 0.23/2.x
>         DistributedCache.addFileToClassPath(file, conf, fs);
>     }
> }
> {code}
> 2. The sharelib setup also calls new Configuration(), which is not needed.
> {code}
> public Configuration getShareLibConf(String inputKey, Path path) {
>     Configuration conf = new Configuration();
>     if (shareLibConfigMap.containsKey(inputKey)) {
>         conf = shareLibConfigMap.get(inputKey).get(path);
>     }
>     return conf;
> }
> {code}
>
>
> 3. CoordActionInputCheckXCommand.checkPath also creates a JobConf every time.
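
For illustration only, here is a minimal sketch of the pattern described in item 2: allocating a configuration object up front and then discarding it, compared with allocating one only when the key is unknown. It is not the attached patch. CostlyConf is a hypothetical stand-in for Hadoop's Configuration that merely counts constructions, ShareLibConfLookup and its method names are made up for this example, and String is used in place of Path so that the sketch has no Hadoop dependency.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for org.apache.hadoop.conf.Configuration.
// It only counts how many instances are built, so the two lookup
// styles below can be compared without Hadoop on the classpath.
final class CostlyConf {
    static final AtomicInteger CREATED = new AtomicInteger();

    CostlyConf() {
        CREATED.incrementAndGet();
    }
}

// Hypothetical holder that mirrors the shape of shareLibConfigMap in the
// issue: inputKey -> (path -> conf), with String used in place of Path.
final class ShareLibConfLookup {
    private final Map<String, Map<String, CostlyConf>> shareLibConfigMap = new HashMap<>();

    void register(String inputKey, String path, CostlyConf conf) {
        shareLibConfigMap.computeIfAbsent(inputKey, k -> new HashMap<>()).put(path, conf);
    }

    // Style quoted in the issue: a new object is built on every call,
    // even when the stored one is returned instead.
    CostlyConf eager(String inputKey, String path) {
        CostlyConf conf = new CostlyConf();
        if (shareLibConfigMap.containsKey(inputKey)) {
            conf = shareLibConfigMap.get(inputKey).get(path);
        }
        return conf;
    }

    // Alternative: build a new object only when inputKey is unknown.
    // The return values match eager(), including a possible null when
    // the key is known but the path is not.
    CostlyConf lazy(String inputKey, String path) {
        Map<String, CostlyConf> byPath = shareLibConfigMap.get(inputKey);
        if (byPath != null) {
            return byPath.get(path);
        }
        return new CostlyConf();
    }
}
```

With a registered key, repeated calls to lazy() leave CostlyConf.CREATED unchanged, while each call to eager() increments it once. Whether the fallback for an unknown key should remain a fresh object is a separate question that this sketch does not try to answer.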