Posted to dev@oozie.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2015/09/01 00:09:46 UTC

[jira] [Commented] (OOZIE-2347) Remove unnecessary new Configuration()/new jobConf() calls from oozie

    [ https://issues.apache.org/jira/browse/OOZIE-2347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14724127#comment-14724127 ] 

Rohini Palaniswamy commented on OOZIE-2347:
-------------------------------------------

* JobConf jobConf = new JobConf(cachedJobConf); still does not help, because the Configuration class loads its resources lazily. You need to call a dummy get method on cachedJobConf before it is used for cloning, so that the clones copy already-loaded properties instead of each loading them again.
* Configuration conf = new XConfiguration(); is not required in getShareLibConf. Please return null and handle the null in JavaActionExecutor.
* Please remove the comment "// it fails with conf, therefore we pass defaultConf instead" if this is not an issue in an actual run. Most likely TestJavaActionExecutor.testAddToCache will fail, and you will have to fix that test to use Configuration conf = new Configuration(); instead of Configuration conf = new XConfiguration();.
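
To illustrate the first point, here is a toy model, not Hadoop code. LazyConf is a hypothetical stand-in for Configuration/JobConf. It assumes that properties are loaded on first read and that the copy constructor only copies properties that are already loaded. Under those assumptions, every clone of a cold cached object pays for its own load, while one dummy get on the cached object beforehand avoids that.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/**
 * Toy model (NOT Hadoop's Configuration) of a config object that loads its
 * default resources lazily, on first read. All names here are hypothetical.
 */
class LazyConf {
    /** Counts expensive loads so the effect of warming the cache is visible. */
    static final AtomicInteger LOADS = new AtomicInteger();

    private Map<String, String> props; // stays null until the first read

    LazyConf() { }

    /** Copy constructor: copies the loaded properties only if there are any. */
    LazyConf(LazyConf other) {
        synchronized (other) {
            if (other.props != null) {
                this.props = new HashMap<>(other.props);
            }
        }
    }

    synchronized String get(String key) {
        if (props == null) {
            props = loadDefaults(); // the expensive step happens here
        }
        return props.get(key);
    }

    /** Stand-in for parsing default/site XML resources out of jars. */
    private static Map<String, String> loadDefaults() {
        LOADS.incrementAndGet();
        Map<String, String> m = new HashMap<>();
        m.put("fs.defaultFS", "file:///");
        return m;
    }
}

class LazyConfDemo {
    /** Clones cached n times, reads from each clone, returns the number of loads triggered. */
    static int loadsFor(LazyConf cached, int n) {
        int before = LazyConf.LOADS.get();
        for (int i = 0; i < n; i++) {
            new LazyConf(cached).get("fs.defaultFS");
        }
        return LazyConf.LOADS.get() - before;
    }

    public static void main(String[] args) {
        LazyConf cold = new LazyConf();
        System.out.println("cold cache loads: " + loadsFor(cold, 3));

        LazyConf warm = new LazyConf();
        warm.get("any.key"); // dummy read forces the one-time load
        System.out.println("warm cache loads: " + loadsFor(warm, 3));
    }
}
```

In this model the cold case triggers 3 loads and the warmed case triggers 0, which is the effect the dummy get is meant to achieve.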

> Remove unnecessary new Configuration()/new jobConf() calls from oozie
> ---------------------------------------------------------------------
>
>                 Key: OOZIE-2347
>                 URL: https://issues.apache.org/jira/browse/OOZIE-2347
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Purshotam Shah
>            Assignee: Purshotam Shah
>         Attachments: OOZIE-2347-V1.patch
>
>
> We noticed that setting up the job sharelib was slow, and one prime reason was that a lot of threads were blocked on "java.util.zip.ZipFile.getEntry":
> <0x00000005c0afda68> (a java.util.jar.JarFile): 0 Thread(s) sleeping, 178 Thread(s) waiting, 1 Thread(s) locking
> There are a lot of places where we call new Configuration()/new JobConf() unnecessarily. These calls can easily be removed to improve performance.
> 1. Configuration defaultConf = new Configuration(); is called for every file we add to the classpath.
> {code}
> public static void addFileToClassPath(Path file, Configuration conf, FileSystem fs) throws IOException {
>       Configuration defaultConf = new Configuration();
>       XConfiguration.copy(conf, defaultConf);
>       if (fs == null) {
>         // it fails with conf, therefore we pass defaultConf instead
>         fs = file.getFileSystem(defaultConf);
>       }
>       // Hadoop 0.20/1.x.
>       if (defaultConf.get("yarn.resourcemanager.webapp.address") == null) {
>           // Duplicate hadoop 1.x code to workaround MAPREDUCE-2361 in Hadoop 0.20
>           // Refer OOZIE-1806.
>           String filepath = file.toUri().getPath();
>           String classpath = conf.get("mapred.job.classpath.files");
>           conf.set("mapred.job.classpath.files", classpath == null
>               ? filepath
>               : classpath + System.getProperty("path.separator") + filepath);
>           URI uri = fs.makeQualified(file).toUri();
>           DistributedCache.addCacheFile(uri, conf);
>       }
>       else { // Hadoop 0.23/2.x
>           DistributedCache.addFileToClassPath(file, conf, fs);
>       }
>     }
> {code}
> 2. Sharelib setup also calls new Configuration(), which is not needed.
> {code}
> public Configuration getShareLibConf(String inputKey, Path path) {
>         Configuration conf = new Configuration();
>         if (shareLibConfigMap.containsKey(inputKey)) {
>             conf = shareLibConfigMap.get(inputKey).get(path);
>         }
>         return conf;
>     }
> {code}
> 3. CoordActionInputCheckXCommand.checkPath also creates a new JobConf every time.
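
For item 2 in the description quoted above, a null-returning lookup could look roughly like the sketch below. It is self-contained and uses stand-ins: java.util.Properties in place of Configuration, String in place of Path, and the hypothetical helpers register and mergeInto, none of which are Oozie APIs. If the inner lookup in the quoted version behaves like Map.get, that version can already return null when the key is present but the path is not, so callers need a null check either way.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

/** Sketch only: Properties stands in for Configuration, String for Path. */
class ShareLibConfSketch {
    // Stand-in for shareLibConfigMap: key -> (path -> conf).
    private final Map<String, Map<String, Properties>> shareLibConfigMap = new HashMap<>();

    /** Hypothetical helper to populate the map for this example. */
    void register(String key, String path, Properties conf) {
        shareLibConfigMap.computeIfAbsent(key, k -> new HashMap<>()).put(path, conf);
    }

    /** Returns the stored conf, or null if none is registered. Never allocates a new conf. */
    Properties getShareLibConf(String inputKey, String path) {
        Map<String, Properties> byPath = shareLibConfigMap.get(inputKey);
        return byPath == null ? null : byPath.get(path);
    }

    /** Caller-side null handling: merge only when a conf exists; returns entries merged. */
    static int mergeInto(Properties target, Properties shareLibConf) {
        if (shareLibConf == null) {
            return 0; // nothing registered for this key/path, nothing to do
        }
        target.putAll(shareLibConf);
        return shareLibConf.size();
    }
}
```

The point of the design is that the common "nothing registered" path costs a map lookup and a null check instead of constructing a configuration object that is immediately discarded.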


