Posted to commits@oozie.apache.org by "Roman Shaposhnik (JIRA)" <ji...@apache.org> on 2011/09/10 05:05:11 UTC

[jira] [Created] (OOZIE-425) OOZIE-8: Support multiple versions of jar

OOZIE-8: Support multiple versions of jar
-----------------------------------------

                 Key: OOZIE-425
                 URL: https://issues.apache.org/jira/browse/OOZIE-425
             Project: Oozie
          Issue Type: New Feature
            Reporter: Hadoop QA
            Assignee: Roman Shaposhnik


Background:
-------------
Currently, Oozie supports three ways of including jar files for a workflow. In each case, Oozie reads the jars from an HDFS directory and adds them to the distributed cache.
1. System supported jars.
2. User-specific common jars.
3. Workflow specific jar.


Current Shortcomings:
---------------------
However, current Oozie cannot support multiple versions of the same product (such as pig or hive) in the system lib. In addition, Oozie users often include very old or unsupported jars, mostly unintentionally. This creates a lot of support issues. If Oozie could restrict (to *some* extent) which versions are used, this frequent support overhead could be minimized.

Purpose:
-------
This JIRA is created for the following purposes:
1. Support multiple versions of jars. In practice there are often multiple active versions of the pig jar, for instance: one might be 'stable' and another the immediate next version.

2. Enforce the usage of supported jars. For example, a user could configure a workflow to use a specific version of pig. If the user doesn't provide any configuration, Oozie will pick up the most stable jar (system configurable).

It is important to note that this proposed feature will not fully prevent the use of unsupported jar versions.


Design Details:
===================

1. Every product could have a product specific sub-directory in the system lib dir. In that sub-directory, there could be multiple versions of jars. For example: <SYSTEM_LIB>/pig/0.7/lib, <SYSTEM_LIB>/pig/0.8/lib and <SYSTEM_LIB>/pig/stable/lib.

2. The current way of including jars from SYSTEM_LIB needs to be modified.

3. In addition to including the SYSTEM_LIB jars, Oozie needs to include the jars from the user-selected (or default) version. For example, if a user configures pig 0.8, Oozie should include the jars from <SYSTEM_LIB>/pig/0.8/lib/*.jar. If the user doesn't configure a specific pig version, Oozie should include <SYSTEM_LIB>/pig/stable/lib/*.jar.

4. If the user specifies an unsupported version, Oozie should throw an exception with an appropriate error message.

5. Oozie should include the product-specific jars only when that product is asked for. For example, Oozie should include the pig jars through PigActionExecutor and the hive jars through HiveActionExecutor. As a side effect, selectively including only the appropriate jars will reduce the number of jars in the distributed cache.

6. It will be the responsibility of SE/OPS to maintain the lib directories of the supported versions of each product.
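
Points 1, 3 and 4 above could be sketched roughly as follows. This is only an illustration under stated assumptions, not existing Oozie code: the class VersionedLibResolver, its constructor arguments, and the choice of IllegalArgumentException are all hypothetical, and for brevity one set of supported versions is shared by every product.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical helper: maps a product and an optional version to a lib directory. */
final class VersionedLibResolver {
    // Version used when the user configures nothing (assumed name, per point 3).
    private static final String DEFAULT_VERSION = "stable";

    private final String systemLib;      // value of <SYSTEM_LIB>
    private final Set<String> supported; // versions maintained by SE/OPS

    VersionedLibResolver(String systemLib, String... supportedVersions) {
        this.systemLib = systemLib;
        this.supported = new LinkedHashSet<>(Arrays.asList(supportedVersions));
    }

    /** Returns <SYSTEM_LIB>/<product>/<version>/lib, defaulting to "stable". */
    String resolve(String product, String requestedVersion) {
        String version = (requestedVersion == null || requestedVersion.trim().isEmpty())
                ? DEFAULT_VERSION
                : requestedVersion.trim();
        if (!supported.contains(version)) {
            // Point 4: reject unsupported versions with a clear message.
            throw new IllegalArgumentException(
                    "Unsupported " + product + " version [" + version
                            + "]; supported versions: " + supported);
        }
        return systemLib + "/" + product + "/" + version + "/lib";
    }
}
```

With supported versions "0.7", "0.8" and "stable", resolve("pig", "0.8") yields <SYSTEM_LIB>/pig/0.8/lib, resolve("pig", null) falls back to <SYSTEM_LIB>/pig/stable/lib, and resolve("pig", "0.5") throws.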


Implementation Details:
-------------------------
How to implement it in current Oozie?

Once we agree on the conceptual part, we could make the following changes in the code.

1. Instead of putting all lib paths into WorkflowAppService.APP_LIB_PATH_LIST, we could create two such lists. WorkflowAppService.SYSTEM_LIB_PATH_LIST would contain only <SYSTEM_LIB>/*.jar. WorkflowAppService.APP_LIB_PATH_LIST would hold the rest (user-specific jars and wf/lib/).

2. Modify JavaActionExecutor:setLibFilesArchives()
.....
        String[] paths = getLibPaths(); // New method

        if (paths != null) {
            for (String path : paths) {
                addToCache(conf, appPath, path, false);
            }
        }
....
}

// New method: base implementation
protected String[] getLibPaths(...) {
        String[] paths = proto.getStrings(WorkflowAppService.APP_LIB_PATH_LIST);
        return paths;
}

3. For example, in PigActionExecutor getLibPaths could be overridden using the following pseudo-code:

protected String[] getLibPaths(...) {
        String[] paths = super.getLibPaths(...);
        String path = services.getConf().get(SYSTEM_LIB_PATH, " ");
        if (path.trim().length() > 0) {
            systemLibPath = new Path(path.trim());
        } else { return ..}
        // Product- and version-specific directory, e.g. <SYSTEM_LIB>/pig/0.8
        Path pigHome = new Path(systemLibPath, "pig/" + usedVersion);
        List<String> libPaths = getLibFiles(fs, pigHome);
        return paths + libPaths; // pseudo-code: concatenation of both lists
}
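
The override pattern from steps 2 and 3 could look roughly like the following self-contained sketch. BaseExecutorSketch and PigExecutorSketch are hypothetical stand-ins for JavaActionExecutor and PigActionExecutor, with the HDFS lookup replaced by a list passed to the constructor; the point is only to show how the subclass appends its product jars to the base list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Simplified stand-in for JavaActionExecutor; not the real Oozie class. */
class BaseExecutorSketch {
    private final String[] appLibPaths;

    BaseExecutorSketch(String... appLibPaths) {
        this.appLibPaths = appLibPaths;
    }

    /** Base implementation: only the user-specific and workflow lib paths. */
    protected String[] getLibPaths() {
        return appLibPaths.clone();
    }
}

/** Simplified stand-in for PigActionExecutor. */
class PigExecutorSketch extends BaseExecutorSketch {
    // Assumed to be the jars already listed under <SYSTEM_LIB>/pig/<version>.
    private final List<String> pigJars;

    PigExecutorSketch(List<String> pigJars, String... appLibPaths) {
        super(appLibPaths);
        this.pigJars = pigJars;
    }

    /** Appends the pig jars to whatever the base implementation returns. */
    @Override
    protected String[] getLibPaths() {
        List<String> merged = new ArrayList<>(Arrays.asList(super.getLibPaths()));
        merged.addAll(pigJars);
        return merged.toArray(new String[0]);
    }
}
```

An executor for another product would override getLibPaths() the same way with its own jar list, so each action ships only the jars it needs.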

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira