You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/06/24 06:17:07 UTC

[jira] Created: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Hive should use ClassLoader from hadoop Configuration
-----------------------------------------------------

                 Key: HIVE-574
                 URL: https://issues.apache.org/jira/browse/HIVE-574
             Project: Hadoop Hive
          Issue Type: Bug
    Affects Versions: 0.3.0, 0.3.1
            Reporter: Zheng Shao
            Assignee: Zheng Shao


See HIVE-338.
Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:

{code}
package org.apache.hadoop.conf;
public class Configuration implements Iterable<Map.Entry<String,String>> {
   ...
  /**
   * Load a class by name.
   * 
   * @param name the class name.
   * @return the class object.
   * @throws ClassNotFoundException if the class is not found.
   */
  public Class<?> getClassByName(String name) throws ClassNotFoundException {
    return Class.forName(name, true, classLoader);
  }
{code}



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724356#action_12724356 ] 

Zheng Shao commented on HIVE-574:
---------------------------------

@Joydeep,

I will use SessionState.get().getConf().getClassLoader() for all compilation time code and ExecDriver (JobClient side) code. For execution time code (Mapper and Reducer) I will use the conf from Mapper/Reducer.configure() method.

Is that good?


> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724358#action_12724358 ] 

Joydeep Sen Sarma commented on HIVE-574:
----------------------------------------

yes - makes total sense.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723606#action_12723606 ] 

Joydeep Sen Sarma commented on HIVE-574:
----------------------------------------

we should always be able to get a Conf from our session (SessionState.get()) from any part of the code. the same conf would be used throughout a session.


> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-574:
----------------------------

    Attachment: HIVE-574.3.patch

Make ExecMapper and ExecReducer print out class path with l4j.info(), to ease debugging of classpaths in the future.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724480#action_12724480 ] 

Zheng Shao commented on HIVE-574:
---------------------------------

I couldn't figure out how to do the job cleanly after 6 hours of debugging. I would suggest to commit the current patch and open a jira for follow-up.

We need a major refactoring of the use of HiveConf in the code.


> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joydeep Sen Sarma updated HIVE-574:
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.4.0
           Status: Resolved  (was: Patch Available)

committed - thanks Zheng!

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>             Fix For: 0.4.0
>
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Yongqiang He (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723429#action_12723429 ] 

Yongqiang He commented on HIVE-574:
-----------------------------------

For adding EnumSet support in Hadoop, i added a method loadClass(Configuration conf, String className) in ObjectWritable. 
<noformat>
  /**
   * Find and load the class with given name <tt>className</tt> by first finding
   * it in the specified <tt>conf</tt>. If the specified <tt>conf</tt> is null,
   * try load it directly.
   */
  public static Class<?> loadClass(Configuration conf, String className) {
    Class<?> declaredClass = null;
    try {
      if (conf != null)
        declaredClass = conf.getClassByName(className);
      else
        declaredClass = Class.forName(className);
    } catch (ClassNotFoundException e) {
      throw new RuntimeException("readObject can't find class " + className,
          e);
    }
    return declaredClass;
  }
</noformat>
I donot know if it can be used here.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Min Zhou (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723874#action_12723874 ] 

Min Zhou commented on HIVE-574:
-------------------------------

+1 for Zheng, thanks
It worked fine here, nothing abnormal.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12723472#action_12723472 ] 

Joydeep Sen Sarma commented on HIVE-574:
----------------------------------------

- there's a bunch of places where JavaUtils.getClassLoader() is called - all of these need to use the conf classloader instead.
- where are we invoking conf.setClassLoader? (addToClassPath should be calling this it seems)

if conf.set/getClassLoader is supported in all versions 17 onwards - then we can just remove the -libjars business - since that doesn't help us any bit now (ExecDriver is not a Tool - so it doesn't even get the conf with the classpath set that a Tool would normally get by specifying the -libjars option).

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724351#action_12724351 ] 

Joydeep Sen Sarma commented on HIVE-574:
----------------------------------------

hi zheng - i would prefer to add a getClassByName to JavaUtils (replace getClassLoader) and then use the Session Configuration object's getclassbyname (SessionState.get().getConf())  if it's available - or else default to thread classloader or else finally default to current classloader.

i feel very uncomfortable that right now we are sometimes using the conf classloader and sometimes using the thread classloader. it's sheer chance that all the uses of JavaUtils.getClassLoader() are happening within the same thread as the thread that adds jar files (or the main thread of the ExecDriver). This can change anytime tomorrow.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12724372#action_12724372 ] 

Zheng Shao commented on HIVE-574:
---------------------------------

I was trying to make the changes and found some other problems.

The SessionState is only available in ql, while JavaUtils is in common right now. SessionState depends on a bunch of Hive ql classes so I cannot move it to common, and JavaUtils is called from serde, metastore, and ql.

I will change both serde and metastore methods to require hadoop configuration, so that they can get the classloader from it.

I will also move JavaUtils to ql.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725004#action_12725004 ] 

Zheng Shao commented on HIVE-574:
---------------------------------

Let's get this in first since it fixes the bug.

The follow-up JIRA is HIVE-584.


> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-574:
----------------------------

    Attachment: HIVE-574.1.patch

This patch uses the ClassLoader from all paths sets the class loader in most cases.
It also adds the "addedJars" to HIVEADDEDJARS so that ExecDriver can add it into the ClassLoader for both the ClassLoader in the conf and the thread contextClassLoader.

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-574:
----------------------------

    Status: Patch Available  (was: Open)

> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725441#action_12725441 ] 

Joydeep Sen Sarma commented on HIVE-574:
----------------------------------------

changes look good! will commit after tests.



> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch, HIVE-574.3.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-574) Hive should use ClassLoader from hadoop Configuration

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zheng Shao updated HIVE-574:
----------------------------

    Attachment: HIVE-574.2.patch

This patch fixed a bug in ExecDriver.initialize where we were setting "tmpjars", which were overwritten by ExecDriver.execute().

> there's a bunch of places where JavaUtils.getClassLoader() is called - all of these need to use the conf classloader instead.
I tried to change all of these to use conf classloader, but in lots of places there are no shared conf objects, so I will leave it for now. I changed the comment so it's clear that in hive (outside hadoop map-reduce jobs) we are using thread classloader, while in hadoop map-reduce jobs we are using conf classloaders.

> where are we invoking conf.setClassLoader? (addToClassPath should be calling this it seems)
I moved that logic outside of the addToClassPath function. All callers will do that set.

> if conf.set/getClassLoader is supported in all versions 17 onwards - then we can just remove the -libjars business - since that doesn't help us any bit now (ExecDriver is not a Tool - so it doesn't even get the conf with the classpath set that a Tool would normally get by specifying the -libjars option).

Maybe in the future we should move ExecDriver to a tool? I think we should rely more on hadoop to set the classpath for us, so maybe in the future we should remove the special "if (local)" block in ExecDriver where we set the classpath ourselves.


> Hive should use ClassLoader from hadoop Configuration
> -----------------------------------------------------
>
>                 Key: HIVE-574
>                 URL: https://issues.apache.org/jira/browse/HIVE-574
>             Project: Hadoop Hive
>          Issue Type: Bug
>    Affects Versions: 0.3.0, 0.3.1
>            Reporter: Zheng Shao
>            Assignee: Zheng Shao
>         Attachments: HIVE-574.1.patch, HIVE-574.2.patch
>
>
> See HIVE-338.
> Hive should always use the getClassByName method from hadoop Configuration, so that we choose the correct ClassLoader. Examples include all plug-in interfaces, including UDF/GenericUDF/UDAF, SerDe, and FileFormats. Basically the following code snippet shows the idea:
> {code}
> package org.apache.hadoop.conf;
> public class Configuration implements Iterable<Map.Entry<String,String>> {
>    ...
>   /**
>    * Load a class by name.
>    * 
>    * @param name the class name.
>    * @return the class object.
>    * @throws ClassNotFoundException if the class is not found.
>    */
>   public Class<?> getClassByName(String name) throws ClassNotFoundException {
>     return Class.forName(name, true, classLoader);
>   }
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.