Posted to dev@pig.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2012/07/13 09:09:33 UTC

[jira] [Commented] (PIG-2815) class loader management in PigContext

    [ https://issues.apache.org/jira/browse/PIG-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413532#comment-13413532 ] 

Raghu Angadi commented on PIG-2815:
-----------------------------------


An example:

{noformat}
register elephant-bird.jar; -- for working with Thrift objects.
-- (1)

register T_One.jar;
-- (2)

-- ThriftPigLoader takes name of a Thrift class that corresponds to input.
a = load '/logs/T_One' using ThriftPigLoader('thrift.gen.T_One');
-- (3)

register second.jar; 
-- (4)

b = load '/logs/T_two' using ThriftPigLoader('thrift.gen.T_two');

-- (5)
-- FAIL!
{noformat}

 * (1): a new classloader cl_A is created with the root classloader as its parent.
 * (2): cl_B is created with root as the parent.
 * (3): {{ThriftPigLoader.class}} is loaded through cl_B and cached.
 * (4): cl_C is created with root as the parent.
 * (5): {{thrift.gen.T_two.class}} is loaded through cl_C, but the cached {{ThriftPigLoader.class}} from cl_B is reused by Pig. So all the Thrift classes seen by ThriftPigLoader are entirely different from the Thrift classes seen by {{thrift.gen.T_two}}, since cl_B is not a parent of cl_C. That can lead to a number of issues, and it does.
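
The effect in steps (2) through (5) can be reproduced outside Pig. The following is only a minimal sketch, not Pig code: {{ClassLoaderIdentityDemo}}, {{Payload}} and {{JarLoader}} are hypothetical names, and {{JarLoader}} reads class bytes from the classpath as a stand-in for a registered jar. It assumes the compiled .class files are reachable as classpath resources. Two loaders that both hang off the root produce two distinct {{Class}} objects (each with its own statics), while a loader whose parent is the previous loader reuses the class the previous one already defined.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical demo class; none of these names come from Pig.
class ClassLoaderIdentityDemo {

    /** Stand-in for a class shipped in a registered jar, e.g. a Thrift class. */
    public static class Payload {
        public static int counter = 0;
    }

    /** Stand-in for the loader created on each "register": defines classes from classpath bytes. */
    static class JarLoader extends ClassLoader {
        JarLoader(ClassLoader parent) {
            super(parent); // null means the bootstrap loader, which cannot see Payload
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = "/" + name.replace('.', '/') + ".class";
            try (InputStream in = ClassLoaderIdentityDemo.class.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buf = new byte[4096];
                int n;
                while ((n = in.read(buf)) != -1) {
                    out.write(buf, 0, n);
                }
                byte[] bytes = out.toByteArray();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Payload through cl_B and cl_C and reports whether both yield the same Class object. */
    public static boolean sameClass(boolean chainToPrevious) {
        try {
            String name = Payload.class.getName();
            ClassLoader clB = new JarLoader(null);
            // Either repeat the root-parent pattern, or make cl_B the parent of cl_C.
            ClassLoader clC = new JarLoader(chainToPrevious ? clB : null);
            return clB.loadClass(name) == clC.loadClass(name);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** With unrelated loaders, a write to one copy's static field is not visible in the other copy. */
    public static boolean staticsAreIndependent() {
        try {
            String name = Payload.class.getName();
            Class<?> viaB = new JarLoader(null).loadClass(name);
            Class<?> viaC = new JarLoader(null).loadClass(name);
            viaB.getField("counter").setInt(null, 1);
            return viaC.getField("counter").getInt(null) == 0;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("unrelated loaders, same class? " + sameClass(false));
        System.out.println("chained loaders, same class?   " + sameClass(true));
        System.out.println("statics independent?           " + staticsAreIndependent());
    }
}
```

The chained case works because of the default parent-first delegation in {{ClassLoader.loadClass}}: cl_C asks cl_B before trying to define the class itself.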


                
> class loader management in PigContext
> -------------------------------------
>
>                 Key: PIG-2815
>                 URL: https://issues.apache.org/jira/browse/PIG-2815
>             Project: Pig
>          Issue Type: Bug
>          Components: impl
>    Affects Versions: 0.9.0
>            Reporter: Raghu Angadi
>             Fix For: 0.11
>
>
> The way {{PigContext.classloader}} and resolveClassName() are managed can lead to strange class loading issues, especially when not all {{register}} statements are at the top (example in the first comment).
> Two factors contribute to this, sometimes individually and sometimes together:
>  # a new classloader (CL) is created after registering each jar.
>     ** but the new CL's parent is the root CL rather than the previous CL, effectively throwing the previous CL away.
>  # resolveClassName() caches classes based on just the name.
>     ** A class is not defined by name alone. Classes loaded by two different unrelated CLs are different objects even if both extract the class from the same physical jar file.
>     ** Because of (1), the cached class is not necessarily the same as the class that would be loaded by the 'current' CL.
> Having different class objects for the same class has many subtle side effects, e.g. there would be two instances of static variables.
> I think both should be fixed, though fixing one of them might be good enough in many cases. I will add a patch.
>        
