Posted to dev@pig.apache.org by "Raghu Angadi (JIRA)" <ji...@apache.org> on 2012/07/13 09:09:33 UTC
[jira] [Commented] (PIG-2815) class loader management in PigContext
[ https://issues.apache.org/jira/browse/PIG-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13413532#comment-13413532 ]
Raghu Angadi commented on PIG-2815:
-----------------------------------
An example:
{noformat}
register elephant-bird.jar; -- for working with Thrift objects.
-- (1)
register T_One.jar;
-- (2)
-- ThriftPigLoader takes name of a Thrift class that corresponds to input.
a = load '/logs/T_One' using ThriftPigLoader('thrift.gen.T_One');
-- (3)
register second.jar;
-- (4)
b = load '/logs/T_two' using ThriftPigLoader('thrift.gen.T_two');
-- (5)
-- FAIL!
{noformat}
* (1): a new classloader cl_A is created with the root classloader as its parent.
* (2): cl_B is created with root as the parent.
* (3): {{ThriftPigLoader.class}} is loaded with cl_B and cached.
* (4): cl_C is created with root as the parent.
* (5): {{thrift.gen.T_two.class}} is loaded with cl_C, but the cached {{ThriftPigLoader.class}} from cl_B is reused by Pig. Since cl_B is not a parent of cl_C, the Thrift classes seen by ThriftPigLoader are entirely different from the Thrift classes seen by {{thrift.gen.T_two}}. That can lead to a number of issues, and it does.
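The effect in step (5) can be reproduced outside Pig with a small, self-contained sketch. This is not Pig code: {{ClassIdentityDemo}}, {{Counter}} and {{IsolatedLoader}} are hypothetical names, and the loader simply re-reads the compiled class bytes from the classpath to imitate two unrelated classloaders (in the roles of cl_B and cl_C) that both hang directly off the root.

```java
import java.io.IOException;
import java.io.InputStream;

class ClassIdentityDemo {

    /** Hypothetical stand-in for a class shipped in a registered jar. */
    public static class Counter {
        public static int count = 0;
    }

    /**
     * A loader whose parent is the bootstrap loader, so it never shares
     * Counter with any other loader and has to define its own copy.
     */
    static class IsolatedLoader extends ClassLoader {
        IsolatedLoader() {
            super((ClassLoader) null);
        }

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            String resource = name.replace('.', '/') + ".class";
            ClassLoader source = ClassIdentityDemo.class.getClassLoader();
            try (InputStream in = source.getResourceAsStream(resource)) {
                if (in == null) {
                    throw new ClassNotFoundException(name);
                }
                byte[] bytes = in.readAllBytes();
                return defineClass(name, bytes, 0, bytes.length);
            } catch (IOException e) {
                throw new ClassNotFoundException(name, e);
            }
        }
    }

    /** Loads Counter through a brand-new loader that is unrelated to any other. */
    static Class<?> loadIsolated() throws ClassNotFoundException {
        return new IsolatedLoader().loadClass(Counter.class.getName());
    }

    public static void main(String[] args) throws Exception {
        Class<?> viaB = loadIsolated(); // plays the role of cl_B
        Class<?> viaC = loadIsolated(); // plays the role of cl_C

        System.out.println("same name:  " + viaB.getName().equals(viaC.getName()));
        System.out.println("same class: " + (viaB == viaC));

        // Write the static through one copy, then read it through the other.
        viaB.getField("count").setInt(null, 42);
        System.out.println("count via cl_C copy: " + viaC.getField("count").getInt(null));
    }
}
```

On a standard JVM the two {{Class}} objects should share a name but not be identical, and the static written through one copy should not be visible through the other. That is the same mismatch ThriftPigLoader runs into above.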
> class loader management in PigContext
> -------------------------------------
>
> Key: PIG-2815
> URL: https://issues.apache.org/jira/browse/PIG-2815
> Project: Pig
> Issue Type: Bug
> Components: impl
> Affects Versions: 0.9.0
> Reporter: Raghu Angadi
> Fix For: 0.11
>
>
> The way {{PigContext.classloader}} and resolveClassName() are managed can lead to strange class loading issues, especially when not all {{register}} statements are at the top (example in the first comment).
> Two factors contribute to this, sometimes individually and sometimes together:
> # A new classloader (CL) is created after registering each jar.
> ** But the new CL's parent is the root CL rather than the previous CL, effectively throwing the previous CL away.
> # resolveClassName() caches classes based on just the name.
> ** A class is not defined by name alone. Classes loaded by two different unrelated CLs are different objects even if both extract the class from the same physical jar file.
> ** Because of (1), the cached class is not necessarily the same as the class that would be loaded by the 'current' CL.
> Having different class objects for the same class has many subtle side effects, e.g. there would be two instances of static variables.
> I think both should be fixed, though fixing one of them might be good enough in many cases. I will add a patch.
>
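One possible shape for a fix, sketched under assumptions and not taken from the actual patch: make each new CL a child of the previous one, so that parent-first delegation keeps returning the same {{Class}} object for anything an earlier CL already loaded. {{ChainedClassResolver}} and its methods are hypothetical names; only {{URLClassLoader}} and {{Class.forName}} are real JDK APIs.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.HashMap;
import java.util.Map;

class ChainedClassResolver {

    // Starts at the loader that loaded this class; every register() wraps it.
    private ClassLoader current = ChainedClassResolver.class.getClassLoader();

    // Name-only cache. This is consistent here only because every loader in
    // the chain delegates to its predecessors first.
    private final Map<String, Class<?>> cache = new HashMap<>();

    /** Registers a jar by chaining a new loader onto the previous one. */
    void register(URL jar) {
        current = new URLClassLoader(new URL[] {jar}, current);
    }

    /** Resolves a class by name against the newest loader in the chain. */
    Class<?> resolve(String name) throws ClassNotFoundException {
        Class<?> cached = cache.get(name);
        if (cached != null) {
            return cached;
        }
        Class<?> loaded = Class.forName(name, true, current);
        cache.put(name, loaded);
        return loaded;
    }

    ClassLoader current() {
        return current;
    }
}
```

With this arrangement a class cached after the first {{register}} is the same object a later CL would hand back, because the later CL asks its ancestors before looking in its own jar. If the CLs stay unrelated instead, the cache would need to be keyed on the CL as well as the name, or cleared whenever a jar is registered.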