Posted to dev@pig.apache.org by "Thejas M Nair (JIRA)" <ji...@apache.org> on 2011/05/02 16:41:03 UTC
[jira] [Commented] (PIG-1821) UDFContext.getUDFProperties does not
handle collisions in hashcode of udf classname (+ arg hashcodes)
[ https://issues.apache.org/jira/browse/PIG-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027676#comment-13027676 ]
Thejas M Nair commented on PIG-1821:
------------------------------------
PIG-1821.1.patch: unit tests passed (except TestStoreInstances, which is failing in trunk). test-patch failed because the patch contains no new unit tests. None were added because it is not easy to write a test case that reproduces the problem this bug could have caused.
> UDFContext.getUDFProperties does not handle collisions in hashcode of udf classname (+ arg hashcodes)
> -----------------------------------------------------------------------------------------------------
>
> Key: PIG-1821
> URL: https://issues.apache.org/jira/browse/PIG-1821
> Project: Pig
> Issue Type: Bug
> Affects Versions: 0.8.0
> Reporter: Thejas M Nair
> Assignee: Thejas M Nair
> Fix For: 0.9.0
>
> Attachments: PIG-1821.1.patch
>
>
> In the code below, if generateKey() returns the same value for two UDFs, those UDFs end up sharing the same Properties object.
> {code}
> private HashMap<Integer, Properties> udfConfs = new HashMap<Integer, Properties>();
>
> public Properties getUDFProperties(Class c) {
>     Integer k = generateKey(c);
>     Properties p = udfConfs.get(k);
>     if (p == null) {
>         p = new Properties();
>         udfConfs.put(k, p);
>     }
>     return p;
> }
>
> private int generateKey(Class c) {
>     return c.getName().hashCode();
> }
>
> public Properties getUDFProperties(Class c, String[] args) {
>     Integer k = generateKey(c, args);
>     Properties p = udfConfs.get(k);
>     if (p == null) {
>         p = new Properties();
>         udfConfs.put(k, p);
>     }
>     return p;
> }
>
> private int generateKey(Class c, String[] args) {
>     int hc = c.getName().hashCode();
>     for (int i = 0; i < args.length; i++) {
>         hc <<= 1;
>         hc ^= args[i].hashCode();
>     }
>     return hc;
> }
> {code}
> To prevent this, a new class (say X) that holds the class name and args should be created, and HashMap<X, Properties> should be used instead of HashMap<Integer, Properties>. HashMap will then deal with the collisions itself, because it falls back to equals() when two keys have the same hash code.
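A minimal sketch of the approach described above, not the contents of PIG-1821.1.patch. The names UdfKeyDemo, UdfKey and oldGenerateKey are hypothetical and chosen only for illustration. The sketch assumes the key should compare the class name and the argument array by value, and that a lookup without args should map to the same entry as a lookup with an empty args array (as the integer key did). It uses the fact that "Aa" and "BB" have the same String.hashCode() to provoke a collision.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

class UdfKeyDemo {

    // Hypothetical key type (the "X" in the description); the name is an assumption.
    static final class UdfKey {
        private final String className;
        private final String[] args;

        UdfKey(Class<?> c, String[] args) {
            this.className = c.getName();
            // Treat "no args" like an empty array, mirroring the old integer key.
            this.args = (args == null) ? new String[0] : args.clone();
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof UdfKey)) return false;
            UdfKey other = (UdfKey) o;
            return className.equals(other.className) && Arrays.equals(args, other.args);
        }

        @Override
        public int hashCode() {
            // Collisions are still possible here; HashMap resolves them via equals().
            return 31 * className.hashCode() + Arrays.hashCode(args);
        }
    }

    private final Map<UdfKey, Properties> udfConfs = new HashMap<UdfKey, Properties>();

    Properties getUDFProperties(Class<?> c) {
        return getUDFProperties(c, null);
    }

    Properties getUDFProperties(Class<?> c, String[] args) {
        UdfKey k = new UdfKey(c, args);
        Properties p = udfConfs.get(k);
        if (p == null) {
            p = new Properties();
            udfConfs.put(k, p);
        }
        return p;
    }

    // The integer key as computed in the quoted snippet, kept only for comparison.
    static int oldGenerateKey(Class<?> c, String[] args) {
        int hc = c.getName().hashCode();
        for (int i = 0; i < args.length; i++) {
            hc <<= 1;
            hc ^= args[i].hashCode();
        }
        return hc;
    }

    public static void main(String[] unused) {
        String[] a = {"Aa"};
        String[] b = {"BB"}; // "Aa" and "BB" share the same String.hashCode()
        UdfKeyDemo ctx = new UdfKeyDemo();

        boolean oldKeysCollide =
                oldGenerateKey(String.class, a) == oldGenerateKey(String.class, b);
        boolean propertiesShared =
                ctx.getUDFProperties(String.class, a) == ctx.getUDFProperties(String.class, b);

        System.out.println("old integer keys collide: " + oldKeysCollide);
        System.out.println("properties shared with UdfKey: " + propertiesShared);
    }
}
```

In this sketch the two argument lists produce the same integer key under the old scheme, and even the same UdfKey hash code, yet each gets its own Properties object because HashMap compares colliding keys with equals(). String.class stands in for a UDF class purely to keep the example self-contained.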