Posted to dev@pig.apache.org by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org> on 2010/08/20 03:19:16 UTC

[jira] Created: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Improve dynamic invokers to deal with no-arg methods and array parameters
-------------------------------------------------------------------------

                 Key: PIG-1551
                 URL: https://issues.apache.org/jira/browse/PIG-1551
             Project: Pig
          Issue Type: Improvement
            Reporter: Dmitriy V. Ryaboy


PIG-1354 introduced a set of UDFs that can dynamically wrap simple Java methods, so that users don't need to write trivial wrapper UDFs if they are willing to sacrifice some speed.

This issue extends the set of methods that can be wrapped this way to include methods that take no arguments, and methods that take arrays of {int,long,float,double,string} as arguments.
Arrays are expected to be represented by bags in Pig. Notably, this allows users to wrap statistical functions in org.apache.commons.math.stat.StatUtils.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901992#action_12901992 ] 

Richard Ding commented on PIG-1551:
-----------------------------------


The typo is still there:

{code}
private static final Class<?> LONG_ARRAY_CLASS = new Long[0].getClass();
{code}

It seems what you want is 

{code}
private static final Class<?> LONG_ARRAY_CLASS = new long[0].getClass();
{code}

so it's consistent with other array classes.

This does raise a question about array parameters: the first form applies to methods like _amethod(Long[] nums)_, while the second supports methods like _amethod(long[] nums)_. The two are not interchangeable.
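
To illustrate why the two forms are not interchangeable, here is a minimal Java sketch. The class and method names (ArrayClassDemo, sumPrimitive, sumBoxed, hasMethod) are hypothetical and are not taken from Invoker.java; the sketch only shows that reflection treats long[] and Long[] as distinct parameter types.

```java
class ArrayClassDemo {
    // Hypothetical target taking a primitive array, like amethod(long[] nums).
    public static long sumPrimitive(long[] nums) {
        long total = 0;
        for (long n : nums) {
            total += n;
        }
        return total;
    }

    // Hypothetical target taking a boxed array, like amethod(Long[] nums).
    public static long sumBoxed(Long[] nums) {
        long total = 0;
        for (Long n : nums) {
            total += n;
        }
        return total;
    }

    // Returns true if a public method with exactly this single parameter type exists.
    static boolean hasMethod(String name, Class<?> paramType) {
        try {
            ArrayClassDemo.class.getMethod(name, paramType);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two array classes are different Class objects.
        System.out.println(long[].class.equals(Long[].class));
        // Lookup only succeeds when the array class matches the declared parameter.
        System.out.println(hasMethod("sumPrimitive", long[].class));
        System.out.println(hasMethod("sumPrimitive", Long[].class));
    }
}
```

Reflection does no boxing conversion on array types during lookup, so an invoker that registers only one of the two classes can only find methods declared with that exact parameter type.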



[jira] Commented: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12901656#action_12901656 ] 

Richard Ding commented on PIG-1551:
-----------------------------------


In Invoker.java, there is a typo:

{code}
private static final Class<?> LONG_ARRAY_CLASS = new String[0].getClass();
{code}

Also, in the unPrimitivize method, this branch seems unnecessary:

{code}
        } else if (klass.equals(DOUBLE_ARRAY_CLASS)) {
            return DOUBLE_ARRAY_CLASS;
{code}

Otherwise the patch looks good.



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

    Status: Open  (was: Patch Available)



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

    Attachment: PIG_1551.3.patch

Ugh. Thank you for catching that -- fixed, and added a test to make sure it stays fixed.

The particular set of methods I needed this for used primitives, so that's what I did. It's a bit tricky to add support for Long, Double, etc. arrays, as I would have to check all combinations of possible method signatures when seeing things like (int[], int[], int[]), and the code becomes fairly ugly. Do you think this is particularly compelling? I can't really think of methods that take arrays of Number classes; usually, if you start using Numbers, you are also using Collections, not plain arrays.



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

    Attachment: PIG-1551.patch

Patch attached.



[jira] Assigned: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy reassigned PIG-1551:
--------------------------------------

    Assignee: Dmitriy V. Ryaboy



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

               Status: Patch Available  (was: Open)
    Affects Version/s: 0.8.0
        Fix Version/s: 0.8.0



[jira] Commented: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Richard Ding (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12902042#action_12902042 ] 

Richard Ding commented on PIG-1551:
-----------------------------------

+1.

I'm fine with arrays of primitive types. I can't think of a Java method that takes an array of Long objects as a parameter.



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

          Status: Resolved  (was: Patch Available)
    Release Note: 
The idea is simple: frequently, Pig users need to use a simple function that is already provided by standard Java libraries, but for which a UDF has not been written. Dynamic Invokers allow a Pig programmer to refer to Java functions without having to wrap them in custom Pig UDFs, at the cost of doing some Java reflection on every function call.

{code}
DEFINE UrlDecode InvokeForString('java.net.URLDecoder.decode', 'String String');
encoded_strings = LOAD 'encoded_strings.txt' as (encoded:chararray);
decoded_strings = FOREACH encoded_strings GENERATE UrlDecode(encoded, 'UTF-8');
{code}

Currently, Dynamic Invokers can be used for any static function that accepts no arguments or some combination of Strings, ints, longs, doubles, floats, or arrays of same, and returns a String, an int, a long, a double, or a float. Numeric arguments must be primitives; the boxed numeric classes (Long, Double, etc.) are not supported as arguments. Depending on the return type, a specific kind of Invoker must be used: InvokeForString, InvokeForInt, InvokeForLong, InvokeForDouble, or InvokeForFloat.

The DEFINE keyword is used to bind a keyword to a Java method, as above. The first argument to the InvokeFor* constructor is the full path to the desired method. The second argument is a space-delimited ordered list of the classes of the method arguments. This argument can be omitted, or given as an empty string, if the method takes no arguments. Valid class names are String, Long, Float, Double, and Int. Invokers can also work with array arguments, represented in Pig as DataBags of single-tuple elements. Simply refer to string[], for example. Class names are not case-sensitive.
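
As an illustration of the argument list described above, the following Java sketch turns a space-delimited, case-insensitive list of type names into the Class objects that a reflective method lookup needs. It is not the actual Pig implementation: the class ParamSpecDemo and its methods classFor and parse are hypothetical names, and the mapping of numeric names to primitive classes is an assumption based on the "primitives only" rule stated earlier.

```java
import java.util.Locale;

class ParamSpecDemo {
    // Maps one case-insensitive type name, e.g. "Int" or "string[]", to a Class.
    // Numeric names map to primitives (an assumption, following the release note).
    static Class<?> classFor(String name) {
        switch (name.toLowerCase(Locale.ROOT)) {
            case "string":   return String.class;
            case "int":      return int.class;
            case "long":     return long.class;
            case "float":    return float.class;
            case "double":   return double.class;
            case "string[]": return String[].class;
            case "int[]":    return int[].class;
            case "long[]":   return long[].class;
            case "float[]":  return float[].class;
            case "double[]": return double[].class;
            default:
                throw new IllegalArgumentException("Unsupported type name: " + name);
        }
    }

    // Splits a space-delimited spec into parameter classes.
    // A null or blank spec stands for a method that takes no arguments.
    static Class<?>[] parse(String spec) {
        if (spec == null || spec.trim().isEmpty()) {
            return new Class<?>[0];
        }
        String[] names = spec.trim().split("\\s+");
        Class<?>[] result = new Class<?>[names.length];
        for (int i = 0; i < names.length; i++) {
            result[i] = classFor(names[i]);
        }
        return result;
    }

    public static void main(String[] args) throws NoSuchMethodException {
        // Resolves java.net.URLDecoder.decode(String, String) from its spec string.
        System.out.println(
            java.net.URLDecoder.class.getMethod("decode", parse("String String")));
    }
}
```
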

The ability to use invokers on methods that take array arguments makes methods like those in org.apache.commons.math.stat.StatUtils available for processing the results of grouping your datasets, for example. This is very nice, but a word of caution: the resulting UDF will of course not be optimized for Hadoop, and the very significant benefits one gains from implementing the Algebraic and Accumulative interfaces are lost here. Be careful with this one.
      Resolution: Fixed

Committed.
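
For readers curious how a bag of numbers might end up as a primitive array argument, here is a rough Java sketch of the idea from the release note. It is a sketch under stated assumptions, not the committed code: ArrayInvokeDemo, toDoubleArray, and invokeOnBag are hypothetical names, a plain List of Numbers stands in for a Pig DataBag of single-field tuples, and the local mean method stands in for a library function such as those in StatUtils.

```java
import java.lang.reflect.Method;
import java.util.List;

class ArrayInvokeDemo {
    // Hypothetical stand-in for a static library method that takes double[].
    public static double mean(double[] values) {
        double sum = 0;
        for (double v : values) {
            sum += v;
        }
        return values.length == 0 ? Double.NaN : sum / values.length;
    }

    // Copies boxed numbers (standing in for a bag) into a primitive array.
    static double[] toDoubleArray(List<? extends Number> bag) {
        double[] out = new double[bag.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = bag.get(i).doubleValue();
        }
        return out;
    }

    // Finds a public static method taking one double[] and invokes it reflectively.
    static double invokeOnBag(Class<?> owner, String methodName, List<? extends Number> bag) {
        try {
            Method m = owner.getMethod(methodName, double[].class);
            // Cast to Object so the array is passed as a single argument.
            return (Double) m.invoke(null, (Object) toDoubleArray(bag));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not invoke " + methodName, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(
            invokeOnBag(ArrayInvokeDemo.class, "mean", java.util.Arrays.asList(1.0, 2.0, 3.0)));
    }
}
```

Note that the whole bag is copied into an array before the call, which is consistent with the caution in the release note: nothing here is computed incrementally the way an Algebraic or Accumulative UDF would.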



[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

    Attachment: PIG_1551.2.patch

Attaching patch that fixes the two errors Richard pointed out.




[jira] Updated: (PIG-1551) Improve dynamic invokers to deal with no-arg methods and array parameters

Posted by "Dmitriy V. Ryaboy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PIG-1551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dmitriy V. Ryaboy updated PIG-1551:
-----------------------------------

    Status: Patch Available  (was: Open)
