Posted to dev@hive.apache.org by "David Worms (Created) (JIRA)" <ji...@apache.org> on 2012/03/07 02:11:58 UTC

[jira] [Created] (HIVE-2843) UDAF to convert an aggregation to a map

UDAF to convert an aggregation to a map
---------------------------------------

                 Key: HIVE-2843
                 URL: https://issues.apache.org/jira/browse/HIVE-2843
             Project: Hive
          Issue Type: New Feature
          Components: UDF
    Affects Versions: 0.9.0
            Reporter: David Worms
            Priority: Minor


I propose the addition of two new Hive UDAFs to help with maps in Apache Hive. The source code is available on GitHub at https://github.com/wdavidw/hive-udf in two Java classes: "UDAFToMap" and "UDAFToOrderedMap". The first function converts an aggregation into a map and internally uses a Java `HashMap`. The second function extends the first one. It converts an aggregation into an ordered map and internally uses a Java `TreeMap`. They both extend the `AbstractGenericUDAFResolver` class.

Also, I have covered the motivations and usage of these UDAFs in a blog post at http://adaltas.com/blog/2012/03/06/hive-udaf-map-conversion/

If you are interested in my proposal, I'll take the time to update this issue, following the guidelines posted on the wiki to create an appropriate patch.
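
To illustrate the difference between the two functions described above, here is a minimal sketch in plain Java. It does not use the Hive API or the code from the GitHub repository: the class `MapAggregationSketch` and its methods `iterate`, `merge` and `terminate` are hypothetical names chosen to mirror the usual stages of an aggregation. It also assumes that when the same key appears twice the later value wins, which may differ from what the actual UDAFs do.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/**
 * Hypothetical stand-in for a map-building aggregation buffer.
 * This is NOT the Hive UDAF API; it only models the idea in plain Java.
 */
final class MapAggregationSketch<K, V> {
    private final Map<K, V> buffer;

    private MapAggregationSketch(Map<K, V> buffer) {
        this.buffer = buffer;
    }

    /** Backed by a HashMap: no guarantee on key order. */
    static <K, V> MapAggregationSketch<K, V> unordered() {
        return new MapAggregationSketch<K, V>(new HashMap<K, V>());
    }

    /** Backed by a TreeMap: keys are kept in their natural order. */
    static <K extends Comparable<K>, V> MapAggregationSketch<K, V> ordered() {
        return new MapAggregationSketch<K, V>(new TreeMap<K, V>());
    }

    /** Consume one (key, value) row. Assumption: a later value replaces an earlier one. */
    void iterate(K key, V value) {
        buffer.put(key, value);
    }

    /** Fold another partial aggregation into this one. */
    void merge(MapAggregationSketch<K, V> other) {
        buffer.putAll(other.buffer);
    }

    /** Return the final map for the group. */
    Map<K, V> terminate() {
        return buffer;
    }

    public static void main(String[] args) {
        MapAggregationSketch<String, Integer> first = MapAggregationSketch.ordered();
        first.iterate("c", 3);
        first.iterate("a", 1);

        MapAggregationSketch<String, Integer> second = MapAggregationSketch.ordered();
        second.iterate("b", 2);

        first.merge(second);
        System.out.println(first.terminate()); // prints {a=1, b=2, c=3}
    }
}
```

With the `TreeMap` variant the keys come out sorted without a separate sorting step, whereas the `HashMap` variant makes no promise about iteration order.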

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "Thiruvel Thirumoolan (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242286#comment-13242286 ] 

Thiruvel Thirumoolan commented on HIVE-2843:
--------------------------------------------

@David,

It would be nice to get this in. It would also help when people import data from an RDBMS or other stores and want to save space by converting first-level columns into complex types, and vice versa using a UDTF (explode()). Would naming it 'implode' help?
                
> UDAF to convert an aggregation to a map
> ---------------------------------------
>
>                 Key: HIVE-2843
>                 URL: https://issues.apache.org/jira/browse/HIVE-2843
>             Project: Hive
>          Issue Type: New Feature
>          Components: UDF
>    Affects Versions: 0.9.0
>            Reporter: David Worms
>            Priority: Minor
>              Labels: features, udf


        

[jira] [Updated] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "David Worms (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Worms updated HIVE-2843:
------------------------------

    Affects Version/s: 0.10.0
               Status: Patch Available  (was: Open)
    

        

[jira] [Updated] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "David Worms (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Worms updated HIVE-2843:
------------------------------

    Attachment: HIVE-2843.1.patch.txt
    

        

[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "David Worms (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417872#comment-13417872 ] 

David Worms commented on HIVE-2843:
-----------------------------------

Thanks for offering your help. I should have a little time this week, so I'll try to follow the recommendations posted on the wiki.
                

        

[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "David Worms (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13430324#comment-13430324 ] 

David Worms commented on HIVE-2843:
-----------------------------------

The patch is now uploaded; let me know if it needs to be modified.
d.
                

        

[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "Philip Tromans (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417144#comment-13417144 ] 

Philip Tromans commented on HIVE-2843:
--------------------------------------

Hi David,

I think that this would be a really useful addition to Hive. I was just about to write the same UDAF when I came across yours. I think implode_to_map is a good name for it, because as you say, implode has quite a few meanings, but I'm interested in what others have to say.

I don't mind preparing / submitting the patch / test cases if you're working on other things.

Cheers,

Phil.
                

        

[jira] [Commented] (HIVE-2843) UDAF to convert an aggregation to a map

Posted by "David Worms (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242373#comment-13242373 ] 

David Worms commented on HIVE-2843:
-----------------------------------

Thiruvel,
I was thinking of naming it "implode" at first. Then I stepped back for two reasons:
1. There are potentially different map implementations. For example, I personally make intensive use of the toOrderedMap UDAF, which in my case is much faster than using an 'order by' clause in my query.
2. implode is not restricted to maps, and would probably be more appropriate for an array conversion.
What do you think?
                