Posted to dev@hive.apache.org by "Lars Francke (JIRA)" <ji...@apache.org> on 2011/03/30 11:32:05 UTC

[jira] [Created] (HIVE-2085) Document GenericUD(A|T)F

Document GenericUD(A|T)F
------------------------

                 Key: HIVE-2085
                 URL: https://issues.apache.org/jira/browse/HIVE-2085
             Project: Hive
          Issue Type: Improvement
          Components: Documentation
            Reporter: Lars Francke
            Priority: Minor


GenericUDFs are very poorly documented. This includes everything they relate to:

  * ObjectInspector (JavaDoc not really helpful for someone not familiar with Hive)
  * ObjectInspectorFactory
  * ObjectInspectorConverters
  * ...

An example would help, as would a unit test for one of the built-in GenericUDFs. Writing a normal UDF is pretty well documented, but GenericUDFs (and UDTF/UDAF) require more knowledge about the inner workings of Hive, and that could be documented better.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Resolved] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Edward Capriolo resolved HIVE-2085.
-----------------------------------

    Resolution: Won't Fix

What are you saying? Do you want more JavaDoc? Hive currently uses a wiki that all are free to edit. I do not think we should be opening Jira issues for documentation. Hive already has enough issues assigned to no one, to be completed never.

[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013631#comment-13013631 ] 

Edward Capriolo commented on HIVE-2085:
---------------------------------------

I do not care that much. I just dislike seeing things stay open forever that no one is going to work on.
https://issues.apache.org/jira/browse/HIVE-29


[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Patrick Angeles (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013545#comment-13013545 ] 

Patrick Angeles commented on HIVE-2085:
---------------------------------------

FWIW, +1 on better documentation, particularly on extension points like UD*Fs, SerDes and StorageHandlers. These are things that are worth taking on up front because doing so increases user engagement and adoption and reduces the burden of support/education on core committers in the long term.

[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Lars Francke (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13012997#comment-13012997 ] 

Lars Francke commented on HIVE-2085:
------------------------------------

What's the "Documentation" component for then?

The Wiki is nice, but it's currently not a very good source of information beyond the very basic things. It is also incomplete and, I guess, probably not up to date on every page.

The nature of this issue also requires someone with more knowledge about Hive than a regular user to take a look at it. So I think it's a bad idea to close this issue just because there are other unassigned issues. This is an issue tracker after all.

[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013021#comment-13013021 ] 

Edward Capriolo commented on HIVE-2085:
---------------------------------------

Lars,
I understand that no one likes to see issues closed as "WONT FIX" (I am not trying to be snotty). Hive currently has hundreds of open issues. Opening an issue like "Document X" is vague. You are correct in saying this is an issue tracker, but it is quite common to first come onto IRC or the ML and discuss the feature you want.

IMHO, what this boils down to is that if the developers had more time to document, they would. If we had to open an issue for each thing that needed more documentation, Jira would be unusable (many things need documentation).
Generally, if the user submitting the request is not willing to assign it to themselves, there is little chance of it getting done by anyone else (as evidenced by the number of open unassigned tickets). These issues should be actionable in the near term. If no one is going to actively work on the issue, we do not get anything from having a ticket open on it.

[jira] [Reopened] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Jeff Hammerbacher (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Hammerbacher reopened HIVE-2085:
-------------------------------------


It's ridiculous to close an issue as "Won't Fix" that many people think should be fixed.

[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Edward Capriolo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013549#comment-13013549 ] 

Edward Capriolo commented on HIVE-2085:
---------------------------------------

The best way to learn about these features currently is to look through the unit tests, code, and .q files. There are other UDFs, like atan, that are easy to follow. There is a UDTF, for example, that splits a URL into parts. Looking at the split() or case() UDFs gives you an idea of some of the more complex things that can be done. Looking at the struct() or list() UDFs shows you a lot about how to use object inspectors to detect and return different types. If you want to come on IRC, I can help with specific questions.

[jira] [Commented] (HIVE-2085) Document GenericUD(A|T)F

Posted by "Lars Francke (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13013026#comment-13013026 ] 

Lars Francke commented on HIVE-2085:
------------------------------------

I tried to add details about what should be documented and why, so I had hoped it wouldn't be vague. Let me know what needs clarifying and I'll gladly do it.

I've been on IRC and the mailing list, and we've asked questions about these things and tried to figure it out on our own; that's why I decided to open the ticket.

I didn't assign the issue to myself because I don't feel like I have any idea about what's going on there. And I also don't agree on the value of open and unassigned tickets. They give an overview of what still needs to be done, and perhaps one of these days someone's going to focus on documentation, and I think it would be helpful then to know what's missing.

But I'll leave this issue alone now to not take any more time. Thanks for looking at it anyway. One of your GenericUDFs can be found on Google and that's about the only documentation/example I could find so you've already helped :)
