Posted to dev@hive.apache.org by "Adam Kramer (JIRA)" <ji...@apache.org> on 2011/06/21 20:29:47 UTC

[jira] [Created] (HIVE-2231) Column aliases

Column aliases
--------------

                 Key: HIVE-2231
                 URL: https://issues.apache.org/jira/browse/HIVE-2231
             Project: Hive
          Issue Type: Wish
          Components: Query Processor
            Reporter: Adam Kramer
            Priority: Trivial


It would be nice in several cases to be able to alias column names.

Say someone in your company CREATEd a TABLE called important_but_named_poorly (alvin BIGINT, theodore BIGINT, simon STRING) PARTITIONED BY (dave STRING), that indexes the relationship between an actor (alvin), a target (theodore), and the interaction between them (simon), partitioned based on the date string (dave). Renaming the columns would break a million pipelines that are important but ownerless.

It would be awesome to be able to define aliases like so:

ALTER TABLE important_but_named_poorly REPLACE COLUMNS (actor BIGINT AKA alvin, target BIGINT AKA theodore, ixn STRING AKA simon) PARTITIONED BY (ds STRING AKA dave);

...which would mean that any user could, e.g., use the term "dave" to refer to ds if they really wanted to.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HIVE-2231) Column aliases

Posted by "Adam Kramer (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071412#comment-13071412 ] 

Adam Kramer commented on HIVE-2231:
-----------------------------------

The use case here is basically backwards compatibility. Long-time users and new users alike work with the same table and want to refer to it as such; it is the canonical table.

But sometimes the columns were originally given crummy names, and it'd be better and cleaner to document the appropriate names and train new people on them.

Views eat up the namespace and provide a level of misdirection that is not always desirable, but here are the two biggest limitations of views:
* SELECT * is not fast. I can't SELECT * on a view and get data back immediately, the way I would by running the same query against the underlying table. This is true even when the schemas are exactly the same.
* Partitions are not see-through. I can't use "show partitions" on a view, or write any automated system based on the view to identify when new partitions land. That forces a reference back to the original table, and then all is lost.




[jira] [Commented] (HIVE-2231) Column aliases

Posted by "John Sichi (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HIVE-2231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052751#comment-13052751 ] 

John Sichi commented on HIVE-2231:
----------------------------------

Since views are already a standard way of addressing this, wouldn't it be better to put effort into fixing any limitations there?
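
For reference, the view-based approach mentioned above might look roughly like the following. This is only a sketch: the view name important_named_well and the date literal are made up for illustration, and the column names are taken from the example in the issue description.

```sql
-- Hypothetical view exposing the poorly named columns under cleaner names.
-- `important_named_well` is an assumed name, not something from the issue.
CREATE VIEW important_named_well AS
SELECT
  alvin    AS actor,
  theodore AS target,
  simon    AS ixn,
  dave     AS ds
FROM important_but_named_poorly;

-- New users query the view with the cleaner names
-- (the date value here is purely illustrative).
SELECT actor, target, ixn
FROM important_named_well
WHERE ds = '2011-06-21';
```

Existing pipelines would keep querying important_but_named_poorly with the original column names, so nothing has to be renamed in place.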

