You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "He Yongqiang (JIRA)" <ji...@apache.org> on 2011/08/26 20:03:33 UTC

[jira] [Created] (HIVE-2415) disallow partition column names when doing replace columns

disallow partition column names when doing replace columns
----------------------------------------------------------

                 Key: HIVE-2415
                 URL: https://issues.apache.org/jira/browse/HIVE-2415
             Project: Hive
          Issue Type: Bug
            Reporter: He Yongqiang
            Assignee: He Yongqiang


alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 

We should disallow this. 


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "He Yongqiang (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-2415:
-------------------------------

    Status: Open  (was: Patch Available)
    
> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096131#comment-13096131 ] 

Ashutosh Chauhan commented on HIVE-2415:
----------------------------------------

@Ning,

I understand that this patch is not introducing any more calls to the metastore but moves them from execution time to compile time. What I am suggesting is insteading of moving those checks from DDLTask to parser, move them to HiveMetaStore. That way they will still be in compile time only but with only one network call to metastore. 

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-2415:
-------------------------------

    Attachment: HIVE-2415.1.patch

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13100473#comment-13100473 ] 

Ashutosh Chauhan commented on HIVE-2415:
----------------------------------------

We discussed this a bit in yesterday's contributor meeting. The general view was all of these checks make sense in HiveMetaStore. For the backdoor you are describing, John suggested there could be an "admin" mode which will skip these checks in metastore. So, to make changes generally not permitted you will run in "admin" mode and all other times you will run in regular user mode. 

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

He Yongqiang updated HIVE-2415:
-------------------------------

    Status: Patch Available  (was: Open)

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094094#comment-13094094 ] 

He Yongqiang commented on HIVE-2415:
------------------------------------

@Ashutosh Chauhan, today it is doing 2 metastore calls. one is in DDLSemanticAnalyzer, and the other is in DDLTask. Merging these 2 (check and change) to metastore server will save one metastore call, but add more load to metastore. Since this is only for a DDL command, it should be fine.


> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093446#comment-13093446 ] 

Ashutosh Chauhan commented on HIVE-2415:
----------------------------------------

Won't doing validation in parser will result in two network calls, first to fetch the current column info and then update. I am wondering if it will be better to do one call to metastore which will first validate the columns and then do the update.

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "He Yongqiang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13098414#comment-13098414 ] 

He Yongqiang commented on HIVE-2415:
------------------------------------

@Ashutosh, yeah, i understand your point of moving the validation from client to metastore server. There is another concern is that we want the hive metastore have much more flexibility than the client side, so if something goes wrong for any reason, we can use thrift metastore interface to fix it. For example, if a table is somehow has a normal column whose name conflicts with a partition column, we won't be able to fix it if we do validation on the metastore side.

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092565#comment-13092565 ] 

jiraposter@reviews.apache.org commented on HIVE-2415:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1672/
-----------------------------------------------------------

Review request for hive and Ning Zhang.


Summary
-------

move validation of renaming/replacing columns from runtime to parser


This addresses bug HIVE-2415.
    https://issues.apache.org/jira/browse/HIVE-2415


Diffs
-----

  trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1162190 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1162190 
  trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1162190 
  trunk/ql/src/test/queries/clientnegative/replace_columns.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/replace_columns_2.q PRE-CREATION 
  trunk/ql/src/test/queries/clientnegative/replace_columns_3.q PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/altern1.q.out 1162190 
  trunk/ql/src/test/results/clientnegative/column_rename1.q.out 1162190 
  trunk/ql/src/test/results/clientnegative/column_rename2.q.out 1162190 
  trunk/ql/src/test/results/clientnegative/column_rename4.q.out 1162190 
  trunk/ql/src/test/results/clientnegative/replace_columns.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/replace_columns_2.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/replace_columns_3.q.out PRE-CREATION 
  trunk/ql/src/test/results/clientnegative/replace_columns_4.q.out PRE-CREATION 
  trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java 1162190 

Diff: https://reviews.apache.org/r/1672/diff


Testing
-------


Thanks,

Yongqiang



> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094238#comment-13094238 ] 

Ashutosh Chauhan commented on HIVE-2415:
----------------------------------------

CPU cycles spent in doing validation is lot less then setting up a connection and tearing it down, not to mention extra roundtrips and bytes over the network. I don't see how it is not worse. 

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093498#comment-13093498 ] 

jiraposter@reviews.apache.org commented on HIVE-2415:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1672/#review1686
-----------------------------------------------------------



trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java
<https://reviews.apache.org/r/1672/#comment3842>

    remove TAB



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
<https://reviews.apache.org/r/1672/#comment3844>

    tab



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
<https://reviews.apache.org/r/1672/#comment3843>

    tab



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
<https://reviews.apache.org/r/1672/#comment3845>

    does this metastore object change causes the mapped DB table change? it looks dangerous here. 



trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java
<https://reviews.apache.org/r/1672/#comment3846>

    why we need to clear deserializer here? Is it to make sure column names are not from deserializer?


- Ning


On 2011-08-28 23:14:12, Yongqiang He wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1672/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-28 23:14:12)
bq.  
bq.  
bq.  Review request for hive and Ning Zhang.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  move validation of renaming/replacing columns from runtime to parser
bq.  
bq.  
bq.  This addresses bug HIVE-2415.
bq.      https://issues.apache.org/jira/browse/HIVE-2415
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/DDLTask.java 1162190 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java 1162190 
bq.    trunk/ql/src/java/org/apache/hadoop/hive/ql/parse/DDLSemanticAnalyzer.java 1162190 
bq.    trunk/ql/src/test/queries/clientnegative/replace_columns.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientnegative/replace_columns_2.q PRE-CREATION 
bq.    trunk/ql/src/test/queries/clientnegative/replace_columns_3.q PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/altern1.q.out 1162190 
bq.    trunk/ql/src/test/results/clientnegative/column_rename1.q.out 1162190 
bq.    trunk/ql/src/test/results/clientnegative/column_rename2.q.out 1162190 
bq.    trunk/ql/src/test/results/clientnegative/column_rename4.q.out 1162190 
bq.    trunk/ql/src/test/results/clientnegative/replace_columns.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/replace_columns_2.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/replace_columns_3.q.out PRE-CREATION 
bq.    trunk/ql/src/test/results/clientnegative/replace_columns_4.q.out PRE-CREATION 
bq.    trunk/serde/src/java/org/apache/hadoop/hive/serde2/SerDeUtils.java 1162190 
bq.  
bq.  Diff: https://reviews.apache.org/r/1672/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Yongqiang
bq.  
bq.



> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (HIVE-2415) disallow partition column names when doing replace columns

Posted by "Ning Zhang (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-2415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13094831#comment-13094831 ] 

Ning Zhang commented on HIVE-2415:
----------------------------------

Ashutosh, we are doing metastore connections in DDLSemanticAnalyzer even before Yongqiang's patch (see DDLSemanticAnalyzer.java:1712). The extra JDO query in DDLSemanticAnalyzer introduced in this patch is to get the columns info (in SD), which was moved from the DDLTask. 

So this patch should not introduce any extra metastore connections or network traffic. It just moved some metastore access from execution time to compile time, which I think is an improvement. The reason is that the error it tries to handle is a kind of "semantic" error which should be handled in DDLSemanticAnalyzer, rather than execution error. What do you think? 

> disallow partition column names when doing replace columns
> ----------------------------------------------------------
>
>                 Key: HIVE-2415
>                 URL: https://issues.apache.org/jira/browse/HIVE-2415
>             Project: Hive
>          Issue Type: Bug
>            Reporter: He Yongqiang
>            Assignee: He Yongqiang
>         Attachments: HIVE-2415.1.patch
>
>
> alter table replace columns allows to add a column with the same name as partition column, which introduced inconsistency. 
> We should disallow this. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira