Posted to dev@hive.apache.org by "Travis Crawford (JIRA)" <ji...@apache.org> on 2012/07/19 23:35:35 UTC

[jira] [Created] (HIVE-3279) Table schema not being copied to Partitions with no columns

Travis Crawford created HIVE-3279:
-------------------------------------

             Summary: Table schema not being copied to Partitions with no columns
                 Key: HIVE-3279
                 URL: https://issues.apache.org/jira/browse/HIVE-3279
             Project: Hive
          Issue Type: Bug
            Reporter: Travis Crawford
            Assignee: Travis Crawford


Hive has a feature where {{Partition}}s without any defined columns use the {{Table}} schema. This happens in {{[Partition.initialize|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java#L167]}}:

{code}
// set default if columns are not set
if (tPartition.getSd().getCols() == null) {
  if (table.getCols() != null) {
    tPartition.getSd().setCols(table.getCols());
  }
}
{code}

There's an issue though, because {{[Table.getEmptyTable|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java#L121]}} initializes cols to an empty list, which of course is not null, so the null check above never passes and the table schema is not copied.

I'm not sure of the fix - is there a case where cols can indeed be null? I think the best thing to do here is:

{code}
-        if (tPartition.getSd().getCols() == null) {
+        if (tPartition.getSd().getCols() == null || tPartition.getSd().getCols().size() == 0) {
{code}
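
The effect of the proposed condition can be sketched in isolation. This is only an illustration under assumptions: {{SchemaDefaulting}}, {{needsTableSchema}} and {{effectiveCols}} are hypothetical names and plain lists stand in for Hive's metadata objects, so it is not the code from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration only; not Hive's Partition class.
final class SchemaDefaulting {
  /** True when the partition has no usable column list of its own (null or empty). */
  static <T> boolean needsTableSchema(List<T> partitionCols) {
    return partitionCols == null || partitionCols.isEmpty();
  }

  /** Returns the columns the partition should end up with, given both lists. */
  static <T> List<T> effectiveCols(List<T> partitionCols, List<T> tableCols) {
    if (needsTableSchema(partitionCols) && tableCols != null) {
      return tableCols;
    }
    return partitionCols;
  }

  public static void main(String[] args) {
    List<String> tableCols = Arrays.asList("myint", "mystring");
    // Empty partition list: falls back to the table columns.
    System.out.println(effectiveCols(new ArrayList<String>(), tableCols));
    // Null partition list: also falls back, as before the change.
    System.out.println(effectiveCols(null, tableCols));
    // Explicitly defined partition columns are left untouched.
    System.out.println(effectiveCols(Arrays.asList("a"), tableCols));
  }
}
```

Treating null and empty the same way keeps the existing null behavior and leaves explicitly defined partition columns alone.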

Thoughts?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated HIVE-3279:
----------------------------------

    Attachment: HIVE-3279_serde_reported_partition_schema.1.patch
    

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420840#comment-13420840 ] 

Ashutosh Chauhan commented on HIVE-3279:
----------------------------------------

bq. Later, when initializing a Partition we check if the storage descriptor has null for its columns. It actually has the empty list copied from the empty table (not null) and we do not copy the table schema into the partition.

This is true even for regular columns, not only for serde-reported columns. How, then, does the partition get its columns from the table in the non-serde case?
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13423663#comment-13423663 ] 

Hudson commented on HIVE-3279:
------------------------------

Integrated in Hive-trunk-h0.21 #1569 (See [https://builds.apache.org/job/Hive-trunk-h0.21/1569/])
    HIVE-3279: Table schema not being copied to Partitions with no columns (Travis Crawford via Ashutosh Chauhan) (Revision 1366058)

     Result = FAILURE
hashutosh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1366058
Files : 
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java
* /hive/trunk/ql/src/test/queries/clientpositive/serde_reported_schema.q
* /hive/trunk/ql/src/test/results/clientpositive/serde_reported_schema.q.out

                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420809#comment-13420809 ] 

Ashutosh Chauhan commented on HIVE-3279:
----------------------------------------

Interesting. There seems to be no code in {{ql.metadata}} that could introduce this kind of discrepancy, so I wonder why the behavior differs between serde-reported columns and the regular case.
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420880#comment-13420880 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

In both the serde-reported & non-serde cases, the table schema is copied into the partition storage descriptor. If the schema was explicitly defined, there's no need to copy it from the table so things work correctly.

I can't actually generate a test case where the partition storage descriptor cols are null – it's either the list of explicitly defined fields, or an empty list when serde-reported.

Any ideas how to explicitly define fields for the table, but not have them copied into the partition storage descriptor?

To double-check - do you think the current serde-reported schema behavior is a bug? If so, I'm very interested in helping figure this one out. It feels like a simple issue where perhaps the table cols should be initialized to null instead of an empty list, or the empty list should be accommodated when choosing to copy the table schema.
                

[jira] [Updated] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashutosh Chauhan updated HIVE-3279:
-----------------------------------

       Resolution: Fixed
    Fix Version/s: 0.10.0
           Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Travis!
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420819#comment-13420819 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

Looking at [Table.getEmptyTable|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java#L121] we see:

{code}
sd.setCols(new ArrayList<FieldSchema>());
{code}

What I believe happens is a new empty table is created, which initializes an empty list of columns. No columns are actually set, because the serde reports them at runtime.

Later, when initializing a Partition we check if the storage descriptor has null for its columns. It actually has the empty list copied from the empty table (not null) and we do not copy the table schema into the partition.

Typically tables/partitions have an explicitly defined schema so maybe this use-case just hasn't come up? If you explicitly define the schema things work as expected.
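
The flow described above can be reproduced in miniature. This is a sketch under assumptions: {{EmptyColsDemo}}, {{Sd}}, {{emptySd}} and {{defaultColsNullOnly}} are hypothetical stand-ins rather than Hive classes, and columns are reduced to plain strings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of the flow described above; these are not Hive classes.
final class EmptyColsDemo {
  /** Stand-in for a storage descriptor that only tracks column names. */
  static final class Sd {
    private List<String> cols;
    List<String> getCols() { return cols; }
    void setCols(List<String> cols) { this.cols = cols; }
  }

  /** Mirrors the idea behind getEmptyTable: cols start as an empty list, not null. */
  static Sd emptySd() {
    Sd sd = new Sd();
    sd.setCols(new ArrayList<String>());
    return sd;
  }

  /** The original null-only check: copies the table schema only when cols is null. */
  static void defaultColsNullOnly(Sd partitionSd, List<String> tableCols) {
    if (partitionSd.getCols() == null && tableCols != null) {
      partitionSd.setCols(tableCols);
    }
  }

  public static void main(String[] args) {
    Sd partitionSd = emptySd();
    defaultColsNullOnly(partitionSd, Arrays.asList("myint", "mystring"));
    // The list is empty but non-null, so the table schema was not copied.
    System.out.println("partition cols empty: " + partitionSd.getCols().isEmpty());
  }
}
```

A descriptor whose cols were never set (null) would still pick up the table columns, which is why the problem only shows up on the empty-list path.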
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422385#comment-13422385 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

Differential review: https://reviews.facebook.net/D4329
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422958#comment-13422958 ] 

Ashutosh Chauhan commented on HIVE-3279:
----------------------------------------

+1. Running tests.
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422103#comment-13422103 ] 

Ashutosh Chauhan commented on HIVE-3279:
----------------------------------------

@Travis,
Yeah, this fix is required. Can you prepare a patch for it? Also, please include in the patch the test case from your comment at https://issues.apache.org/jira/browse/HIVE-3279?focusedCommentId=13420724&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13420724
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Ashutosh Chauhan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420289#comment-13420289 ] 

Ashutosh Chauhan commented on HIVE-3279:
----------------------------------------

bq. There's an issue though, because Table.getEmptyTable initializes cols to an empty array, which of course is not null, causing the above feature to not work as expected.

I am not sure if this really is the case.
{code}
hive> create table tt (a string) partitioned by (b string);
OK
Time taken: 7.907 seconds
hive> describe tt ;                     
OK
a	string	
b	string	
Time taken: 0.073 seconds
hive> alter table tt add partition(b='part1');
OK
Time taken: 0.848 seconds
hive> describe tt partition (b='part1');          
OK
a	string	
b	string	
Time taken: 0.071 seconds
hive> 
{code}

The above suggests that the partition did inherit the columns from the table. If the bug you described were present, wouldn't the partition lack column information?
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420724#comment-13420724 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

Applying the above change and rerunning these statements causes the partition to have dynamically-reported columns:

{code:title=Proposed behavior: partition has correct columns with above change}
hive> create external table int_string
    >   partitioned by (b string)
    >   row format serde "org.apache.hadoop.hive.serde2.thrift.ThriftDeserializer"
    >   with serdeproperties (
    >     "serialization.class"="org.apache.hadoop.hive.serde2.thrift.test.IntString",
    >     "serialization.format"="org.apache.thrift.protocol.TBinaryProtocol");
OK
Time taken: 0.085 seconds
hive> alter table int_string add partition (b='part1');
OK
Time taken: 0.128 seconds
hive> describe int_string partition (b='part1');     
OK
myint	int	from deserializer
mystring	string	from deserializer
underscore_int	int	from deserializer
b	string	
Time taken: 0.09 seconds
hive> 
{code}
                

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13421823#comment-13421823 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

Hey [~ashutoshc] - fixing this issue is pretty high priority for me. Any thoughts on next steps?
                


        

[jira] [Commented] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13420721#comment-13420721 ] 

Travis Crawford commented on HIVE-3279:
---------------------------------------

I believe the issue happens with serde-reported columns; as you pointed out, it works correctly when explicit columns are used.

Consider the following. I believe the correct behavior is for part1 to inherit the table columns. Here we see part1 only has the partition key.

{code}
hive> create external table int_string
  partitioned by (b string)
  row format serde "org.apache.hadoop.hive.serde2.thrift.ThriftDeserializer"
  with serdeproperties (
    "serialization.class"="org.apache.hadoop.hive.serde2.thrift.test.IntString",
    "serialization.format"="org.apache.thrift.protocol.TBinaryProtocol");
OK
Time taken: 0.203 seconds
hive> describe int_string;
OK
myint	int	from deserializer
mystring	string	from deserializer
underscore_int	int	from deserializer
b	string	
Time taken: 0.098 seconds
hive> alter table int_string add partition (b='part1');
OK
Time taken: 0.154 seconds
hive> describe int_string partition (b='part1');       
OK
b	string	
Time taken: 0.072 seconds
hive> 
{code}
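
For illustration, here is a minimal sketch of the null-or-empty check proposed in the issue description. It does not use Hive's real classes: {{Sd}}, {{inheritCols}} and {{ColsFallbackSketch}} are hypothetical stand-ins, and columns are modeled as plain strings. It only shows why an empty (non-null) column list skips the fallback under the original condition and how the widened condition would cover it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColsFallbackSketch {

    // Hypothetical stand-in for a storage descriptor; NOT Hive's StorageDescriptor.
    static class Sd {
        private List<String> cols;

        Sd(List<String> cols) {
            this.cols = cols;
        }

        List<String> getCols() {
            return cols;
        }

        void setCols(List<String> cols) {
            this.cols = cols;
        }
    }

    // Copies the table columns into the partition descriptor only when the
    // partition has no columns of its own, where "no columns" means null OR empty.
    static void inheritCols(Sd partitionSd, List<String> tableCols) {
        List<String> cols = partitionSd.getCols();
        if (cols == null || cols.isEmpty()) {
            if (tableCols != null) {
                partitionSd.setCols(tableCols);
            }
        }
    }

    public static void main(String[] args) {
        List<String> tableCols = Arrays.asList("myint", "mystring", "underscore_int");

        // Partition whose cols were initialized to an empty list, as described in the issue.
        Sd emptyPartition = new Sd(new ArrayList<String>());
        inheritCols(emptyPartition, tableCols);
        System.out.println(emptyPartition.getCols());

        // Partition with explicitly defined columns keeps its own schema.
        Sd explicitPartition = new Sd(new ArrayList<String>(Arrays.asList("a")));
        inheritCols(explicitPartition, tableCols);
        System.out.println(explicitPartition.getCols());
    }
}
```

With only the {{== null}} test, the first partition above would keep its empty list, which matches the {{describe}} output showing just the partition key.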
                


        

[jira] [Updated] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated HIVE-3279:
----------------------------------

    Description: 
Hive has a feature where {{Partition}}'s without any defined columns use the {{Table}} schema. This happens in {{[Partition.initialize|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java#L167]}}

{code}
// set default if columns are not set
if (tPartition.getSd().getCols() == null) {
  if (table.getCols() != null) {
    tPartition.getSd().setCols(table.getCols());
  }
}
{code}

There's an issue though, because {{[Table.getEmptyTable|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java#L121]}} initializes cols to an empty array, which of course is not null, causing the above feature to not work as expected.

I'm not sure of the fix - is there a case where cols can indeed be null? I think the best thing to do here is:

{code}
-        if (tPartition.getSd().getCols() == null) {
+        if (tPartition.getSd().getCols() == null || tPartition.getSd().getCols().size() == 0) {
{code}

Thoughts?

  was:
Hive has a feature where {{Partition}}'s without any defined columns use the {{Table}} schema. This happens in {{[Partition.initialize|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java#L167}}

{code}
// set default if columns are not set
if (tPartition.getSd().getCols() == null) {
  if (table.getCols() != null) {
    tPartition.getSd().setCols(table.getCols());
  }
}
{code}

There's an issue though, because {{[Table.getEmptyTable|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java#L121]}} initializes cols to an empty array, which of course is not null, causing the above feature to not work as expected.

I'm not sure of the fix - is there a case where cols can indeed be null? I think the best thing to do here is:

{code}
-        if (tPartition.getSd().getCols() == null) {
+        if (tPartition.getSd().getCols() == null || tPartition.getSd().getCols().size() == 0) {
{code}

Thoughts?

    


        

[jira] [Updated] (HIVE-3279) Table schema not being copied to Partitions with no columns

Posted by "Travis Crawford (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-3279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Travis Crawford updated HIVE-3279:
----------------------------------

    Status: Patch Available  (was: Open)
    
> Table schema not being copied to Partitions with no columns
> -----------------------------------------------------------
>
>                 Key: HIVE-3279
>                 URL: https://issues.apache.org/jira/browse/HIVE-3279
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Travis Crawford
>            Assignee: Travis Crawford
>         Attachments: HIVE-3279_serde_reported_partition_schema.1.patch
>
>
> Hive has a feature where {{Partition}}'s without any defined columns use the {{Table}} schema. This happens in {{[Partition.initialize|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Partition.java#L167]}}
> {code}
> // set default if columns are not set
> if (tPartition.getSd().getCols() == null) {
>   if (table.getCols() != null) {
>     tPartition.getSd().setCols(table.getCols());
>   }
> }
> {code}
> There's an issue though, because {{[Table.getEmptyTable|https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Table.java#L121]}} initializes cols to an empty array, which of course is not null, causing the above feature to not work as expected.
> I'm not sure of the fix - is there a case where cols can indeed be null? I think the best thing to do here is:
> {code}
> -        if (tPartition.getSd().getCols() == null) {
> +        if (tPartition.getSd().getCols() == null || tPartition.getSd().getCols().size() == 0) {
> {code}
> Thoughts?
