Posted to dev@hive.apache.org by "Johan Oskarsson (JIRA)" <ji...@apache.org> on 2009/01/13 11:09:03 UTC

[jira] Commented: (HIVE-126) Don't fetch information on Partitions from HDFS instead of MetaStore

    [ https://issues.apache.org/jira/browse/HIVE-126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12663286#action_12663286 ] 

Johan Oskarsson commented on HIVE-126:
--------------------------------------

Since HIVE-142 is committed and the code in this ticket has been committed by accident, perhaps it's best to just update the CHANGES file with the ticket information and close this one?

> Don't fetch information on Partitions from HDFS instead of MetaStore
> --------------------------------------------------------------------
>
>                 Key: HIVE-126
>                 URL: https://issues.apache.org/jira/browse/HIVE-126
>             Project: Hadoop Hive
>          Issue Type: Improvement
>          Components: Metastore
>            Reporter: Johan Oskarsson
>            Assignee: Johan Oskarsson
>             Fix For: 0.2.0
>
>         Attachments: HIVE-126.patch, HIVE-126.patch
>
>
> When investigating HIVE-91 an issue came up where the information on what partitions a table contains is loaded by listing the directories in the table directory on HDFS. This is then used to overrule what is in the MetaStore if any difference is found. 
> * Would it not be preferable if MetaStore is the one authority on what the table contains?
> * It will also be a major hassle (or impossible?) to retrieve this information from HDFS for external tables that have non-standard partition names (HIVE-91), such as table/2008/01/08/portugal, where "2008/01/08" is one partition value and "portugal" is another.
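
To illustrate the second point, here is a minimal sketch (not Hive code) of why the directory layout alone cannot settle the partition values. The function name candidate_partition_values and its parameters are hypothetical; it simply enumerates every way the path components could be grouped into a given number of partition columns.

```python
from itertools import combinations


def candidate_partition_values(relative_path, num_partition_columns):
    """Return every way to group path components into partition values.

    Hypothetical helper for illustration only; it does not reflect how
    Hive or the MetaStore actually parse partition directories.
    """
    parts = relative_path.strip("/").split("/")
    n = len(parts)
    if num_partition_columns < 1 or num_partition_columns > n:
        return []
    results = []
    # Pick the positions between adjacent components where one
    # partition value ends and the next one begins.
    for cuts in combinations(range(1, n), num_partition_columns - 1):
        bounds = (0,) + cuts + (n,)
        results.append(
            ["/".join(parts[a:b]) for a, b in zip(bounds, bounds[1:])]
        )
    return results


# Path below the table directory, taken from the example above.
candidates = candidate_partition_values("2008/01/08/portugal", 2)
for values in candidates:
    print(values)
```

With two partition columns there are several equally plausible groupings, and only metadata stored elsewhere (for example in the MetaStore) can say that "2008/01/08" and "portugal" is the intended one.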

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


RE: [jira] Commented: (HIVE-126) Don't fetch information on Partitions from HDFS instead of MetaStore

Posted by Joydeep Sen Sarma <js...@facebook.com>.
This makes sense. We should allow this as an option; it was discussed initially in HIVE-91 (inferring partitioning information from HDFS).

-----Original Message-----
From: Fred Oko [mailto:foko@hi5.com] 
Sent: Wednesday, January 21, 2009 5:48 PM
To: hive-dev@hadoop.apache.org
Subject: Re: [jira] Commented: (HIVE-126) Don't fetch information on Partitions from HDFS instead of MetaStore

Coming into this a tad late, but we leveraged the simplicity of the original
state of this code, i.e. "// let's trust hdfs partitions for now". Is it
possible to make this configurable? Basically, we came up with a design that
treats Hive solely as a post-processing overlay: we feed data into HDFS
directly but are fine with using the partitioning scheme Hive expects.
We could switch to HIVE-91, but for our current purposes integrating Hive
into the data load stage adds no value.




Re: [jira] Commented: (HIVE-126) Don't fetch information on Partitions from HDFS instead of MetaStore

Posted by Fred Oko <fo...@hi5.com>.
Coming into this a tad late, but we leveraged the simplicity of the original
state of this code, i.e. "// let's trust hdfs partitions for now". Is it
possible to make this configurable? Basically, we came up with a design that
treats Hive solely as a post-processing overlay: we feed data into HDFS
directly but are fine with using the partitioning scheme Hive expects.
We could switch to HIVE-91, but for our current purposes integrating Hive
into the data load stage adds no value.
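
A rough sketch of what such a switch might look like, purely as an assumption and not as existing Hive behaviour: the function resolve_partitions and the flag trust_hdfs are made-up names, and the partition names are illustrative values in the usual key=value directory style.

```python
def resolve_partitions(metastore_partitions, hdfs_directories, trust_hdfs=False):
    """Pick the authoritative partition list and report any disagreement.

    Hypothetical sketch: trust_hdfs stands in for a configuration option
    that does not necessarily exist in Hive under this or any other name.
    """
    metastore = set(metastore_partitions)
    hdfs = set(hdfs_directories)
    # The flag decides which source wins when the two differ.
    authoritative = hdfs if trust_hdfs else metastore
    return {
        "partitions": sorted(authoritative),
        "only_in_hdfs": sorted(hdfs - metastore),
        "only_in_metastore": sorted(metastore - hdfs),
    }


# Data was written straight into HDFS, so the MetaStore lags behind.
in_metastore = ["ds=2009-01-12"]
on_hdfs = ["ds=2009-01-12", "ds=2009-01-13"]

print(resolve_partitions(in_metastore, on_hdfs, trust_hdfs=True))
print(resolve_partitions(in_metastore, on_hdfs, trust_hdfs=False))
```

Reporting the differences alongside the chosen list would let either mode warn about partitions that exist in only one place instead of silently overruling the other source.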

