Posted to user@hive.apache.org by Yang <te...@gmail.com> on 2010/07/02 01:42:39 UTC

Schema evolution?

I read in the VLDB Hive paper, "Hive - A Warehousing Solution Over a
Map-Reduce Framework", that partitions can have different schemas
(section 3.1, MetaStore):

"Partition - Each partition can have its own columns
and SerDe and storage information. This can be used
in the future to support schema evolution in a Hive
warehouse."

But the API:
http://hadoop.apache.org/hive/docs/r0.5.0/api/

only lists getSchema() for Table; Partition does not have a
separate getSchema().

Is the schema evolution feature really there?

Thanks
Yang

Re: Schema evolution?

Posted by Yang <te...@gmail.com>.
Paul:

Thanks.
Currently I do not need this feature from Hive QL; I just need it in the metastore.

You said "There exist structures for supporting this in the
metastore". Could you please give more details? I suppose the
interface to the metastore is basically classes like Table and Partition, but
in the Partition API
http://hadoop.apache.org/hive/docs/r0.5.0/api/org/apache/hadoop/hive/metastore/api/Partition.html
I don't see any reference to a schema.

Yang

On Thu, Jul 1, 2010 at 4:56 PM, Paul Yang <py...@facebook.com> wrote:
> There exist structures for supporting this in the metastore, but that feature isn't in Hive yet. For example, although the metadata for each partition includes its own set of columns, parts of the code in the query processor still read from table-level metadata.
>
> Some evolution can occur in the form of adding columns to a table.

RE: Schema evolution?

Posted by Paul Yang <py...@facebook.com>.
There exist structures for supporting this in the metastore, but that feature isn't in Hive yet. For example, although the metadata for each partition includes its own set of columns, parts of the code in the query processor still read from table-level metadata.

Some evolution can occur in the form of adding columns to a table.
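
As a rough illustration of the distinction above, here is a minimal Java sketch. It assumes a partition records the columns it was written with, while reads go through the table's current column list. Column, TableMeta, PartitionMeta and readRow are hypothetical stand-ins invented for this example; they are not the org.apache.hadoop.hive.metastore.api classes and do not show how Hive's query processor is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types for illustration only (not Hive's metastore classes).
final class Column {
    final String name;
    final String type;
    Column(String name, String type) { this.name = name; this.type = type; }
}

final class TableMeta {
    final List<Column> columns; // current table-level columns
    TableMeta(List<Column> columns) { this.columns = columns; }
}

final class PartitionMeta {
    final String spec;          // e.g. "ds=2010-06-30"
    final List<Column> columns; // columns the partition was written with
    PartitionMeta(String spec, List<Column> columns) {
        this.spec = spec;
        this.columns = columns;
    }
}

final class SchemaEvolutionSketch {
    // Position of a column name in a column list, or -1 if absent.
    static int indexOf(List<Column> cols, String name) {
        for (int i = 0; i < cols.size(); i++) {
            if (cols.get(i).name.equals(name)) {
                return i;
            }
        }
        return -1;
    }

    // Projects a row stored with the partition's columns onto the table's
    // current columns. Columns added to the table after the partition was
    // written have no stored value, so they come back as null.
    static List<Object> readRow(TableMeta table, PartitionMeta part, List<Object> storedRow) {
        List<Object> out = new ArrayList<>();
        for (Column c : table.columns) {
            int idx = indexOf(part.columns, c.name);
            out.add(idx >= 0 && idx < storedRow.size() ? storedRow.get(idx) : null);
        }
        return out;
    }

    public static void main(String[] args) {
        // The table gained a "country" column after the old partition was written.
        TableMeta table = new TableMeta(Arrays.asList(
                new Column("id", "int"),
                new Column("name", "string"),
                new Column("country", "string")));
        PartitionMeta oldPart = new PartitionMeta("ds=2010-06-30", Arrays.asList(
                new Column("id", "int"),
                new Column("name", "string")));
        System.out.println(readRow(table, oldPart, Arrays.<Object>asList(1, "alice")));
    }
}
```

In this sketch, adding a column only changes the table-level list; older partitions keep their own column lists and are padded on read, which is the limited kind of evolution mentioned above.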
