Posted to user@hive.apache.org by Josh Ferguson <jo...@besquared.net> on 2009/01/27 00:05:53 UTC

Migration Strategy

What's the current strategy for when you have a production system and you
realize you need to add another column to the table or do some other thing?
Seems like you'd have to make a new table, run a script to transform and
load all your old data to the new table, and then remove the old table. Is
this what is currently being done?
Josh F.

Re: Migration Strategy

Posted by Zheng Shao <zs...@gmail.com>.
Hi Josh,

DynamicSerDe with TCTLSeparatedProtocol will also treat columns missing from
the data as NULL.

Basically, if you create the table without specifying the SerDe or protocol,
then it should be OK to add a new column to the schema; for old data, that
new column will be NULL.


Zheng

On Mon, Jan 26, 2009 at 3:31 PM, Ashish Thusoo <at...@facebook.com> wrote:

>  If you are adding a column at the end of the table, you should be OK
> leaving the old data as it is, provided the table was created with
> MetadataTypedColumnSetSerDe (I am not sure what happens with DynamicSerDe).
> MetadataTypedColumnSetSerDe interprets missing columns at the end as NULLs
> in the old data. Note that this only works when adding columns at the end
> without changing names...
>
> Ashish
>



-- 
Yours,
Zheng

RE: Migration Strategy

Posted by Ashish Thusoo <at...@facebook.com>.
If you are adding a column at the end of the table, you should be OK leaving the old data as it is, provided the table was created with MetadataTypedColumnSetSerDe (I am not sure what happens with DynamicSerDe). MetadataTypedColumnSetSerDe interprets missing columns at the end as NULLs in the old data. Note that this only works when adding columns at the end without changing names...

Ashish
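
To illustrate why this works only for columns added at the end, here is a minimal Python sketch of positional parsing for delimiter-separated rows. It is not Hive code: parse_row, the column names, and the sample row are hypothetical, and the Ctrl-A separator is assumed purely for illustration. It only mimics the behavior described above, where fields are matched to columns by position and missing trailing fields come back as null.

```python
# Illustrative sketch only: mimics positional parsing of delimited rows.
# parse_row and the column names are hypothetical, not part of Hive.

def parse_row(line, columns, delimiter="\x01"):
    """Map fields to columns by position; missing trailing fields become None."""
    fields = line.split(delimiter)
    padded = fields + [None] * (len(columns) - len(fields))
    return dict(zip(columns, padded))

# A row written when the table had only two columns.
old_row = "\x01".join(["42", "click"])

# New column appended at the end: old values keep their positions,
# and the new column reads as None for old data.
appended = parse_row(old_row, ["user_id", "action", "referrer"])

# New column inserted in the middle: old values land under the wrong names.
inserted = parse_row(old_row, ["user_id", "referrer", "action"])

print(appended)
print(inserted)
```

In the appended case the old row still lines up with the first two columns and only the new one is empty. In the inserted case the value that was written as action is read back as referrer, which is why old files would need to be rewritten for that kind of change.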
