Posted to user@pig.apache.org by Kevin Weil <ke...@gmail.com> on 2008/10/08 21:00:16 UTC
BinStorage and versioning
Hi,
Say that I write out a tuple with three fields using BinStorage. And then
in a couple months, I add a parameter, so now I write out a tuple with a new
fourth field. If I'm loading a directory containing the files that have
both of these tuples, and I say
a = LOAD 'my_directory' using BinStorage() as (f1, f2, f3, f4)
what will happen when BinStorage tries to load the first tuple with only
three fields? Will f4 just be NULL?
Thanks,
Kevin
Re: BinStorage and versioning
Posted by Kevin Weil <ke...@gmail.com>.
Just tested -- it does not work in the release, but in the types branch it
does. Thanks! Here is the test I ran:
> cat f3.txt
a b c
> cat f4.txt
A B C D
a = load 'f3.txt' using PigStorage(' ') as (f1, f2, f3);
b = load 'f4.txt' using PigStorage(' ') as (F1, F2, F3, F4);
both = union a, b;
store both into 'fields.bz2' using BinStorage();
c = load 'fields.bz2' using BinStorage() as (p1, p2, p3, p4);
dump c;
(a, b, c)
(A, B, C, D)
k = foreach c generate p4;
dump k;
(D)
(NULL)
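For readers who want the semantics spelled out, here is a toy Python model of what the test above shows. It is only a sketch, not Pig's implementation: the helper name pad_to_schema is made up for illustration, and it assumes the loader simply pads a short tuple with nulls up to the width of the declared schema. Rows that are already as wide as the schema (or wider) are returned unchanged, since the thread does not cover that case.

```python
def pad_to_schema(row, width):
    """Pad a short row with None so it has `width` fields.

    Hypothetical helper: models the "missing data is replaced with null"
    behavior described in this thread, nothing more.
    """
    row = tuple(row)
    missing = width - len(row)
    # max(..., 0) leaves rows that are already wide enough untouched.
    return row + (None,) * max(missing, 0)


old_row = ("a", "b", "c")        # written before the fourth field existed
new_row = ("A", "B", "C", "D")   # written after the fourth field was added

# Equivalent in spirit to: load ... as (p1, p2, p3, p4)
loaded = [pad_to_schema(r, 4) for r in (old_row, new_row)]

# Equivalent in spirit to: foreach c generate p4
p4 = [r[3] for r in loaded]
print(p4)
```

The old three-field row yields None for p4 and the new row yields 'D', matching the NULL and D rows in the dump above.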
On Wed, Oct 8, 2008 at 12:59 PM, Olga Natkovich <ol...@yahoo-inc.com> wrote:
> It should work just fine; the missing data will be replaced with null at
> least in the types branch.
>
> Olga
RE: BinStorage and versioning
Posted by Olga Natkovich <ol...@yahoo-inc.com>.
It should work just fine; the missing data will be replaced with null at
least in the types branch.
Olga