Posted to user@pig.apache.org by Mohit Anchlia <mo...@gmail.com> on 2013/03/07 01:58:33 UTC

Tuples in UDF and null

If I define and set a tuple like this:

Tuple t1 = mTupleFactory.newTuple(2);
t1.set(0, "Hello");
t1.set(1, null);

and have a schema like:

b:bag{t:tuple(a:chararray, b:chararray)}

and then in the Pig script I do:

page = foreach B generate b;

what is the expected outcome? Would "generate" convert the null into the
literal string 'NULL', or does it skip over the null?

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
Yes, my mistake. I meant to FLATTEN it and then reference the fields directly.
I'll look at FILTER. What I really need is a way to filter rows that have a
UUID followed only by \t (the delimiter) or \n.
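
One possible way to express such a filter in Pig Latin, as a sketch only. It assumes the UUIDs are in the canonical 8-4-4-4-12 hexadecimal form and that each input line is read whole with TextLoader; the path 'input.tsv' and the relation names are placeholders, not anything from this thread. MATCHES takes a Java regular expression and has to match the entire line, and since TextLoader already splits on newlines, "UUID followed by \n" becomes "UUID followed by nothing".

```pig
-- 'input.tsv' is a placeholder path; TextLoader reads each line as one chararray.
raw = LOAD 'input.tsv' USING TextLoader() AS (line:chararray);

-- Rows that consist of a UUID followed by nothing but tab delimiters.
uuid_only = FILTER raw BY line MATCHES '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\t*';

-- The complement: rows that carry some data after the UUID.
with_data = FILTER raw BY NOT (line MATCHES '[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}\t*');

DUMP with_data;
```

Which of the two relations is wanted depends on whether the UUID-only rows should be kept or dropped.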

On Thu, Mar 7, 2013 at 12:11 PM, Harsha <ha...@defun.org> wrote:

> Mohit,
>    A = LOAD '/user/apuser/test/data1' AS b:bag{
> In this statement you are naming your data bag b.
> If you want to refer to the values inside the data bag, try b.a or b.b.
> The sample data I gave you was just something random. If you are trying to
> skip over nulls, you can do so by using FILTER.
> Take a look at http://pig.apache.org/docs/r0.11.0/
> -Harsha

Re: Tuples in UDF and null

Posted by Harsha <ha...@defun.org>.
Mohit,
   A = LOAD '/user/apuser/test/data1' AS b:bag{
In this statement you are naming your data bag b.
If you want to refer to the values inside the data bag, try b.a or b.b.
The sample data I gave you was just something random. If you are trying to skip over nulls,
you can do so by using FILTER.
Take a look at http://pig.apache.org/docs/r0.11.0/
-Harsha


-- 
Harsha
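
A minimal sketch of the FLATTEN-then-FILTER approach mentioned above, assuming the same file and schema as in the grunt session from this thread. The relation names flat and non_null are only illustrative, and the script was not re-run against the original data.

```pig
-- One top-level column: a bag named b holding tuples (a, b).
A = LOAD '/user/apuser/test/data1' AS (b:bag{t:tuple(a:chararray, b:chararray)});

-- FLATTEN un-nests the bag so that a and b become top-level fields.
flat = FOREACH A GENERATE FLATTEN(b) AS (a:chararray, b:chararray);

-- Drop the rows whose b field is null.
non_null = FILTER flat BY b IS NOT NULL;

DUMP non_null;
```

With the four sample rows from the thread, only the (5,10) row has a non-null b, so that is the only row that should survive the filter.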


On Thursday, March 7, 2013 at 11:58 AM, Mohit Anchlia wrote:

> So I did this. I took your example, put it in a file, and ran some Pig
> commands through grunt, but I am getting the same results from the bag and
> from generating the tuple. I might be doing something wrong here.
>
> grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray, b:chararray)};
> grunt> dump A;
> 2013-03-07 14:55:25,125 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> ({(1,)})
> ({(3,)})
> ({(5,10)})
> ({(7,)})
>
> grunt> b = foreach A generate b;
> grunt> dump b;
> 2013-03-07 14:57:59,509 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
> ({(1,)})
> ({(3,)})
> ({(5,10)})
> ({(7,)})
> grunt>
>
> I get the same output again.

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
So I did this. I took your example, put it in a file, and ran some Pig
commands through grunt, but I am getting the same results from the bag and
from generating the tuple. I might be doing something wrong here.

grunt> A = LOAD '/user/apuser/test/data1' AS b:bag{t:tuple(a:chararray, b:chararray)};
grunt> dump A;
2013-03-07 14:55:25,125 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
({(1,)})
({(3,)})
({(5,10)})
({(7,)})

grunt> b = foreach A generate b;
grunt> dump b;
2013-03-07 14:57:59,509 [main] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
({(1,)})
({(3,)})
({(5,10)})
({(7,)})
grunt>

I get the same output again.


On Thu, Mar 7, 2013 at 11:40 AM, Mohit Anchlia <mo...@gmail.com> wrote:

> Good suggestion. Let me try that.

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
Good suggestion. Let me try that.

On Thu, Mar 7, 2013 at 11:27 AM, Harsha <ha...@defun.org> wrote:

> It will be easier if you have some sample data and run it through the grunt
> shell.
> Let's say you have a dataset like this:
> ({(1,)})
> ({(3,)})
> ({(5,10)})
> ({(7,)})
>
> Some rows have a null for "b" and some rows have a value for "b".
> If you do a "generate" of b on the above, Pig runs through each row
> and tries to fetch the value of b; where there is none it emits (),
> something like this:
>
> ({()})
> ({()})
> ({(10)})
> ({()})

Re: Tuples in UDF and null

Posted by Harsha <ha...@defun.org>.
It will be easier if you have some sample data and run it through the grunt shell.
Let's say you have a dataset like this:
({(1,)})
({(3,)})
({(5,10)})
({(7,)})

Some rows have a null for "b" and some rows have a value for "b".
If you do a "generate" of b on the above, Pig runs through each row
and tries to fetch the value of b; where there is none it emits (),
something like this:

({()})
({()})
({(10)})
({()})




-- 
Harsha


On Thursday, March 7, 2013 at 11:15 AM, Mohit Anchlia wrote:

> Sorry, yes, my question was about accessing b, not $1. What is the effect of
> writing an empty () to a file? Say I did "store b into temp": should I
> expect a line, or does nothing get written to the file at all?

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
Sorry, yes, my question was about accessing b, not $1. What is the effect of
writing an empty () to a file? Say I did "store b into temp": should I
expect a line, or does nothing get written to the file at all?

On Thu, Mar 7, 2013 at 10:53 AM, Harsha <ha...@defun.org> wrote:

> From your schema, b:bag{t:tuple(a:chararray, b:chararray)}, your tuple is
> inside a bag. So if on the next line you try to access it through $1, Pig
> will throw an error about a non-existent column.
> But if your question is about accessing b, then it will print an empty () if
> there is no value present (as you are setting it to null).

Re: Tuples in UDF and null

Posted by Harsha <ha...@defun.org>.
From your schema, b:bag{t:tuple(a:chararray, b:chararray)}, your tuple is inside a bag.
So if on the next line you try to access it through $1, Pig will
throw an error about a non-existent column.
But if your question is about accessing b, then it will print an empty () if there is no value present (as you are setting it to null).

-- 
Harsha
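
The access patterns described above can be sketched in Pig Latin roughly as follows. This is only an illustration: the path is the one used in the grunt session elsewhere in this thread, the relation names (whole_bag, by_name, by_pos) are made up, and the script was not re-run against the original data.

```pig
-- One top-level column: a bag named b holding tuples (a, b).
A = LOAD '/user/apuser/test/data1' AS (b:bag{t:tuple(a:chararray, b:chararray)});

-- $0 (or b) is the whole bag, so this returns each row unchanged.
whole_bag = FOREACH A GENERATE $0;

-- Fields inside the bag are projected through the bag, by name or by position.
by_name = FOREACH A GENERATE b.b;
by_pos  = FOREACH A GENERATE b.$1;

-- A has no second top-level column, so a bare $1 would be rejected:
-- bad = FOREACH A GENERATE $1;

DUMP by_name;
```

Projecting b.b yields a bag of one-field tuples per row, and a null field shows up as () in the dump rather than as the string 'NULL'.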


On Thursday, March 7, 2013 at 10:35 AM, Mohit Anchlia wrote:

> Thanks! Does "generate" skip over that? If I did b = foreach B generate $1,
> what would be the expected outcome of alias "b"?

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
Thanks! Does "generate" skip over that? If I did b = foreach B generate $1,
what would be the expected outcome of alias "b"?

On Thu, Mar 7, 2013 at 10:31 AM, Harsha <ha...@defun.org> wrote:

> Hi Mohit,
> It won't convert it into the string literal 'NULL'. Since it's a tuple,
> you'll see results like
> ('Hello',)

Re: Tuples in UDF and null

Posted by Harsha <ha...@defun.org>.
Hi Mohit,
It won't convert it into the string literal 'NULL'. Since it's a tuple, you'll see results like
('Hello',)

-- 
Harsha


On Thursday, March 7, 2013 at 10:10 AM, Mohit Anchlia wrote:

> Any help would be appreciated. I'll also write something shortly and see
> what happens.

Re: Tuples in UDF and null

Posted by Mohit Anchlia <mo...@gmail.com>.
Any help would be appreciated. I'll also write something shortly and see
what happens.

On Wed, Mar 6, 2013 at 4:58 PM, Mohit Anchlia <mo...@gmail.com> wrote:

> If I define and set a tuple like this:
>
> Tuple t1 = mTupleFactory.newTuple(2);
> t1.set(0, "Hello");
> t1.set(1, null);
>
> and have a schema like:
>
> b:bag{t:tuple(a:chararray, b:chararray)}
>
> and then in the Pig script I do:
>
> page = foreach B generate b;
>
> what is the expected outcome? Would "generate" convert the null into the
> literal string 'NULL', or does it skip over the null?