Posted to user@pig.apache.org by shan s <my...@gmail.com> on 2012/04/10 21:03:01 UTC

Type mismatch in key from map

I am currently getting “Type mismatch in key from map: expected
org.apache.pig.impl.io.NullableBytesWritable, recieved
org.apache.pig.impl.io.NullableText”.


I looked up PIG-919 and the related comments, but could not understand the
reason for this problem or the workaround.

Could you please explain this further?



I am getting this even before my GROUP, when I do my 3-way JOIN.

A1 = JOIN AA BY rid, BB BY rid;
A2 = JOIN A1 BY BB::cid, CC BY cid;
DESCRIBE A2;
A3 = FOREACH A2 GENERATE FLATTEN((TOTUPLE(BB::rid)));
DESCRIBE A3;
DUMP A3;

The DESCRIBE output looks like this:



A2: {A1::AA::rid: bytearray,A1::AA::roname: bytearray,A1::AA::asid:
bytearray,A1::AA::asname: bytearray,A1::BB::rid: bytearray,A1::BB::roname:
bytearray,A1::BB::cid: bytearray,A1::BB::csname: bytearray,CC::cid:
bytearray,CC::csname: bytearray,CC::chid: bytearray,CC::chname: bytearray}

A3: {org.apache.pig.builtin.totuple_A1::BB::rid_3::A1::BB::rid: bytearray}





If the map is the problem, I tried to convert it to a tuple (for A3, above),
but it still does not work. In fact, DESCRIBE still shows A3 as a map (with
a {}, I guess). Why is that?



Appreciate your help! Thanks!!

Re: Type mismatch in key from map

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
No, you can join on bytearrays. What you can't do is have Pig
think you are joining on bytearrays when you are actually using
strings under the covers -- that's what causes the error you are
seeing.
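
To illustrate the point above: the poster reports in this thread that an explicit cast to chararray made the job work. The following is a minimal sketch of that kind of workaround, not the poster's actual script. The field names come from the DESCRIBE output in the original question; the aliases AA_t, BB_t, and CC_t are hypothetical, and the sketch assumes AA, BB, and CC hold strings that Pig's schema reports as bytearray.

```pig
-- Hypothetical aliases (AA_t, BB_t, CC_t); field names taken from the
-- DESCRIBE output in the original question.
-- Assumption: the join keys are really strings, so cast them explicitly
-- before the JOIN so the declared key type matches the actual values.
AA_t = FOREACH AA GENERATE (chararray) rid AS rid, roname, asid, asname;
BB_t = FOREACH BB GENERATE (chararray) rid AS rid, roname,
                           (chararray) cid AS cid, csname;
CC_t = FOREACH CC GENERATE (chararray) cid AS cid, csname, chid, chname;

A1 = JOIN AA_t BY rid, BB_t BY rid;
A2 = JOIN A1 BY BB_t::cid, CC_t BY cid;

-- The join keys should now be reported as chararray rather than bytearray.
DESCRIBE A2;
```

The idea is that casting before the JOIN makes the key type Pig declares for the map output agree with the type of the values that are actually emitted.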


Re: Type mismatch in key from map

Posted by Jon Coveney <jc...@gmail.com>.
For scalar projection to work, you have to explicitly cast the one-line relation to a scalar value. This is to make sure that it is clear what is going on in the script: accidentally projecting a relation (usually in a GROUP BY situation) is common, and we want the parser to fail instead of doing an unexpected scalar projection.
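
As a rough illustration of the scalar projection described above, here is a sketch with hypothetical file, relation, and field names (input.txt, data, row_count, n, total). It assumes row_count contains exactly one tuple.

```pig
-- Hypothetical input and names, for illustration only.
data      = LOAD 'input.txt' AS (name:chararray, val:long);
grouped   = GROUP data ALL;

-- row_count is a relation holding a single tuple with a single field.
row_count = FOREACH grouped GENERATE COUNT(data) AS n;

-- The explicit (long) cast marks row_count.n as an intended scalar
-- projection rather than an accidental reference to a relation.
with_total = FOREACH data GENERATE name, val, (long) row_count.n AS total;
```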


Re: Type mismatch in key from map

Posted by shan s <my...@gmail.com>.
Hi Dmitriy,
It works after explicitly casting to chararray.
Does that mean a bytearray field can't be used in a JOIN, or is there more
to it?
How can this behaviour be explained?

Thanks!

Re: Type mismatch in key from map

Posted by shan s <my...@gmail.com>.
When I load my data, I define all fields as chararray in the schema. I
can afford to treat everything as chararray.

rid could be chararray (I have no real expectations on my side; it's a GUID
coming from a database).
AA and BB do come from a UDF; the UDF does some string processing and
returns substrings as tuples.
Also, when I tried to convert rid to chararray in A3, I got an error,
"can't convert to chararray.", without further explanation.

Thank you.

Re: Type mismatch in key from map

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
What type do you expect rid to be?
Where did AA and BB come from?

D
