Posted to dev@pig.apache.org by deepak kumar v <de...@gmail.com> on 2011/03/17 07:24:33 UTC

Unable to access Map within a tuple.

Hi,
Below is the list of tuples generated after flattening a bag.

(day, age, name, address,  ['k1#v1','k2#v2']),
(12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
(12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])

process = foreach inputs generate com.yahoo.peblpig.udf.InvokeProcess($0);
Here $0 somehow gets only (day, age, name, address) and the map is skipped.
*How can I access the map?*
With $1 I get "Out of bound access. Trying to access non-existent column: 1.
Schema {bytearray} has 1 column(s)", and
$0.$1 throws java.lang.ClassCastException: java.lang.String cannot be cast
to org.apache.pig.data.Tuple
at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)

Also,
tuples = foreach flattenedTuples generate $0
generates
(day, age, name, address),
(12/2,22,deepak,newyork),
(12/3,22,deepak,newjersy)

After flattening, if I dump, I see the map in the resulting tuples, but $0,
instead of referring to the entire tuple, refers only to the data part (map skipped).
Regards,
Deepak

Re: Unable to access Map within a tuple.

Posted by deepak kumar v <de...@gmail.com>.
Passing * to the UDF did the trick.
Thank you.

Deepak

On Tue, Mar 29, 2011 at 11:42 AM, deepak kumar v <de...@gmail.com> wrote:

> The problem is schema is lost after flattening.
>
> inputBag has schema of {outTuple : (channelMap: map [ ] ) }
>
> However after flattening the schema is
>
> flatTuples : {data : chararray}
>
> How can we specify schema for a call to flatten.

Re: Unable to access Map within a tuple.

Posted by deepak kumar v <de...@gmail.com>.
The problem is that the schema is lost after flattening.

inputBag has a schema of {outTuple : (channelMap: map [ ] ) }

However, after flattening the schema is

flatTuples : {data : chararray}

How can we specify a schema for a call to flatten?
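
One possible answer, sketched under assumptions: Pig Latin lets a FLATTEN
expression carry an AS clause that names and types the resulting fields. The
field names and types below are guesses based on the sample rows and on the
types mentioned elsewhere in this thread (four chararray items and one map);
they are not taken from the actual script. If the relation feeding inputBag
already declares a conflicting schema, Pig may reject the AS clause, and the
declaration upstream would need fixing instead.

```pig
-- Sketch only: the names and types in the AS clause are assumptions
-- derived from the sample data, not from the real script.
flatTuples = FOREACH inputBag GENERATE
    FLATTEN($0) AS (day:chararray, age:chararray, name:chararray,
                    address:chararray, channelMap:map[]);
```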

On Wed, Mar 23, 2011 at 11:35 PM, Daniel Dai <ji...@yahoo-inc.com> wrote:

>  2) should be the right approach, more concisely, you can say:
> processed = foreach flatTuple generate com.myUDF.UDF(*)
>
> It seems to be something wrong with your schema propagation. How do you
> generate inputBag? If it is generated by UDF, make sure your UDF declare
> proper output schema. You can check the schema by using "describe".
>
> Daniel

Re: Unable to access Map within a tuple.

Posted by Daniel Dai <ji...@yahoo-inc.com>.
Option 2) should be the right approach. More concisely, you can say:
processed = foreach flatTuples generate com.myUDF.UDF(*);

Something seems to be wrong with your schema propagation. How do you
generate inputBag? If it is generated by a UDF, make sure the UDF declares
a proper output schema. You can check the schema by using "describe".
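
The check suggested above might look like the following in the Grunt shell.
This is a minimal sketch: the aliases follow the ones used in this thread,
com.myUDF.UDF is the placeholder name from the question, and the schemas
described in the comments are assumptions based on the sample rows.

```pig
-- Inspect the schema Pig has recorded before and after flattening.
DESCRIBE inputBag;    -- ideally a bag whose tuples list all five fields
flatTuples = FOREACH inputBag GENERATE FLATTEN($0);
DESCRIBE flatTuples;  -- a single column here suggests the schema was lost upstream

-- '*' hands every field of the current record to the UDF as its input tuple.
processed = FOREACH flatTuples GENERATE com.myUDF.UDF(*);
```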

Daniel

On 03/22/2011 09:30 PM, deepak kumar v wrote:
> Hi Daniel,
>
> I have a bag of tuples
> inputBag =
> { (day, age, name, address,  ['k1#v1','k2#v2']),
> (12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
> (12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])
> }
>
> I need to invoke a UDF for each tuple, so i have to flatten the bag 
> which i do as
>
> flatTuples = foreach inputBag generate FLATTEN($0)
>
> Now i get a list of tuples
> (day, age, name, address,  ['k1#v1','k2#v2']),
> (12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
> (12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])
>
> I tried few options to invoke my UDF for each tuple
>
> 1)
> processed = foreach flatTuple generate com.myUDF.UDF($0).
> I am expecting $0 will point to entire tuple  (day, age, name, 
> address,  ['k1#v1','k2#v2']), but
> $0 within my UDF returns only (day, age, name, address), *For some 
> unknown reason the map is not passed into UDF.*
>
> 2)
> As option 1 did not work, i assumed that $0 points to item0 , $1 
> points to item1 and so on of the flattened tuples. As a result
> processed = foreach flatTuple generate com.myUDF.UDF($0, $1, $2, $3, $4).
> would pass each of the item of input tuple into the UDF, But this 
> threw the following error
> $1 i get "Out of bound access. Trying to access non-existent column: 1.
> Schema {bytearray} has 1 column(s)",
>
> 3)
> As above option did not worked i tried
> processed = foreach flatTuple generate com.myUDF.UDF($0.$0, 
> $0.$1, $0.$2, $0.$3, $0.$4).
> Assuming $0 would point to input tuple and each of $0, $1... would 
> point to individual items in the the tuple
> But this threw the following error
> $0.$1 throws java.lang.ClassCastException: java.lang.String cannot be cast
> to org.apache.pig.data.Tuple
> at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
>
> Regards,
> Deepak
>
>
> item 0-3 are of type char array and item4 is a map.
>
> I iterate through these tuples


Re: Unable to access Map within a tuple.

Posted by deepak kumar v <de...@gmail.com>.
Hi Daniel,

I have a bag of tuples
inputBag =
{ (day, age, name, address,  ['k1#v1','k2#v2']),
(12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
(12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])
}

I need to invoke a UDF for each tuple, so I have to flatten the bag, which I
do as follows:

flatTuples = foreach inputBag generate FLATTEN($0)

Now I get a list of tuples:
(day, age, name, address,  ['k1#v1','k2#v2']),
(12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
(12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])

I tried a few options to invoke my UDF for each tuple.

1)
processed = foreach flatTuples generate com.myUDF.UDF($0);
I expected $0 to point to the entire tuple (day, age, name, address,
['k1#v1','k2#v2']), but
$0 within my UDF returns only (day, age, name, address). *For some unknown
reason the map is not passed into the UDF.*

2)
As option 1 did not work, I assumed that $0 points to item0, $1 points to
item1, and so on for the flattened tuples, so that
processed = foreach flatTuples generate com.myUDF.UDF($0, $1, $2, $3, $4);
would pass each item of the input tuple into the UDF. But this threw the
following error:
"Out of bound access. Trying to access non-existent column: 1.
Schema {bytearray} has 1 column(s)"

3)
As the above option did not work, I tried
processed = foreach flatTuples generate com.myUDF.UDF($0.$0,
$0.$1, $0.$2, $0.$3, $0.$4);
assuming the outer $0 would point to the input tuple and each nested $0, $1, ...
would point to the individual items in the tuple.
But here $0.$1 throws
java.lang.ClassCastException: java.lang.String cannot be cast
to org.apache.pig.data.Tuple
at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)

Regards,
Deepak


Items 0-3 are of type chararray and item 4 is a map.

I iterate through these tuples.
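
Given the types stated above, a single map value can be pulled out in Pig
Latin with the # operator, provided the flattened relation really exposes
five columns. The sketch below assumes that; the key 'k1' comes from the
sample data and the output names are made up for illustration.

```pig
-- Assumes flatTuples has five columns, with the map in position 4.
-- 'name' and 'k1Value' are illustrative output names only.
lookups = FOREACH flatTuples GENERATE $2 AS name, $4#'k1' AS k1Value;
```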

On Fri, Mar 18, 2011 at 8:48 AM, Daniel Dai <ji...@yahoo-inc.com> wrote:

> Hi, Deepak,
> Can you be more specific? I did some simple test and cannot reproduce. What
> is your query? UDF?
>
> Daniel

Re: Unable to access Map within a tuple.

Posted by Daniel Dai <ji...@yahoo-inc.com>.
Hi, Deepak,
Can you be more specific? I ran a simple test and could not reproduce this.
What is your query? And your UDF?

Daniel

On 03/16/2011 11:24 PM, deepak kumar v wrote:
> Hi,
> Below are list of tuples generated after flattening a bag .
>
> (day, age, name, address,  ['k1#v1','k2#v2']),
> (12/2,22,deepak,newyork,  ['k1#v1','k2#v2']),
> (12/3,22,deepak,newjersy,  ['k1#v1','k2#v2'])
>
> process = foreach inputs generate com.yahoo.peblpig.udf.InvokeProcess($0);
> Here $0 some how gets only (day, age, name, address) and the map is skipped.
> *How can i access the map? *
> With
> $1 i get "Out of bound access. Trying to access non-existent column: 1.
> Schema {bytearray} has 1 column(s)",
> $0.$1 throws java.lang.ClassCastException: java.lang.String cannot be cast
> to org.apache.pig.data.Tuple
> at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POProject.getNext(POProject.java:389)
>
> Also,
> With
> tuples = foreach flattenedTuples generate $0
> generates
> (day, age, name, address),
> (12/2,22,deepak,newyork),
> (12/3,22,deepak,newjersy)
>
> After flatenning if i dump, i see the map in the resultant tuples, but $0
> instead referring to entire tuple, referes only to data part (map skipped)
> Regards,
> Deepak

