Posted to user@pig.apache.org by ey-chih chow <ey...@gmail.com> on 2013/11/27 23:14:13 UTC
Pig syntax to access fields of records in an array
Hi,
We have an Avro file with a field that is an array of tuples, as follows:
cam:bag{ARRAY_ELEM:tuple(BIDCOUNT: int, ...
I tried to access BIDCOUNT with 'cam.BIDCOUNT', but it does not work. Does
anybody know how to access BIDCOUNT? Thanks.
Ey-Chih Chow
Re: Pig syntax to access fields of records in an array
Posted by ey-chih chow <ey...@gmail.com>.
Basically, I would like to know: if a field f of an Avro record is of type
array, how can I access the first element of that field in Pig?
Thanks.
Ey-Chih Chow
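A possible approach to the "first element" question, sketched under assumptions: an Avro array is loaded as a Pig bag, and bags are unordered, so there is no index syntax for the first element. One option is LIMIT inside a nested FOREACH. The relation name data and the field names id, f and x are hypothetical, and which element LIMIT keeps is not guaranteed unless the bag is ordered first.

```pig
-- Assumed schema (hypothetical names):
--   data: {id: chararray, f: bag{ARRAY_ELEM: tuple(x: int)}}
first_elems = FOREACH data {
    one = LIMIT f 1;                         -- keep at most one tuple of the bag
    GENERATE id, FLATTEN(one.x) AS first_x;  -- unwrap it into a plain int column
};
```

Note that FLATTEN produces no output row for a record whose bag is empty, so such records drop out of first_elems.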
Re: Pig syntax to access fields of records in an array
Posted by ey-chih chow <ey...@gmail.com>.
Sorry, in the previous post, the Avro schema of the field should be:
{
  "name" : "com",
  "type" : {
    "type" : "array",
    "items" : {
      "type" : "record",
      "name" : "campaignRecord",
      "doc" : "RTB json logs flattened.",
      "fields" : [ {
        "name" : "BIDCOUNT",
        "type" : "int"
      }]
    }
  }
}
Thanks.
Ey-Chih Chow
Re: Pig syntax to access fields of records in an array
Posted by ey-chih chow <ey...@gmail.com>.
I have a Pig script. The script begins with a load statement that loads
data from an Avro file. The schema of the data in the file has a field com
that is defined as follows:
{
"name" : "com",
"type" : {
"type" : "array",
"items" : {
"type" : "record",
"name" : "campaignRecord",
"doc" : "RTB json logs flattened.",
"fields" : [ {
"name" : "BIDCOUNT",
"type" : "int"
}}
}
}
After the load statement, there is a group-by statement that does a group
by on some other fields. After the group-by, we have the following
statement:
FOREACH gstmt GENERATE group AS key, SUM(RTBALLLOGS.com.BIDCOUNT) AS BIDCOUNT;
This statement fails with the following message when I debug the script in
Eclipse:
Cannot find field BIDCOUNT in com:bag{ARRAY_ELEM:tuple(BIDCOUNT;int))
Thanks.
Ey-Chih
Re: Pig syntax to access fields of records in an array
Posted by Ruslan Al-Fakikh <me...@gmail.com>.
I think your expression ends up with a bag with just that column. Can you
give the full context where it is not working?
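To illustrate the point about bags, here is a minimal sketch, not a confirmed fix for this thread. Projecting a field out of a bag (com.BIDCOUNT) yields another bag of one-field tuples, and after a GROUP the expression RTBALLLOGS.com is itself a bag of tuples that each hold a com bag, which is likely why BIDCOUNT cannot be found one level further down. The grouping field some_key is hypothetical; RTBALLLOGS, com and BIDCOUNT come from the question.

```pig
-- Assumed schema (some_key is a hypothetical grouping field):
--   RTBALLLOGS: {some_key: chararray, com: bag{ARRAY_ELEM: tuple(BIDCOUNT: int)}}

-- Option 1: un-nest the array before grouping, one row per array element.
flat    = FOREACH RTBALLLOGS GENERATE some_key, FLATTEN(com.BIDCOUNT) AS BIDCOUNT;
grouped = GROUP flat BY some_key;
totals  = FOREACH grouped GENERATE group AS key, SUM(flat.BIDCOUNT) AS BIDCOUNT;

-- Option 2: sum inside each record first, then sum the per-record totals.
per_record = FOREACH RTBALLLOGS GENERATE some_key, SUM(com.BIDCOUNT) AS record_total;
grouped2   = GROUP per_record BY some_key;
totals2    = FOREACH grouped2 GENERATE group AS key, SUM(per_record.record_total) AS BIDCOUNT;
```

With Option 1, records whose com array is empty produce no rows after FLATTEN, so a key that only has empty arrays is missing from totals. Option 2 keeps those keys, with a null total.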