Posted to user@pig.apache.org by Guy Bayes <fa...@gmail.com> on 2009/07/19 06:36:48 UTC

oddness with large numeric map[] values

Hello all, new to this list, new to Pig, running into some odd behavior with
map[] data types.

Please forgive me if these are known issues or problems with my syntax.

What am I doing wrong here? Am I missing a cast somewhere?

This works:

grunt> cat data2
[apache#2000000000000000000000zzz,foo#foo1]
grunt> T1 = load 'data2' as (f1:map[]);
grunt> dump T1;
2009-07-18 21:35:11,871 [Thread-11] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-07-18 21:35:21,889 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-07-18 21:35:21,889 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
([foo#foo1,apache#2000000000000000000000zzz])

This returns only an empty tuple:

grunt> cat data
[apache#2000000000000000000000,foo#foo1]
grunt> T1 = load 'data' as (f1:map[]);
grunt> dump T1;
2009-07-18 21:36:09,873 [Thread-21] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-07-18 21:36:19,886 [main] WARN
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2009-07-18 21:36:19,886 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-07-18 21:36:19,886 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
()

Re: oddness with large numeric map[] values

Posted by Guy Bayes <fa...@gmail.com>.
Awesome, Santhosh, and thanks for the responsiveness!


-- 
you may be acquainted with the night
but i have seen the darkness in the day
and you must know it is a terrifying sight...

RE: oddness with large numeric map[] values

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
It's already captured as part of Pig-724 (https://issues.apache.org/jira/browse/PIG-724), which in turn is linked to Pig-880.

Thanks,
Santhosh 


Re: oddness with large numeric map[] values

Posted by "Fatal.error" <fa...@gmail.com>.
So should I leave this alone or file a separate bug report, you think?
Thanks
Guy




RE: oddness with large numeric map[] values

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
With Pig-880 (https://issues.apache.org/jira/browse/PIG-880), the value per key in the text data will be treated as bytearray. Once Pig-880 is committed, casts (implicit or explicit) will be required to interpret the data.

Thanks,
Santhosh
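
To illustrate the idea of keeping map values as raw bytes and interpreting them only on an explicit cast, here is a minimal Python sketch. It is not Pig's code: the helper parse_map is hypothetical, and it handles only the flat [key#value,key#value] layout shown in this thread (no nesting, no escaping).

```python
def parse_map(text):
    """Parse '[k1#v1,k2#v2]' into a dict whose values stay as raw bytes.

    Hypothetical helper for illustration only; it does not guess a type
    for any value, which is the behaviour described for Pig-880.
    """
    inner = text.strip()[1:-1]  # drop the surrounding square brackets
    result = {}
    for pair in inner.split(","):
        key, _, value = pair.partition("#")
        result[key] = value.encode("utf-8")  # keep the value uninterpreted
    return result


row = parse_map("[apache#2000000000000000000000,foo#foo1]")

# The caller decides how to interpret each value with an explicit cast.
as_string = row["apache"].decode("utf-8")
print(as_string)
```

Because no numeric type is inferred while loading, a digits-only value cannot fail a conversion at that stage; it fails only if the caller later asks for a numeric type it does not fit.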



RE: oddness with large numeric map[] values

Posted by zjffdu <zj...@gmail.com>.
+1, I am also curious to know about this.  I think it makes no sense if
there's no such feature.





Re: oddness with large numeric map[] values

Posted by Guy Bayes <fa...@gmail.com>.
Yup, the problem is that the actual data occasionally contains strings that
look like numbers.

Is there any way to tell Pig to always treat the value as a string?


RE: oddness with large numeric map[] values

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
I just re-read your email. The reason for the failure is as follows.

1. [apache#2000000000000000000000zzz,foo#foo1] - Here the value for the
key apache is treated as a string.

2. [apache#2000000000000000000000,foo#foo1] - Here the value for the key
apache is treated as an integer. Since 2000000000000000000000 is too big
to fit into an integer, the conversion failed and a null was inserted. Try
adding an L at the end of 2000000000000000000000, i.e.,
2000000000000000000000L

Santhosh 
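
The ranges involved can be checked with a small Python sketch. This is only a rough model of the range problem, not Pig's actual parsing code; the function name smallest_fitting_type is made up for illustration, and the limits are the usual signed 32-bit and 64-bit bounds.

```python
# Rough model of the range problem; this is not Pig's actual parsing code.
INT_MAX = 2**31 - 1    # largest signed 32-bit value
LONG_MAX = 2**63 - 1   # largest signed 64-bit value


def smallest_fitting_type(text):
    """Return 'int', 'long', or 'none' for a string of decimal digits."""
    value = int(text)  # Python integers are unbounded, so this never overflows
    if -INT_MAX - 1 <= value <= INT_MAX:
        return "int"
    if -LONG_MAX - 1 <= value <= LONG_MAX:
        return "long"
    return "none"


print(smallest_fitting_type("2000000000"))              # int
print(smallest_fitting_type("20000000000"))             # long
print(smallest_fitting_type("2000000000000000000000"))  # none
```

Note that the value from the example data is larger than the signed 64-bit maximum (9223372036854775807) as well, so a long suffix alone would presumably not be enough for this particular number.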

-----Original Message-----
From: Guy Bayes [mailto:fatal.error@gmail.com] 
Sent: Saturday, July 18, 2009 11:21 PM
To: pig-user@hadoop.apache.org
Subject: Re: oddness with large numeric map[] values

This is reproducible (at least for me) with any large number in the value
column of any map schema declaration. Tried it on Pig v0.2 and 0.3.

On Sat, Jul 18, 2009 at 11:09 PM, Santhosh Srinivasan
<sm...@yahoo-inc.com> wrote:

> In the second case, there is a warning message:
>
> - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
>
> The conversion of the text data into a map failed for some reason. If
> you examine the system logs on the job tracker UI, you should probably
> see why the conversion failed. You can also achieve the same result by
> turning warning aggregation off with the -w option.
>
> Santhosh
>
>
> -----Original Message-----
> From: Guy Bayes [mailto:fatal.error@gmail.com]
> Sent: Saturday, July 18, 2009 9:37 PM
> To: pig-user@hadoop.apache.org
> Subject: oddness with large numeric map[] values
>
> Hello all, new to this list, new to pig, running into some odd
behavior
> with
> map[] data types.
>
> Please forgive me if these are known issues or problems with my
syntax,
>
> What am i doing wrong here? missing some cast somewhere?
>
> This works:
>
> grunt> cat data2
> [apache#2000000000000000000000zzz,foo#foo1]
> grunt> T1 = load 'data2' as (f1:map[]);
> grunt> dump T1;
> 2009-07-18 21:35:11,871 [Thread-11] WARN
> org.apache.hadoop.mapred.JobClient
> - Use GenericOptionsParser for parsing the arguments. Applications
> should
> implement Tool for the same.
> 2009-07-18 21:35:21,889 [main] INFO
>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLa
> uncher
> - 100% complete
> 2009-07-18 21:35:21,889 [main] INFO
>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLa
> uncher
> - Success!
> ([foo#foo1,apache#2000000000000000000000zzz])
>
> This doesn't return anything.
>
> grunt> cat data
> [apache#2000000000000000000000,foo#foo1]
> grunt> T1 = load 'data' as (f1:map[]);
> grunt> dump T1;
> 2009-07-18 21:36:09,873 [Thread-21] WARN
> org.apache.hadoop.mapred.JobClient
> - Use GenericOptionsParser for parsing the arguments. Applications
> should
> implement Tool for the same.
> 2009-07-18 21:36:19,886 [main] WARN
>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLa
> uncher
> - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1
time(s).
> 2009-07-18 21:36:19,886 [main] INFO
>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLa
> uncher
> - 100% complete
> 2009-07-18 21:36:19,886 [main] INFO
>
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLa
> uncher
> - Success!
> ()
>



-- 
you may be acquainted with the night
but i have seen the darkness in the day
and you must know it is a terrifying sight...

Re: oddness with large numeric map[] values

Posted by Guy Bayes <fa...@gmail.com>.
This is reproducible (at least for me) with any large number in the value
column of any map schema declaration. Tried it on Pig v0.2 and 0.3.

On Sat, Jul 18, 2009 at 11:09 PM, Santhosh Srinivasan <sm...@yahoo-inc.com> wrote:

> In the second case, there is a warning message:
>
> - Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
>
> The conversion of the text data into a map failed for some reason. If
> you examine the system logs on the job tracker UI, you should probably
> see why the conversion failed. You can also achieve the same result by
> turning warning aggregation off with the -w option.
>
> Santhosh

RE: oddness with large numeric map[] values

Posted by Santhosh Srinivasan <sm...@yahoo-inc.com>.
In the second case, there is a warning message:

- Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).

The conversion of the text data into a map failed for some reason. If
you examine the system logs on the job tracker UI, you should probably
see why the conversion failed. You can also achieve the same result by
turning warning aggregation off with the -w option.

Santhosh
 

-----Original Message-----
From: Guy Bayes [mailto:fatal.error@gmail.com] 
Sent: Saturday, July 18, 2009 9:37 PM
To: pig-user@hadoop.apache.org
Subject: oddness with large numeric map[] values

Hello all, new to this list, new to pig, running into some odd behavior with
map[] data types.

Please forgive me if these are known issues or problems with my syntax,

What am i doing wrong here? missing some cast somewhere?

This works:

grunt> cat data2
[apache#2000000000000000000000zzz,foo#foo1]
grunt> T1 = load 'data2' as (f1:map[]);
grunt> dump T1;
2009-07-18 21:35:11,871 [Thread-11] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-07-18 21:35:21,889 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-07-18 21:35:21,889 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
([foo#foo1,apache#2000000000000000000000zzz])

This doesn't return anything.

grunt> cat data
[apache#2000000000000000000000,foo#foo1]
grunt> T1 = load 'data' as (f1:map[]);
grunt> dump T1;
2009-07-18 21:36:09,873 [Thread-21] WARN  org.apache.hadoop.mapred.JobClient
- Use GenericOptionsParser for parsing the arguments. Applications should
implement Tool for the same.
2009-07-18 21:36:19,886 [main] WARN
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Encountered Warning FIELD_DISCARDED_TYPE_CONVERSION_FAILED 1 time(s).
2009-07-18 21:36:19,886 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- 100% complete
2009-07-18 21:36:19,886 [main] INFO
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher
- Success!
()