Posted to user@pig.apache.org by Lakshminarayana Motamarri <na...@gmail.com> on 2011/06/16 12:36:11 UTC

ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Hi all,

I am receiving the following exception:
org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught
error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught
exception processing input row  [null]]
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: Caught exception processing input row
[null]
    at
org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
    at
org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
    ... 10 more
Caused by: java.lang.NullPointerException
    ... 13 more

My Code:

FFW2 = LOAD 'final_free_w2.txt';
FFW3 = LOAD 'final_free_w3.txt';
FFW2_RankG_RankCate = FOREACH FFW2 GENERATE $0, $4, $3;
FFW3_RankG_RankCate = FOREACH FFW3 GENERATE $0, $4, $3;
FF23 = JOIN FFW2_RankG_RankCate BY $0, FFW3_RankG_RankCate BY $0;
FF23_Filtered = FOREACH FF23 GENERATE $0, $2, $5;
STORE FF23_Filtered INTO 'FF23_Filtered.txt';

REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
B = FOREACH A GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);
STORE B INTO 'FF23_FJM.txt';


--> Can anyone please let me know the exact reason causing the above exception?
I also made sure that the file FF23_Filtered.txt is not empty.
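
An empty input file and null fields inside individual rows are different things, though; a quick check for the latter (a minimal sketch; it declares the ranks as double so that empty or unparseable fields show up as null, mirroring the casts in the script above):

A = LOAD 'FF23_Filtered.txt' AS (appID, rank2: double, rank3: double);
-- empty or unparseable fields come out as null once a type is declared
BAD = FILTER A BY appID is null OR rank2 is null OR rank3 is null;
DUMP BAD;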

---
Thanks & Regards,
Narayan.

Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Dmitriy Ryaboy <dv...@gmail.com>.
MySQL has a function called "greatest" which does max of several
values (as opposed to max, which is an aggregate function over a
column).  Here's what it returns:

select greatest(1, 2)
2

select greatest(1,null)
null

On the other hand, the max aggregate function returns 2 when a table
column has 3 rows, with values (null, 1, 2).

So much for consistency.

So what's the answer here? I have no idea. Erroring on underspecified
behaviors and letting users handle null cases however makes sense to them
at least doesn't cause bizarre, hard-to-find data bugs 12 hours into a
27-step computation.

D
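
In Pig terms, the same split shows up as two different call shapes. A minimal sketch (vals.txt, pairs.txt and the column names are made up for illustration, and it assumes the builtin aggregate MAX follows the SQL convention of skipping nulls):

-- Aggregate, like SQL max(): one column, grouped, nulls skipped SQL-style.
V = LOAD 'vals.txt' AS (x:double);      -- say the rows are 1.0, 2.0 and an empty line (null)
G = GROUP V ALL;
AGG = FOREACH G GENERATE MAX(V.x);      -- builtin MAX over the whole column

-- Row-wise, like MySQL greatest(): two values compared within each row.
-- This is the piggybank UDF from the stack trace above (piggybank.jar must
-- be REGISTERed), and it is the call that currently throws the
-- NullPointerException when either argument is null.
P = LOAD 'pairs.txt' AS (a:double, b:double);
PAIRMAX = FOREACH P GENERATE org.apache.pig.piggybank.evaluation.math.MAX(a, b);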

On Thu, Jun 16, 2011 at 12:16 PM, Jonathan Coveney <jc...@gmail.com> wrote:
> Do we want the Max function to be able to handle nulls? Seems fairly natural
> for it to be able to.
>
> 2011/6/16 Daniel Dai <ji...@yahoo-inc.com>
>
>> Jonathan is right. math.MAX does not handle null input. Checking for null
>> before feeding it into MAX is necessary.
>>
>> Daniel
>>
>>
>> On 06/16/2011 06:45 AM, Jonathan Coveney wrote:
>>
>>> Can you check if your rank2 or rank3 values are ever null? If they are,
>>> there are some ad hoc fixes which you can do until this is fixed (and it
>>> is easy to fix, just a question of deciding what the desired handling of
>>> null values should be). I would just do something like...
>>>
>>> A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
>>> B = FILTER A BY rank2 is null AND rank3 is null;
>>> C = FOREACH A GENERATE appID, ( rank2 is null ? rank3 : rank2) as rank2,
>>> ( rank3 is null ? rank2 : rank3 ) as rank3;
>>>
>>> Obviously you could tweak that for whatever you want to happen if a value
>>> is null.
>>>
>>> 2011/6/16 Jonathan Coveney<jc...@gmail.com>
>>>
>>>> Hm, just to make sure, I ran this against trunk (to see if it's just a
>>>> 0.7.0 thing or not).
>>>>
>>>> A = LOAD 'test.txt'; --this is just a blank one line file
>>>> B = FOREACH A GENERATE
>>>> org.apache.pig.piggybank.evaluation.math.MAX(1,null);
>>>>
>>>> I also tested feeding it files from test.txt etc. It fails when there
>>>> is a null value. The cast does not.
>>>>

Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
Patch submitted. Pretty trivial, hopefully it's an adequate fix...


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
I made a Jira.

https://issues.apache.org/jira/browse/PIG-2132

Should be pretty easy to fix. I'll probably do so over the weekend if nobody
else gets to it first.


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Alan Gates <ga...@yahoo-inc.com>.
MAX should definitely handle null, and it should ignore it. The goal
for our SQL-like built-in aggregate functions (MIN, MAX, COUNT, SUM,
AVG) is to be SQL-like. SQL ignores nulls in these functions. It's
inconsistent, but it's usually what people want. So, we should be
consistently inconsistent like SQL. :)

Alan.



Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Daniel Dai <ji...@yahoo-inc.com>.
I take this back after seeing Dmitriy's reply. It seems it is not that
straightforward.

Daniel



Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Daniel Dai <ji...@yahoo-inc.com>.
Yes, I think it is better if MAX can handle NULL. Can you open a Jira?

Daniel



Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
Do we want the Max function to be able to handle nulls? Seems fairly natural
for it to be able to.


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
Ignore the antlr runtime thing, I simply forgot to remove it. It's a weird
hack that was necessary on my system to get pig trunk withouthadoop to work.


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Lakshminarayana Motamarri <na...@gmail.com>.
Hi all,

Thanks once again, Jonathan, for your response.

First of all:
1) What is antlr-runtime-3.2.jar? I don't find it in my Pig installation
path: /*/*/pig/ivy/*

2) Coming back to the previous NULL problem:
You are right, it would have worked.
Later I also realized that not just my rank columns, but the initial ID
column is also null in one case, i.e. the last line of the file.

So I had to handle that case as well,
i.e. with A2 = FILTER A BY appID is not null;

Anyway, it worked out great and I got the results. Thanks for your help.

Thanks & Regards,
Narayan.


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
First, when troubleshooting (and just in general), I prefer to break steps
out into multiple lines instead of trying to be overly expressive in one
line. Pig scripts in general aren't so large that breaking them out is a
burden, and it aids a lot in debugging; but this is of course personal style.

I create a file thing.txt, whose contents are as follows:

1,1
1,2
1,3
1,4
,
,
1,
2,
,3
4,
6,6
4,1
2,3


8,
9
9


So there are some null lines, some lines with only one value or the other,
etc. Here is the script I ran. Caveat: I'm running pig trunk.

register /home/jcoveney/pig/build/ivy/lib/Pig/antlr-runtime-3.2.jar;
register /home/jcoveney/pig/contrib/piggybank/java/piggybank.jar;

A = LOAD 'thing.txt' USING PigStorage(',') AS (rank1,rank2);
B = FILTER A BY rank1 is not null OR rank2 is not null;
C = FOREACH B GENERATE ( rank1 is null ? rank2 : rank1 ) as rank1, ( rank2
is null ? rank1 : rank2 ) as rank2;
D = FOREACH C GENERATE
org.apache.pig.piggybank.evaluation.math.MAX(rank1,rank2);

This worked fine.
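
Mapped onto the aliases from the quoted script below, a sketch along the same lines (the AS clauses inside the MAX call are what appear to trip the "Invalid alias: org" parse error, since the two variants without them parse fine; AS belongs at the FOREACH ... GENERATE level, and declaring the ranks as double at load time is an assumption that just avoids the explicit casts):

A = LOAD 'FF23_Filtered1.txt' AS (appID, rankW2: double, rankW3: double);
-- drop rows where the id is missing or where both ranks are missing
B = FILTER A BY appID is not null AND (rankW2 is not null OR rankW3 is not null);
-- coalesce the nulls first; AS is legal here, at the FOREACH level
C = FOREACH B GENERATE appID,
    (rankW2 is null ? rankW3 : rankW2) AS rankW2,
    (rankW3 is null ? rankW2 : rankW3) AS rankW3;
D = FOREACH C GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX(rankW2, rankW3);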

2011/6/16 Lakshminarayana Motamarri <na...@gmail.com>

>
> Hi all,
>
> Thanks Jonathan and Daniel for prompt responses..
>
> Based on ur suggestions, I tried as following...
>
> * Code:*
>
>        REGISTER
> /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar
> *
>       // all 3 combinations of A, are followed by four combinations of B:*
> *    A = LOAD 'FF23_Filtered1.txt' AS (appID: float, rankW2: float,
> rankW3: float);
>     A = LOAD 'FF23_Filtered1.txt' AS (appID: int, rankW2: int, rankW3:
> int);
>     A = LOAD 'FF23_Filtered1.txt' AS (appID, rankW2, rankW3);
> *
>     *B = FOREACH A GENERATE appID,
> org.apache.pig.piggybank.evaluation.math.MAX((double)rankW2,
> (double)rankW3); **
>     store B into 'FF23_FJM.txt';               **//received null pointer
> exception.**
> **
>     B = FOREACH A GENERATE appID,
> org.apache.pig.piggybank.evaluation.math.MAX(((double)rankW2 is null ?
> (double)rankW3 : (double)rankW2), ((double)rankW3 is null ? (double)rankW2 :
> (double)rankW3));
>     store B into 'FF23_FJM.txt';               **//received nullpointer
> exception.*
> *
>     B = FOREACH A GENERATE appID,
> org.apache.pig.piggybank.evaluation.math.MAX(((double)rankW2 is null ?
> (double)rankW3 : (double)rankW2) AS (double)rankW2, ((double)rankW3 is null
> ? (double)rankW2 : (double)rankW3) AS (double)rankW3));       **//
> received invalid alias error**
>
>
>     B = FOREACH A GENERATE appID,
> org.apache.pig.piggybank.evaluation.math.MAX((rankW2 is null ? rankW3 :
> rankW2) AS (double)rankW2, (rankW3 is null ? rankW2 : rankW3) AS
> (double)rankW3));         **
>                                              **//invalid alias**
>
> -> As mentioned above, in all 12 combinations of the trails, I got the
> corresponding exceptions, as mentioned with B's... Please advise, if I
> missed some thing...
>
> **the details of both exceptions are:**
> 1) org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught
> error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught
> exception processing input row  [null]]
>
>     at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
>     at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
>     at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
>     at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
>     at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
>     at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
>     at
> org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>     at org.apache.hadoop.mapred.Child.main(Child.java:170)
> Caused by: java.io.IOException: Caught exception processing input row
> [null]
>     at
> org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
>     at
> org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
>     at
> org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
>     ... 10 more
> Caused by: java.lang.NullPointerException
>     ... 13 more
>
> 2)---
> ERROR 1000: Error during parsing. Invalid alias: org in {appID:
> float,rankW2: float,rankW3: float}
>
> org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error
> during parsing. Invalid alias: org in {appID: float,rankW2: float,rankW3:
> float}
>     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1037)
>     at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:981)
>     at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
>     at
> org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:717)
>     at
> org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:273)
>     at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
>     at
> org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
>     at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
>     at org.apache.pig.Main.main(Main.java:363)
> Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid
> alias: org in {appID: float,rankW2: float,rankW3: float}
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:6731)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:6575)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4682)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:4579)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:4525)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:4434)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4360)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4326)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4252)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4175)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:4119)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:3528)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2938)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1314)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
>     at
> org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
>     at
> org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
>     at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1031)
>     ... 8 more
> *
> ---
> Thanks & Regards,
> Narayan.
>
>
> On Thu, Jun 16, 2011 at 11:30 AM, Daniel Dai <ji...@yahoo-inc.com>wrote:
>
>> Jonathan is right. math.MAX does not handle null input. Check for null
>> before feeding into MAX is necessary.
>>
>> Daniel
>
>

Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Lakshminarayana Motamarri <na...@gmail.com>.
Hi all,

Thanks Jonathan and Daniel for the prompt responses.

Based on your suggestions, I tried the following...

Code:

REGISTER /home/training/Desktop/1pig/pig-0.7.0/contrib/piggybank/piggybank.jar

-- each of the 3 variants of A below was tried with each of the 4 variants of B:
A = LOAD 'FF23_Filtered1.txt' AS (appID: float, rankW2: float, rankW3: float);
A = LOAD 'FF23_Filtered1.txt' AS (appID: int, rankW2: int, rankW3: int);
A = LOAD 'FF23_Filtered1.txt' AS (appID, rankW2, rankW3);

B = FOREACH A GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX((double)rankW2, (double)rankW3);
STORE B INTO 'FF23_FJM.txt';               -- received NullPointerException

B = FOREACH A GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX(((double)rankW2 is null ? (double)rankW3 : (double)rankW2),
                                                 ((double)rankW3 is null ? (double)rankW2 : (double)rankW3));
STORE B INTO 'FF23_FJM.txt';               -- received NullPointerException

B = FOREACH A GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX(((double)rankW2 is null ? (double)rankW3 : (double)rankW2) AS (double)rankW2,
                                                 ((double)rankW3 is null ? (double)rankW2 : (double)rankW3) AS (double)rankW3));
                                           -- received "Invalid alias" error

B = FOREACH A GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX((rankW2 is null ? rankW3 : rankW2) AS (double)rankW2,
                                                 (rankW3 is null ? rankW2 : rankW3) AS (double)rankW3));
                                           -- "Invalid alias" error

-> As mentioned above, for all 12 combinations of these attempts I got the
corresponding exceptions noted next to each B... Please advise if I
missed something...

The details of both exceptions are:
1) org.apache.pig.backend.executionengine.ExecException: ERROR 2078: Caught
error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught
exception processing input row  [null]]
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:229)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:263)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:269)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNext(POForEach.java:204)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:249)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:240)
    at
org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map.map(PigMapOnly.java:65)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
    at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: java.io.IOException: Caught exception processing input row
[null]
    at
org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:70)
    at
org.apache.pig.piggybank.evaluation.math.DoubleMax.exec(DoubleMax.java:57)
    at
org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:201)
    ... 10 more
Caused by: java.lang.NullPointerException
    ... 13 more

2)
ERROR 1000: Error during parsing. Invalid alias: org in {appID:
float,rankW2: float,rankW3: float}

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1000: Error during
parsing. Invalid alias: org in {appID: float,rankW2: float,rankW3: float}
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1037)
    at org.apache.pig.PigServer$Graph.registerQuery(PigServer.java:981)
    at org.apache.pig.PigServer.registerQuery(PigServer.java:383)
    at
org.apache.pig.tools.grunt.GruntParser.processPig(GruntParser.java:717)
    at
org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:273)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:166)
    at
org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:142)
    at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
    at org.apache.pig.Main.main(Main.java:363)
Caused by: org.apache.pig.impl.logicalLayer.parser.ParseException: Invalid
alias: org in {appID: float,rankW2: float,rankW3: float}
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.AliasFieldOrSpec(QueryParser.java:6731)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.ColOrSpec(QueryParser.java:6575)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseEvalSpec(QueryParser.java:4682)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.UnaryExpr(QueryParser.java:4579)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.CastExpr(QueryParser.java:4525)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.MultiplicativeExpr(QueryParser.java:4434)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.AdditiveExpr(QueryParser.java:4360)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.InfixExpr(QueryParser.java:4326)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItem(QueryParser.java:4252)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.FlattenedGenerateItemList(QueryParser.java:4175)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.GenerateStatement(QueryParser.java:4119)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.NestedBlock(QueryParser.java:3528)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.ForEachClause(QueryParser.java:2938)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.BaseExpr(QueryParser.java:1314)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Expr(QueryParser.java:893)
    at
org.apache.pig.impl.logicalLayer.parser.QueryParser.Parse(QueryParser.java:682)
    at
org.apache.pig.impl.logicalLayer.LogicalPlanBuilder.parse(LogicalPlanBuilder.java:63)
    at org.apache.pig.PigServer$Graph.parseQuery(PigServer.java:1031)
    ... 8 more
---
Thanks & Regards,
Narayan.

On Thu, Jun 16, 2011 at 11:30 AM, Daniel Dai <ji...@yahoo-inc.com> wrote:

> Jonathan is right: math.MAX does not handle null input. Checking for null
> before feeding values into MAX is necessary.
>
> Daniel

Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Daniel Dai <ji...@yahoo-inc.com>.
Jonathan is right: math.MAX does not handle null input. Checking for null
before feeding values into MAX is necessary.

Daniel
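
One way to do that check, as a sketch (the relation names has_nulls and ok are
just illustrative): has_nulls can be inspected or stored separately, and only ok
is fed to MAX:

    A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
    SPLIT A INTO has_nulls IF (rank2 is null OR rank3 is null),
                 ok IF (rank2 is not null AND rank3 is not null);
    B = FOREACH ok GENERATE appID,
        org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);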



Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
Can you check if your rank2 or rank3 values are ever null? If they are,
there are some ad hoc fixes which you can do until this is fixed (and it is
easy to fix, just a question of deciding what the desired handling of null
values should be). I would just do something like...

A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
B = FILTER A BY rank2 is null AND rank3 is null;
C = FOREACH A GENERATE appID, ( rank2 is null ? rank3 : rank2) as rank2, (
rank3 is null ? rank2 : rank3 ) as rank3;

Obvoiusly you could tweak that for whatever you want to happen if a value is
null.
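
If any rows have both rank2 and rank3 null, the FOREACH above still hands MAX a
null, so one more guard may be needed; a sketch that drops those rows first (the
relation names D, E, F are just illustrative):

A = LOAD 'FF23_Filtered.txt' AS (appID, rank2, rank3);
D = FILTER A BY NOT (rank2 is null AND rank3 is null);  -- drop rows where both ranks are null
E = FOREACH D GENERATE appID,
    (rank2 is null ? rank3 : rank2) as rank2,
    (rank3 is null ? rank2 : rank3) as rank3;
F = FOREACH E GENERATE appID,
    org.apache.pig.piggybank.evaluation.math.MAX((double)rank2, (double)rank3);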


Re: ERROR 2078: Caught error from UDF: org.apache.pig.piggybank.evaluation.math.DoubleMax [Caught exception processing input row [null]]

Posted by Jonathan Coveney <jc...@gmail.com>.
Hm, just to make sure, I ran this against trunk (to see if it's just a 0.7.0
thing or not).

A = LOAD 'test.txt'; --this is just a blank one line file
B = FOREACH A GENERATE org.apache.pig.piggybank.evaluation.math.MAX(1,null);

I also tested feeding it input from files like test.txt, etc. It fails whenever
there is a null value; the cast itself does not.
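
A quick way to see the same behavior with a null field rather than a constant; a
sketch, where test_nulls.txt is a hypothetical input whose second field is missing
on some line (for example, a line containing just "1"):

A = LOAD 'test_nulls.txt' AS (x:int, y:int);
B = FOREACH A GENERATE (double)x, (double)y;
DUMP B;    -- the casts alone are fine; null stays null
C = FOREACH A GENERATE
    org.apache.pig.piggybank.evaluation.math.MAX((double)x, (double)y);
DUMP C;    -- fails with the NullPointerException on the row where y is null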
