Posted to dev@pig.apache.org by "Rohini Palaniswamy (JIRA)" <ji...@apache.org> on 2016/01/10 02:36:39 UTC

[jira] [Created] (PIG-4774) Fix NPE in AlgebraicMathBase UDFs

Rohini Palaniswamy created PIG-4774:
---------------------------------------

             Summary: Fix NPE in AlgebraicMathBase UDFs
                 Key: PIG-4774
                 URL: https://issues.apache.org/jira/browse/PIG-4774
             Project: Pig
          Issue Type: Bug
            Reporter: Rohini Palaniswamy
            Assignee: Rohini Palaniswamy
             Fix For: 0.16.0


To address the UDF backward-compatibility issue introduced by the POStatus.STATUS_NULL refactoring, PIG-4184 fixed the UDFs to handle null by adding an input.get(0) == null check to all of them. However, the UDFs extending AlgebraicMathBase, which handle SUM, COUNT, etc., were not fixed.

The script below reproduces the NPE. It is an odd usage that one user had: the aggregation is done after the join instead of right after the group by. Rewriting the script to move the aggregation right after the group by fixed the NPE. This may be rare, but there may be other cases where users call these functions on a bag directly, without a group by, which can cause nulls to be passed to them.

A = LOAD '/tmp/data' as (f1:int, f2:int, f3:int);
B = LOAD '/tmp/data1' as (f1:int, f2:int, f3:int);
A1 = GROUP A by f1;
A2 = FOREACH A1 GENERATE group as f1, $1;
C = JOIN B by f1 LEFT, A2 by f1;
D = FOREACH C GENERATE B::f1, (double)SUM(A2::A.f3)/SUM(A2::A.f2);
STORE D into '/tmp/out';
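
For comparison, the rewrite that moves the aggregation right after the group by might look roughly like the sketch below. The user's actual rewritten script is not included in this report, so this is only an illustration; the aliases sum_f3 and sum_f2 are names made up here.

```pig
A = LOAD '/tmp/data' as (f1:int, f2:int, f3:int);
B = LOAD '/tmp/data1' as (f1:int, f2:int, f3:int);
A1 = GROUP A by f1;
-- Aggregate here, while the bag comes straight from GROUP and is never null.
-- sum_f3 and sum_f2 are illustrative alias names, not from the original script.
A2 = FOREACH A1 GENERATE group as f1, SUM(A.f3) as sum_f3, SUM(A.f2) as sum_f2;
C = JOIN B by f1 LEFT, A2 by f1;
-- Rows of B without a match in A2 carry null sums, so the division yields null
-- and no UDF is invoked on a null bag.
D = FOREACH C GENERATE B::f1, (double)A2::sum_f3/A2::sum_f2;
STORE D into '/tmp/out';
```

In this form SUM only ever sees the bag produced by GROUP. The left join then carries plain scalar values (or nulls) rather than a possibly null bag.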



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)