Posted to dev@datafu.apache.org by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org> on 2014/09/08 10:50:28 UTC
[jira] [Created] (DATAFU-68) SampleByKey can throw NullPointerException
Jarek Jarcec Cecho created DATAFU-68:
----------------------------------------
Summary: SampleByKey can throw NullPointerException
Key: DATAFU-68
URL: https://issues.apache.org/jira/browse/DATAFU-68
Project: DataFu
Issue Type: Bug
Reporter: Jarek Jarcec Cecho
Assignee: Jarek Jarcec Cecho
Fix For: 1.3.0
I've noticed that {{SampleByKey}} can throw {{NullPointerException}}:
{code}
Caused by: java.lang.NullPointerException
at datafu.pig.sampling.SampleByKey.setUDFContextSignature(SampleByKey.java:86)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.setSignature(POUserFunc.java:604)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.instantiateFunc(POUserFunc.java:127)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.<init>(POUserFunc.java:122)
at org.apache.pig.newplan.logical.expression.ExpToPhyTranslationVisitor.visit(ExpToPhyTranslationVisitor.java:505)
at org.apache.pig.newplan.logical.expression.UserFuncExpression.accept(UserFuncExpression.java:112)
at org.apache.pig.newplan.ReverseDependencyOrderWalkerWOSeenChk.walk(ReverseDependencyOrderWalkerWOSeenChk.java:69)
at org.apache.pig.newplan.logical.relational.LogToPhyTranslationVisitor.visit(LogToPhyTranslationVisitor.java:220)
at org.apache.pig.newplan.logical.relational.LOFilter.accept(LOFilter.java:79)
at org.apache.pig.newplan.DependencyOrderWalker.walk(DependencyOrderWalker.java:75)
at org.apache.pig.newplan.PlanVisitor.visit(PlanVisitor.java:52)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.compile(HExecutionEngine.java:310)
at org.apache.pig.PigServer.compilePp(PigServer.java:1380)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1305)
at org.apache.pig.PigServer.storeEx(PigServer.java:978)
at org.apache.pig.PigServer.store(PigServer.java:942)
at org.apache.pig.Pig
{code}
I've reproduced the behaviour on the old 1.1.0 release, but the UDF in question has not changed much since then, so I assume trunk is affected in the same way. The script that reproduces the issue is simple:
{code}
grunt> DEFINE SampleByKey datafu.pig.sampling.SampleByKey('0.5');
grunt> data = LOAD 'datafu/input_datafu' AS (A_id:chararray, B_id:chararray, C:int);
grunt> out = FILTER data BY SampleByKey(A_id);
grunt> DUMP out;
{code}
The problem seems to be that the method {{setUDFContextSignature}} can be called with a {{null}} argument, which breaks our code. The documentation for this method does not specify whether {{null}} is allowed. I've looked at other UDFs in Pig and they appear to handle the case where the signature is {{null}}, so I've decided to fix {{SampleByKey}} as well.
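For illustration only, the kind of null guard described above might look like the following sketch. It is not the actual {{SampleByKey}} code or the committed patch: the class {{NullSafeSignatureSketch}}, its {{salt}} field, the empty-string default and the {{seed()}} helper are all hypothetical, and the class deliberately does not extend any Pig type so that it compiles on its own. Only the method name {{setUDFContextSignature}} comes from the report.

```java
/**
 * Hypothetical stand-in for a Pig UDF that derives a salt from the
 * UDF context signature. It does not extend any Pig class.
 */
class NullSafeSignatureSketch {
    // Assumed fallback salt, used until a non-null signature arrives.
    private static final String DEFAULT_SALT = "";

    private String salt = DEFAULT_SALT;

    /**
     * Mirrors the shape of the callback named in the report.
     * A null signature is ignored so that later code never dereferences null.
     */
    public void setUDFContextSignature(String signature) {
        if (signature == null) {
            // Keep whatever salt is already set instead of failing.
            return;
        }
        this.salt = signature;
    }

    /** Illustrative seed derived from the salt; safe because salt is never null. */
    public int seed() {
        return salt.hashCode();
    }

    public String getSalt() {
        return salt;
    }

    public static void main(String[] args) {
        NullSafeSignatureSketch udf = new NullSafeSignatureSketch();
        udf.setUDFContextSignature(null);      // would be the failing call without the guard
        System.out.println("seed after null signature: " + udf.seed());
        udf.setUDFContextSignature("example-signature");
        System.out.println("salt after real signature: " + udf.getSalt());
    }
}
```

The design choice here is to treat a {{null}} signature as "no information yet" and keep the previous value, rather than throwing; whether the real fix does the same or substitutes some other default is not stated in this report.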
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)