Posted to dev@pig.apache.org by "Rajesh Balamohan (JIRA)" <ji...@apache.org> on 2014/09/22 03:27:33 UTC
[jira] [Created] (PIG-4188) FindQuantilesTez throwing IndexOutOfBoundsException with small dataset
Rajesh Balamohan created PIG-4188:
-------------------------------------
Summary: FindQuantilesTez throwing IndexOutOfBoundsException with small dataset
Key: PIG-4188
URL: https://issues.apache.org/jira/browse/PIG-4188
Project: Pig
Issue Type: Bug
Components: tez
Reporter: Rajesh Balamohan
java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
at java.util.ArrayList.rangeCheck(ArrayList.java:635)
at java.util.ArrayList.get(ArrayList.java:411)
at org.apache.pig.impl.builtin.FindQuantiles.exec(FindQuantiles.java:217)
at org.apache.pig.backend.hadoop.executionengine.tez.FindQuantilesTez.exec(FindQuantilesTez.java:96)
at org.apache.pig.backend.hadoop.executionengine.tez.FindQuantilesTez.exec(FindQuantilesTez.java:35)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNext(POUserFunc.java:344)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.expressionOperators.POUserFunc.getNextTuple(POUserFunc.java:383)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.getNext(PhysicalOperator.java:355)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.processPlan(POForEach.java:379)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.relationalOperators.POForEach.getNextTuple(POForEach.java:299)
at org.apache.pig.backend.hadoop.executionengine.physicalLayer.PhysicalOperator.processInput(PhysicalOperator.java:301)
at org.apache.pig.backend.hadoop.executionengine.tez.POValueOutputTez.getNextTuple(POValueOutputTez.java:141)
at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.runPipeline(PigProcessor.java:319)
at org.apache.pig.backend.hadoop.executionengine.tez.PigProcessor.run(PigProcessor.java:198)
at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:324)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:180)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:172)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:172)
at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.call(TezTaskRunner.java:167)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Pig script:
========
set tez.lib.uris '<appropriate_location>';
set tez.runtime.shuffle.fetch.max.task.output.at.once 2;
set mapreduce.map.output.compress true;
set mapreduce.map.output.compress.codec 'org.apache.hadoop.io.compress.SnappyCodec';
set mapred.reduce.child.java.opts '-Xmx1024m';
A = load '/user/data/studenttab10' as (name, age, gpa);
B = filter A by age > 20;
C = group B by name;
D = foreach C generate group, COUNT(B) PARALLEL 16;
E = order D by $0 PARALLEL 16;
F = limit E 10;
store F into '/user/output/';
Dataset:
=======
katie underhill 44 3.49
irene thompson 72 3.42
quinn robinson 50 3.26
david quirinius 76 0.86
nick ichabod 32 2.87
fred ichabod 57 3.95
fred hernandez 18 2.17
sarah nixon 21 3.70
holly ichabod 35 2.91
fred hernandez 42 2.68
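
Illustrative sketch (not the Pig source): the trace reports "Index: 1, Size: 1" from an ArrayList.get call reached via FindQuantiles.exec, which suggests that a list holding a single element was indexed as if it held more. One way such an error can arise is when range-partition boundaries are picked from a sample list that is shorter than the requested parallelism (PARALLEL 16 here, against a ten-row input). The Java below is a hypothetical reconstruction of that pattern under this assumption. The class QuantileSketch and the methods pickUnguarded and pickGuarded are invented for illustration and do not reflect the actual code at FindQuantiles.java:217.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Pig.
class QuantileSketch {

    // Picks numQuantiles - 1 boundary keys from sorted samples using a fixed
    // stride. Implicitly assumes there is at least one sample per quantile.
    static List<String> pickUnguarded(List<String> sortedSamples, int numQuantiles) {
        int stride = Math.max(1, sortedSamples.size() / numQuantiles);
        List<String> boundaries = new ArrayList<>();
        for (int i = 1; i < numQuantiles; i++) {
            // Throws IndexOutOfBoundsException once i * stride >= size.
            boundaries.add(sortedSamples.get(i * stride));
        }
        return boundaries;
    }

    // Same idea, but caps the number of quantiles at the number of samples,
    // so every computed index stays inside the list.
    static List<String> pickGuarded(List<String> sortedSamples, int numQuantiles) {
        int n = sortedSamples.size();
        int effective = Math.min(numQuantiles, n);
        List<String> boundaries = new ArrayList<>();
        for (int i = 1; i < effective; i++) {
            int idx = (int) ((long) i * n / effective); // always < n
            boundaries.add(sortedSamples.get(idx));
        }
        return boundaries;
    }

    public static void main(String[] args) {
        List<String> samples = Arrays.asList("fred ichabod"); // one sampled key
        try {
            pickUnguarded(samples, 16);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("unguarded failed: " + e.getMessage());
        }
        System.out.println("guarded boundaries: " + pickGuarded(samples, 16));
    }
}
```

With a single sample and 16 requested quantiles, the unguarded version fails on its first lookup, while the guarded version returns an empty boundary list, which amounts to using a single partition. Whether capping the parallelism is the right fix for FindQuantilesTez is a separate question that this sketch does not answer.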
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)