Posted to dev@pig.apache.org by "Daniel Dai (JIRA)" <ji...@apache.org> on 2015/02/04 22:41:34 UTC

[jira] [Created] (PIG-4410) Fix testRankWithEmptyReduce in tez mode

Daniel Dai created PIG-4410:
-------------------------------

             Summary: Fix testRankWithEmptyReduce in tez mode
                 Key: PIG-4410
                 URL: https://issues.apache.org/jira/browse/PIG-4410
             Project: Pig
          Issue Type: Bug
          Components: tez
            Reporter: Daniel Dai
            Assignee: Daniel Dai
             Fix For: 0.15.0


testRankWithEmptyReduce, added in PIG-4392, fails in tez mode. The reason is that POReservoirSample produces more samples than necessary. In particular, if the input of the vertex is empty, it produces a fake tuple that does not contain the original data, only a marker field plus a 0 rowNum. That causes WeightedRangePartitioner to fail:
{code}
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to java.lang.Integer
	at org.apache.pig.backend.hadoop.HDataType.getWritableComparableTypes(HDataType.java:115)
	at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.partitioners.WeightedRangePartitioner.getPigNullableWritable(WeightedRangePartitioner.java:192)
{code}
Another issue I found is in GetMemNumRows: I erroneously added the size of the marker tuple, which makes the size estimation inaccurate. I put the fix in the same patch.
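
The cast failure in the stack trace above can be illustrated with a minimal, self-contained sketch. It is not the Pig code: the class name, the MARKER value, and the sortKey and isMarker helpers are all hypothetical, and the sketch assumes the partitioner expects an Integer sort key in the first field while the fake tuple carries a String marker there.

```java
import java.util.Arrays;
import java.util.List;

public class MarkerTupleSketch {
    // Hypothetical marker value; the real marker used by Pig is not shown in this report.
    static final String MARKER = "__marker__";

    // Stand-in for a partitioner that assumes the first field is an Integer sort key.
    static int sortKey(List<Object> tuple) {
        return (Integer) tuple.get(0); // throws ClassCastException for a marker tuple
    }

    // Guard that recognises the marker tuple so a caller can skip it.
    static boolean isMarker(List<Object> tuple) {
        return !tuple.isEmpty() && MARKER.equals(tuple.get(0));
    }

    public static void main(String[] args) {
        List<Object> real = Arrays.<Object>asList(42, "row data");
        // Fake tuple for an empty input: marker field plus a 0 rowNum.
        List<Object> fake = Arrays.<Object>asList(MARKER, 0L);

        System.out.println("real key: " + sortKey(real));
        try {
            sortKey(fake);
        } catch (ClassCastException e) {
            System.out.println("marker tuple reached the partitioner: " + e.getMessage());
        }
        System.out.println("fake is marker: " + isMarker(fake));
    }
}
```

In this sketch the failure disappears if the marker tuple is either never emitted for an empty input or filtered out with a check like isMarker before the key is cast.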



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)