Posted to dev@crunch.apache.org by "Gabriel Reid (JIRA)" <ji...@apache.org> on 2018/09/14 08:36:00 UTC

[jira] [Commented] (CRUNCH-673) Sort fails when using more reducers than records

    [ https://issues.apache.org/jira/browse/CRUNCH-673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16614516#comment-16614516 ] 

Gabriel Reid commented on CRUNCH-673:
-------------------------------------

Attached [^CRUNCH-673.patch], which resolves this issue and adds a test for this situation.

The fix definitely works, but I'm a bit unsure about the addition of the test in SortIT. The added test uses MiniMRYarnCluster, which dramatically slows down the running of SortIT (although the MiniMRYarnCluster is only used for one test case in that class). This is such an edge case that I'm wondering if it's worth slowing the build down just for that.

On the other hand, the use of TotalOrderPartitioner is pretty important in the Sort class, and currently there's no other built-in testing that makes use of it, so this might also be a good first step towards introducing such testing.

Anyone have any thoughts either way on that?

> Sort fails when using more reducers than records
> ------------------------------------------------
>
>                 Key: CRUNCH-673
>                 URL: https://issues.apache.org/jira/browse/CRUNCH-673
>             Project: Crunch
>          Issue Type: Bug
>            Reporter: Gabriel Reid
>            Priority: Minor
>         Attachments: CRUNCH-673.patch
>
>
> We've run into an issue where running Sort with a number of reducers that is higher than the number of records to be sorted fails.
> This occurs when a large PCollection is filtered down to almost nothing (say 10 records) and that filtered PCollection is passed in to Sort. Sort configures n reducers for the small PCollection (because it doesn't realize that it has been filtered so aggressively), so there are, for example, 20 reducers configured. Reservoir sampling is used to build up the partition definitions for the TotalOrderPartitioner, but because there are only 10 records in the filtered PCollection, only 10 partitions are defined for the TotalOrderPartitioner. This then causes a precondition in TotalOrderPartitioner to fail, because the number of partitions in the partitions file doesn't match up with the number of configured reducers.
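
The mismatch described above can be reproduced in isolation with a small sketch. This is an illustration only, not code from Crunch, Hadoop, or the attached patch: the class PartitionSketch and its methods are hypothetical names, the precondition is assumed to require one fewer split point than there are reducers, and capping the reducer count is just one conceivable workaround, not necessarily what the patch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration; none of these names come from Crunch or Hadoop.
class PartitionSketch {

    // Picks evenly spaced split points from the distinct sampled keys.
    // Up to (numReducers - 1) are wanted, but a sample can never yield
    // more split points than it has distinct keys.
    static List<String> splitPoints(List<String> sampledKeys, int numReducers) {
        List<String> sorted = new ArrayList<>(new TreeSet<>(sampledKeys));
        int wanted = Math.min(numReducers - 1, sorted.size());
        List<String> splits = new ArrayList<>();
        for (int i = 1; i <= wanted; i++) {
            int index = (int) ((long) i * sorted.size() / (wanted + 1));
            splits.add(sorted.get(index));
        }
        return splits;
    }

    // Assumed shape of the precondition: r reducers need exactly r - 1 split points.
    static boolean matchesReducerCount(List<String> splits, int numReducers) {
        return splits.size() == numReducers - 1;
    }

    // One conceivable workaround (an assumption, not the patch itself):
    // never configure more reducers than the split points can support.
    static int cappedReducers(List<String> splits, int numReducers) {
        return Math.min(numReducers, splits.size() + 1);
    }

    public static void main(String[] args) {
        // 10 records survive the filter, but 20 reducers were configured.
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            keys.add("key-" + i);
        }
        int numReducers = 20;

        List<String> splits = splitPoints(keys, numReducers);
        System.out.println("split points: " + splits.size());
        System.out.println("matches reducer count: " + matchesReducerCount(splits, numReducers));
        System.out.println("capped reducers: " + cappedReducers(splits, numReducers));
    }
}
```

With 10 distinct keys and 20 reducers, the sketch can produce at most 10 split points where 19 would be needed, so the assumed check fails. Capping the reducer count at 11 would make the two numbers consistent again.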



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)