Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/02/19 00:33:18 UTC
[jira] [Updated] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.
[ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jonathan Eagles updated TEZ-3126:
---------------------------------
Attachment: TEZ-3126.1.patch
Agreed that the earlier behavior was intentional. My intention here is to fulfill the customer's request more closely by rounding up to the correct number of tasks.
Ideal Tasks: 20
| Initial Estimate | Tasks Num (current) | Tasks Num (patched) |
| 20 | 20 | 20 |
| 21 | 21 | 21 |
| 22 | 22 | 22 |
| 23 | 23 | 23 |
| 24 | 24 | 24 |
| 25 | 25 | 25 |
| 26 | 26 | 26 |
| 27 | 27 | 27 |
| 28 | 28 | 28 |
| 29 | 29 | 29 |
| 30 | 30 | 15 |
| 31 | 31 | 16 |
| 32 | 32 | 16 |
| 33 | 33 | 17 |
| 34 | 34 | 17 |
| 35 | 35 | 18 |
| 36 | 36 | 18 |
| 37 | 37 | 19 |
| 38 | 38 | 19 |
| 39 | 39 | 20 |
| 40 | 20 | 20 |
The goal is to avoid running nearly twice as many reducers as requested. Once parallelism is reduced by more than a factor of two, the original reduction isn't too bad. Perhaps there is another, preferred way to do this.
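For reference, the two columns of the table can be reproduced with a small sketch. The rounding rule in patchedTasks is only inferred from the numbers in the table and is an assumption, not the contents of TEZ-3126.1.patch. The class and method names are hypothetical, and the "last task takes the remainder" step is likewise an assumption about how the range is applied.

```java
// Sketch only: reproduces the table for "Ideal Tasks: 20".
// The rounding rule is inferred from the table, not copied from the patch.
final class RoundedRangeSketch {

    // Tasks left when each task takes `range` partitions and a last task
    // takes any remainder (assumed behaviour).
    static int tasksForRange(int currentParallelism, int range) {
        return (currentParallelism + range - 1) / range;
    }

    // Current behaviour: truncating division; a range of 1 leaves the vertex as it is.
    static int currentTasks(int currentParallelism, int desiredParallelism) {
        if (desiredParallelism >= currentParallelism) {
            return currentParallelism;
        }
        int range = currentParallelism / desiredParallelism;
        return range <= 1 ? currentParallelism : tasksForRange(currentParallelism, range);
    }

    // Assumed patched behaviour: round the range to the nearest integer instead.
    static int patchedTasks(int currentParallelism, int desiredParallelism) {
        if (desiredParallelism >= currentParallelism) {
            return currentParallelism;
        }
        int range = (int) Math.round((double) currentParallelism / desiredParallelism);
        return range <= 1 ? currentParallelism : tasksForRange(currentParallelism, range);
    }

    public static void main(String[] args) {
        int ideal = 20;
        for (int estimate = 20; estimate <= 40; estimate++) {
            System.out.printf("| %d | %d | %d |%n",
                    estimate, currentTasks(estimate, ideal), patchedTasks(estimate, ideal));
        }
    }
}
```

Under these assumptions, an initial estimate of 30 stays at 30 tasks with truncation but drops to 15 with rounding, and both variants agree again at 40.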
> Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.
> ---------------------------------------------------------------------------------
>
> Key: TEZ-3126
> URL: https://issues.apache.org/jira/browse/TEZ-3126
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jonathan Eagles
> Assignee: Jonathan Eagles
> Priority: Critical
> Attachments: TEZ-3126.1.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange will be 1 (integer division of 36 by 22) and the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
>     int desiredTaskParallelism =
>         (int)(
>             (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>             desiredTaskInputDataSize);
>     if(desiredTaskParallelism < minTaskParallelism) {
>       desiredTaskParallelism = minTaskParallelism;
>     }
>
>     if(desiredTaskParallelism >= currentParallelism) {
>       return true;
>     }
>
>     // most shufflers will be assigned this range
>     basePartitionRange = currentParallelism/desiredTaskParallelism;
>
>     if (basePartitionRange <= 1) {
>       // nothing to do if range is equal 1 partition. shuffler does it by default
>       return true;
>     }
> {code}
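The 36-to-22 case from the description can be traced with a standalone sketch of the same arithmetic. The byte sizes below are hypothetical values chosen so that the desired parallelism comes out at 22, and the class and method names are illustrative rather than part of ShuffleVertexManager.

```java
// Sketch only: mirrors the arithmetic quoted above outside of Tez.
final class BasePartitionRangeExample {

    // Ceiling division with a lower bound, as in the quoted desiredTaskParallelism computation.
    static int desiredParallelism(long expectedTotalOutputSize,
                                  long desiredTaskInputSize,
                                  int minTaskParallelism) {
        int desired = (int) ((expectedTotalOutputSize + desiredTaskInputSize - 1)
                / desiredTaskInputSize);
        return Math.max(desired, minTaskParallelism);
    }

    // Truncating integer division, as in the quoted basePartitionRange assignment.
    static int basePartitionRange(int currentParallelism, int desiredParallelism) {
        return currentParallelism / desiredParallelism;
    }

    public static void main(String[] args) {
        long totalOutputSize = 2_200L;   // hypothetical
        long perTaskInputSize = 100L;    // hypothetical
        int desired = desiredParallelism(totalOutputSize, perTaskInputSize, 1);
        int range = basePartitionRange(36, desired);
        // A range of 1 hits the early return, so the vertex keeps all 36 tasks.
        System.out.println("desired=" + desired + " range=" + range
                + " reconfigure=" + (range > 1));
    }
}
```

With these inputs the sketch prints `desired=22 range=1 reconfigure=false`, which is the situation the issue describes.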