Posted to issues@tez.apache.org by "Jonathan Eagles (JIRA)" <ji...@apache.org> on 2016/02/19 00:33:18 UTC

[jira] [Updated] (TEZ-3126) Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.

     [ https://issues.apache.org/jira/browse/TEZ-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Eagles updated TEZ-3126:
---------------------------------
    Attachment: TEZ-3126.1.patch

Agreed that the existing behavior was intentional. My intent here is that we can better fulfill the customer's request if we round up to the correct number of tasks.

| Ideal Tasks | 20 |
| Initial Estimate | Tasks Num | Patched Tasks Num |
| 20 | 20 | 20 |
| 21 | 21 | 21 |
| 22 | 22 | 22 |
| 23 | 23 | 23 | 
| 24 | 24 | 24 | 
| 25 | 25 | 25 | 
| 26 | 26 | 26 | 
| 27 | 27 | 27 | 
| 28 | 28 | 28 | 
| 29 | 29 | 29 | 
| 30 | 30 | 15 | 
| 31 | 31 | 16 | 
| 32 | 32 | 16 | 
| 33 | 33 | 17 | 
| 34 | 34 | 17 | 
| 35 | 35 | 18 | 
| 36 | 36 | 18 | 
| 37 | 37 | 19 | 
| 38 | 38 | 19 | 
| 39 | 39 | 20 | 
| 40 | 20 | 20 |

The goal is to avoid running nearly twice as many reducers as the ideal count. Once the reduction is by a factor of two or more, the original behavior isn't too bad. Perhaps there is another approach that is preferred.
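
For reference, the numbers in the table can be reproduced with the rough sketch below. It is not the Tez source and not the contents of TEZ-3126.1.patch. It assumes the patched logic rounds currentParallelism/desiredTaskParallelism to the nearest integer instead of truncating, and that the final task count is the ceiling of currentParallelism/basePartitionRange. The class and method names (ParallelismSketch, currentTasks, patchedTasks, ceilDiv) are made up for illustration.

```java
// Illustrative sketch only: hypothetical names, not the Tez implementation.
class ParallelismSketch {

  // Behavior described by the snippet in the issue: integer division truncates,
  // so any ratio below 2 yields a range of 1 and the vertex is left alone.
  static int currentTasks(int currentParallelism, int desiredTaskParallelism) {
    if (desiredTaskParallelism >= currentParallelism) {
      return currentParallelism;
    }
    int basePartitionRange = currentParallelism / desiredTaskParallelism;
    if (basePartitionRange <= 1) {
      return currentParallelism; // vertex is not re-configured
    }
    return ceilDiv(currentParallelism, basePartitionRange);
  }

  // ASSUMPTION: the patched behavior rounds the ratio to the nearest integer
  // (halves round up), computed here with integer arithmetic only.
  static int patchedTasks(int currentParallelism, int desiredTaskParallelism) {
    if (desiredTaskParallelism >= currentParallelism) {
      return currentParallelism;
    }
    int basePartitionRange =
        (2 * currentParallelism + desiredTaskParallelism) / (2 * desiredTaskParallelism);
    if (basePartitionRange <= 1) {
      return currentParallelism; // still nothing to do below a ratio of 1.5
    }
    return ceilDiv(currentParallelism, basePartitionRange);
  }

  // Number of tasks needed when each task takes up to b partitions.
  static int ceilDiv(int a, int b) {
    return (a + b - 1) / b;
  }

  public static void main(String[] args) {
    int idealTasks = 20;
    for (int initial = 20; initial <= 40; initial++) {
      System.out.println("| " + initial
          + " | " + currentTasks(initial, idealTasks)
          + " | " + patchedTasks(initial, idealTasks) + " |");
    }
  }
}
```

Running main prints one row per initial estimate from 20 to 40 in the same column order as the table above.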


> Auto-Reduce Parallelism: Vertex not re-configured when reduced by less than half.
> ---------------------------------------------------------------------------------
>
>                 Key: TEZ-3126
>                 URL: https://issues.apache.org/jira/browse/TEZ-3126
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jonathan Eagles
>            Assignee: Jonathan Eagles
>            Priority: Critical
>         Attachments: TEZ-3126.1.patch
>
>
> For example, when reducing parallelism from 36 to 22, the basePartitionRange will be 1 (integer division of 36 by 22), so the vertex will not be re-configured.
> {code:java|title=ShuffleVertexManager#determineParallelismAndApply|borderStyle=dashed|bgColor=lightgrey}
>     int desiredTaskParallelism = 
>         (int)(
>             (expectedTotalSourceTasksOutputSize+desiredTaskInputDataSize-1)/
>             desiredTaskInputDataSize);
>     if(desiredTaskParallelism < minTaskParallelism) {
>       desiredTaskParallelism = minTaskParallelism;
>     }
>     
>     if(desiredTaskParallelism >= currentParallelism) {
>       return true;
>     }
>     
>     // most shufflers will be assigned this range
>     basePartitionRange = currentParallelism/desiredTaskParallelism;
>     
>     if (basePartitionRange <= 1) {
>       // nothing to do if range is equal 1 partition. shuffler does it by default
>       return true;
>     }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)