Posted to common-dev@hadoop.apache.org by "Jothi Padmanabhan (JIRA)" <ji...@apache.org> on 2009/01/07 13:42:44 UTC

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12661538#action_12661538 ] 

Jothi Padmanabhan commented on HADOOP-1338:
-------------------------------------------

I tested a patch where:
* Each copier thread is assigned one host
* Each copier thread pulls 'n' map outputs from that host (until a specific size threshold has been pulled) before moving on to the next host
* Each fetch is still a single map output per request/response, as it exists in trunk (a rough sketch of the scheme follows this list)
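A minimal sketch of that per-host scheduling loop is below. The class and method names (HostFetchPlan, fetchOneMapOutput, maxBytesPerHost) and the servlet URL in the comment are illustrative only, not the actual copier code in trunk:

{code:java}
// Hypothetical sketch of the per-host copier scheduling described above.
// Names and the URL format are illustrative, not the trunk MapOutputCopier.
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class PerHostCopierSketch implements Runnable {

  /** All map outputs known to live on a single TaskTracker host. */
  static class HostFetchPlan {
    final String host;
    final Queue<String> mapIds = new ConcurrentLinkedQueue<String>();
    HostFetchPlan(String host, List<String> mapIds) {
      this.host = host;
      this.mapIds.addAll(mapIds);
    }
  }

  private final Queue<HostFetchPlan> pendingHosts;
  private final long maxBytesPerHost;           // per-host size threshold

  PerHostCopierSketch(Queue<HostFetchPlan> pendingHosts, long maxBytesPerHost) {
    this.pendingHosts = pendingHosts;
    this.maxBytesPerHost = maxBytesPerHost;
  }

  public void run() {
    HostFetchPlan plan;
    while ((plan = pendingHosts.poll()) != null) {
      long bytesPulled = 0;
      String mapId;
      // Keep pulling single map outputs from the same host until the
      // size threshold is reached or the host is drained.
      while (bytesPulled < maxBytesPerHost
          && (mapId = plan.mapIds.poll()) != null) {
        bytesPulled += fetchOneMapOutput(plan.host, mapId);
      }
      // If the host still has outputs left, put it back for another round.
      if (!plan.mapIds.isEmpty()) {
        pendingHosts.offer(plan);
      }
    }
  }

  /** One HTTP request/response per map output, exactly as in trunk. */
  private long fetchOneMapOutput(String host, String mapId) {
    // ... open something like http://<host>:<port>/mapOutput?map=<mapId>
    // and read the bytes ...
    return 0L; // number of bytes copied
  }
}
{code}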

With the above patch, I did not observe any improvement at all (for a variety of map sizes with the loadgen program). The underlying presumption was that, since each thread holds on to a single host, keep-alive would kick in (at the JVM level?) and turn several of the connection setups into no-ops, as those connections go to the same host/port. However, keep-alive does not appear to be kicking in, and we see similar shuffle times with and without this patch.
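For reference, the JDK's HttpURLConnection only returns a socket to its keep-alive cache when the previous response body has been read to EOF and the stream closed (calling disconnect() tears the socket down); the http.keepAlive and http.maxConnections system properties control the behaviour. A minimal check along those lines (the host name and map id in the URL are made up):

{code:java}
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

public class KeepAliveCheck {
  public static void main(String[] args) throws Exception {
    // JDK-level settings: keep-alive is on by default, and at most
    // http.maxConnections idle connections per host are cached.
    System.setProperty("http.keepAlive", "true");
    System.setProperty("http.maxConnections", "5");

    URL url = new URL("http://tasktracker:50060/mapOutput?map=attempt_0001");
    byte[] buf = new byte[64 * 1024];

    for (int i = 0; i < 4; i++) {
      HttpURLConnection conn = (HttpURLConnection) url.openConnection();
      InputStream in = conn.getInputStream();
      // The connection is only reused if the body is read to EOF and the
      // stream is closed -- calling conn.disconnect() instead would close
      // the underlying socket.
      while (in.read(buf) != -1) {
        // drain
      }
      in.close();
    }
  }
}
{code}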

We did another test where the code was hacked so that the copier fetches a configurable number of maps at a time and the TT replies to this request by clubbing the map outputs together. The received map outputs were simply discarded at the reducer (neither written to disk nor to memory) so that we measured only the network performance. The results were as follows:

||Number of Maps Per Fetch||Average Shuffle Time (mm:ss)||Worst Case Shuffle Time (mm:ss)||
|1|1:27|4:20|
|2|1:11|2:11|
|4|0:47|1:11|
|8|0:29|0:41|

From this it does appear that we would benefit from modifying the fetch protocol to fetch several map outputs in one shot, over the same connection. Thoughts?
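For concreteness, a batched fetch could look something like the sketch below: the reducer puts several map ids into a single request and the TT frames each output with its id and length in one response. The servlet parameters and the framing are made up for illustration, not a proposed wire format:

{code:java}
import java.io.DataInputStream;
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URL;

public class BatchedFetchSketch {

  /** Fetch several map outputs for one reduce in a single request/response. */
  static void fetchBatch(String host, int port, int reduce, String[] mapIds)
      throws IOException {
    // Hypothetical request format: comma-separated map ids in one query string.
    StringBuilder maps = new StringBuilder();
    for (int i = 0; i < mapIds.length; i++) {
      if (i > 0) maps.append(',');
      maps.append(mapIds[i]);
    }
    URL url = new URL("http://" + host + ":" + port
        + "/mapOutput?reduce=" + reduce + "&maps=" + maps);

    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    DataInputStream in = new DataInputStream(conn.getInputStream());
    try {
      // Hypothetical response framing: for each requested map output the TT
      // writes <mapId (UTF), length (long), raw bytes>, back to back.
      for (int i = 0; i < mapIds.length; i++) {
        String mapId = in.readUTF();
        long len = in.readLong();
        byte[] buf = new byte[64 * 1024];
        long remaining = len;
        while (remaining > 0) {
          int n = in.read(buf, 0, (int) Math.min(buf.length, remaining));
          if (n < 0) {
            throw new IOException("premature EOF while reading " + mapId);
          }
          remaining -= n;   // in the test above the bytes were just discarded
        }
      }
    } finally {
      in.close();   // fully read + close, so keep-alive can reuse the socket
    }
  }
}
{code}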



> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>            Assignee: Jothi Padmanabhan
>
> We should do transfers of map outputs at the granularity of *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (up to a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connections would help, since that keeps the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.