You are viewing a plain text version of this content. The canonical link for it is here.

Posted to common-dev@hadoop.apache.org by "Devaraj Das (JIRA)" <ji...@apache.org> on 2007/05/08 08:18:15 UTC

[jira] Created: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
--------------------------------------------------------------------------------------------------

                 Key: HADOOP-1338
                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
             Project: Hadoop
          Issue Type: Improvement
          Components: mapred
            Reporter: Devaraj Das
             Fix For: 0.14.0


We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12496226 ] 

Devaraj Das commented on HADOOP-1338:
-------------------------------------

Interesting points here .. http://www.w3.org/Protocols/HTTP/Performance/

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494344 ] 

Doug Cutting commented on HADOOP-1338:
--------------------------------------

As a performance-enhancement, any patch for this must demonstrate a significant performance improvement before it can be committed.  I suspect that simply caching a fixed number of connections will not provide a significant performance enhancement.  Rather, we might attempt, when contacting a node, to transfer all available output that resides on that node.  So, instead of randomly shuffling the map output locations, a reduce task might first group locations by host, then randomly shuffle, fetching batches from each host.  However, if the shuffle is already keeping up with the map, even this may not improve things much, since each map node may tend to only have a single new output at a time.

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Doug Cutting (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494360 ] 

Doug Cutting commented on HADOOP-1338:
--------------------------------------

What's the motivation for more than one wave of reduces?  Typically applications are better served by fewer output files, so that can't be it.  Is it to decrease the granularity of each reduce, so that, in the case of reduce failures, job latency is reduced?

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494358 ] 

Devaraj Das commented on HADOOP-1338:
-------------------------------------

"This might make more sense for cases where we have two or more waves of reduces" - should clarify this by saying that, although the first wave of shuffles is gated by the time the Map phase takes to complete, we should check whether the subsequent waves of shuffles would gain by this optimization. I think the first phase of shuffle might also gain to some extent.

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494450 ] 

Devaraj Das commented on HADOOP-1338:
-------------------------------------

Yes reducing the job latency in the presence of reduce failures is the reason for having multiple waves.

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (HADOOP-1338) Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files

Posted by "Devaraj Das (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/HADOOP-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12494355 ] 

Devaraj Das commented on HADOOP-1338:
-------------------------------------

Agree with you on this, Doug. This issue is investigative in nature. This might make more sense for cases where we have two or more waves of reduces...

> Improve the shuffle phase by using the "connection: keep-alive" and doing batch transfers of files
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1338
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1338
>             Project: Hadoop
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Devaraj Das
>             Fix For: 0.14.0
>
>
> We should do transfers of map outputs at the granularity of  *total-bytes-transferred* rather than the current way of transferring a single file and then closing the connection to the server. A single TaskTracker might have a couple of map output files for a given reduce, and we should transfer multiple of them (upto a certain total size) in a single connection to the TaskTracker. Using HTTP-1.1's keep-alive connection would help since it would keep the connection open for more than one file transfer. We should limit the transfers to a certain size so that we don't hold up a jetty thread indefinitely (and cause timeouts for other clients).
> Overall, this should give us improved performance.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.