Posted to mapreduce-issues@hadoop.apache.org by "sidharta seethana (JIRA)" <ji...@apache.org> on 2014/11/12 02:42:34 UTC

[jira] [Updated] (MAPREDUCE-6156) Fetcher - connect() doesn't handle connection refused correctly

     [ https://issues.apache.org/jira/browse/MAPREDUCE-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

sidharta seethana updated MAPREDUCE-6156:
-----------------------------------------
    Description: 
The connect() function in the Fetcher assumes that whenever an IOException is thrown, a full connect-timeout "unit" has elapsed, and it subtracts that amount from "connectionTimeout" (see the code snippet below). This is incorrect. For example, if the NM is down, a ConnectException is thrown immediately, yet the catch block assumes a minute has passed when it has not.

{code}
    if (connectionTimeout < 0) {
      throw new IOException("Invalid timeout "
                            + "[timeout = " + connectionTimeout + " ms]");
    } else if (connectionTimeout > 0) {
      unit = Math.min(UNIT_CONNECT_TIMEOUT, connectionTimeout);
    }
    // set the connect timeout to the unit-connect-timeout
    connection.setConnectTimeout(unit);
    while (true) {
      try {
        connection.connect();
        break;
      } catch (IOException ioe) {
        // update the total remaining connect-timeout
        connectionTimeout -= unit;

        // throw an exception if we have waited for timeout amount of time
        // note that the updated value of timeout is used here
        if (connectionTimeout == 0) {
          throw ioe;
        }

        // reset the connect timeout for the last try
        if (connectionTimeout < unit) {
          unit = connectionTimeout;
          // reset the connect time out for the final connect
          connection.setConnectTimeout(unit);
        }
      }
    }
{code}

  was:
The connect() function in the fetcher assumes that whenever an IOException is thrown, the amount of time passed equals "connectionTimeout" ( see code snippet below ). This is incorrect. For example, in case the NM is down, an ConnectException is thrown immediately - and the catch block assumes a minute has passed when it is not the case.

{code}
  if (connectionTimeout < 0) {
      throw new IOException("Invalid timeout "
                            + "[timeout = " + connectionTimeout + " ms]");
    } else if (connectionTimeout > 0) {
      unit = Math.min(UNIT_CONNECT_TIMEOUT, connectionTimeout);
    }
    // set the connect timeout to the unit-connect-timeout
    connection.setConnectTimeout(unit);
    while (true) {
      try {
        connection.connect();
        break;
      } catch (IOException ioe) {
        // update the total remaining connect-timeout
        connectionTimeout -= unit;

        // throw an exception if we have waited for timeout amount of time
        // note that the updated value if timeout is used here
        if (connectionTimeout == 0) {
          throw ioe;
        }

        // reset the connect timeout for the last try
        if (connectionTimeout < unit) {
          unit = connectionTimeout;
          // reset the connect time out for the final connect
          connection.setConnectTimeout(unit);
        }
      }
    }
{code]


> Fetcher - connect() doesn't handle connection refused correctly 
> ----------------------------------------------------------------
>
>                 Key: MAPREDUCE-6156
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-6156
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: sidharta seethana
>            Assignee: Junping Du
>            Priority: Critical
>
> The connect() function in the Fetcher assumes that whenever an IOException is thrown, a full connect-timeout "unit" has elapsed, and it subtracts that amount from "connectionTimeout" (see the code snippet below). This is incorrect. For example, if the NM is down, a ConnectException is thrown immediately, yet the catch block assumes a minute has passed when it has not.
> {code}
>     if (connectionTimeout < 0) {
>       throw new IOException("Invalid timeout "
>                             + "[timeout = " + connectionTimeout + " ms]");
>     } else if (connectionTimeout > 0) {
>       unit = Math.min(UNIT_CONNECT_TIMEOUT, connectionTimeout);
>     }
>     // set the connect timeout to the unit-connect-timeout
>     connection.setConnectTimeout(unit);
>     while (true) {
>       try {
>         connection.connect();
>         break;
>       } catch (IOException ioe) {
>         // update the total remaining connect-timeout
>         connectionTimeout -= unit;
>         // throw an exception if we have waited for timeout amount of time
>         // note that the updated value of timeout is used here
>         if (connectionTimeout == 0) {
>           throw ioe;
>         }
>         // reset the connect timeout for the last try
>         if (connectionTimeout < unit) {
>           unit = connectionTimeout;
>           // reset the connect time out for the final connect
>           connection.setConnectTimeout(unit);
>         }
>       }
>     }
> {code}
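
One way to avoid the assumption described in the issue is to measure the elapsed time directly rather than subtracting a fixed unit after each failure. The following is a minimal Java sketch of that idea, not the actual Fetcher code and not the committed fix for this issue. The class MeasuredConnect, the Connector and Sleeper interfaces, and the injected clock are hypothetical names introduced for illustration; in real code the connector would wrap connection.setConnectTimeout() and connection.connect(), and the clock would be a monotonic time source. The sketch also assumes positive timeouts, and it pauses for the rest of the unit after a fast failure so that a refused connection is not retried in a tight loop.

```java
import java.io.IOException;
import java.util.function.LongSupplier;

// Illustrative sketch only: not the Hadoop Fetcher code and not the project's patch.
final class MeasuredConnect {

  /** Hypothetical stand-in for setConnectTimeout(timeoutMs) followed by connect(). */
  interface Connector {
    void connect(int timeoutMs) throws IOException;
  }

  /** Hypothetical hook so that tests can replace Thread.sleep. */
  interface Sleeper {
    void sleep(long millis) throws InterruptedException;
  }

  /**
   * Retries until the connector succeeds or totalMs of measured time has
   * elapsed. Returns the number of attempts made on success and rethrows
   * the last IOException once the budget is used up.
   */
  static int connect(Connector connector, int totalMs, int unitMs,
                     LongSupplier clockMs, Sleeper sleeper)
      throws IOException, InterruptedException {
    // Assumption of this sketch: both timeouts must be positive.
    if (totalMs <= 0 || unitMs <= 0) {
      throw new IOException("Invalid timeout [total = " + totalMs
          + " ms, unit = " + unitMs + " ms]");
    }
    long start = clockMs.getAsLong();
    long remaining = totalMs;
    int attempts = 0;
    while (true) {
      long attemptStart = clockMs.getAsLong();
      // Never wait longer than what is left of the overall budget.
      int timeout = (int) Math.min(unitMs, remaining);
      attempts++;
      try {
        connector.connect(timeout);
        return attempts;
      } catch (IOException ioe) {
        long now = clockMs.getAsLong();
        // Time actually spent in this attempt (close to 0 for "connection refused").
        long spent = now - attemptStart;
        // Sleep for the unused part of the unit, capped by the remaining budget.
        long pause = Math.min(timeout - spent, totalMs - (now - start));
        if (pause > 0) {
          sleeper.sleep(pause);
        }
        // Charge measured time, not an assumed unit, against the budget.
        remaining = totalMs - (clockMs.getAsLong() - start);
        if (remaining <= 0) {
          throw ioe;
        }
      }
    }
  }
}
```

With a connector that always fails immediately, a total of 180000 ms and a unit of 60000 ms, this sketch makes three attempts spread over the full 180000 ms before rethrowing, whereas the loop quoted in the issue would give up after three back-to-back attempts.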



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)