You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mapreduce-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2012/07/26 03:04:35 UTC

[jira] [Created] (MAPREDUCE-4485) container-executor should deal with file descriptors better

Colin Patrick McCabe created MAPREDUCE-4485:
-----------------------------------------------

             Summary: container-executor should deal with file descriptors better
                 Key: MAPREDUCE-4485
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: nodemanager
            Reporter: Colin Patrick McCabe
            Priority: Minor


container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
    fclose(stdout);
  }
  if (ERRORFILE != stderr) {
    fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
    fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
    return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will contain to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better

Posted by "Andy Isaacson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422801#comment-13422801 ] 

Andy Isaacson commented on MAPREDUCE-4485:
------------------------------------------

"Luckily" there does not appear to be any way to get LOGFILE != stdout or ERRORFILE != stderr in container-executor.  So this is just a latent bug rather than a pending data corruption bug.
                
> container-executor should deal with file descriptors better
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
>     fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
>     fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
>     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
>     return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422821#comment-13422821 ] 

Colin Patrick McCabe commented on MAPREDUCE-4485:
-------------------------------------------------

@Todd: good catch.  I'll update the JIRA description.

@Andy: Yeah, it appears that LOGFILE and ERRORFILE can't be changed.  So it's a latent bug.
                
> container-executor should deal with file descriptors better
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
>     fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
>     fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
>     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
>     return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with file descriptors better

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated MAPREDUCE-4485:
--------------------------------------------

    Description: 
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
    fclose(stdout);
  }
  if (ERRORFILE != stderr) {
    fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
    fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
    return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

  was:
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
    fclose(stdout);
  }
  if (ERRORFILE != stderr) {
    fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
    fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
    return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will contain to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

    
> container-executor should deal with file descriptors better
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
>     fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
>     fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
>     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
>     return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better

Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422797#comment-13422797 ] 

Todd Lipcon commented on MAPREDUCE-4485:
----------------------------------------

When you fork, Java already takes care of closing all the open file descriptors. See UNIXProcess_md.c
                
> container-executor should deal with file descriptors better
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-4485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
>     fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
>     fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
>     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
>     return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with stdout, stderr better

Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Colin Patrick McCabe updated MAPREDUCE-4485:
--------------------------------------------

          Description: 
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
    fclose(stdout);
  }
  if (ERRORFILE != stderr) {
    fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
    fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
    return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at the moment.  However, this is a latent bug that should be fixed in case these are ever made configurable (which seems to have been the intent).

  was:
container-executor.c contains the following code:

{code}
  fclose(stdin);
  fflush(LOGFILE);
  if (LOGFILE != stdout) {
    fclose(stdout);
  }
  if (ERRORFILE != stderr) {
    fclose(stderr);
  }
  if (chdir(primary_app_dir) != 0) {
    fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
    return -1;
  }
  execvp(args[0], args);
{code}

Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!

The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.

Another thing we should be doing in container-executor.c is closing any file descriptors we don't need.  Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us.  These FDs will continue to be open even after the {{execve}}, unless we close them manually.  This could be both a resource leak and a security breach.

     Target Version/s: 2.0.1-alpha
    Affects Version/s: 2.0.1-alpha
              Summary: container-executor should deal with stdout, stderr better  (was: container-executor should deal with file descriptors better)
    
> container-executor should deal with stdout, stderr better
> ---------------------------------------------------------
>
>                 Key: MAPREDUCE-4485
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.0.1-alpha
>            Reporter: Colin Patrick McCabe
>            Priority: Minor
>
> container-executor.c contains the following code:
> {code}
>   fclose(stdin);
>   fflush(LOGFILE);
>   if (LOGFILE != stdout) {
>     fclose(stdout);
>   }
>   if (ERRORFILE != stderr) {
>     fclose(stderr);
>   }
>   if (chdir(primary_app_dir) != 0) {
>     fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
>     return -1;
>   }
>   execvp(args[0], args);
> {code}
> Whenever you open a new file descriptor, its number is the lowest available number.  So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1.  This means that any printf statements in the program will be now printing to /my/important/file.  Oops!
> The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null.  {{dup2}} can be used for this purpose.
> It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at the moment.  However, this is a latent bug that should be fixed in case these are ever made configurable (which seems to have been the intent).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira