Posted to mapreduce-issues@hadoop.apache.org by "Colin Patrick McCabe (JIRA)" <ji...@apache.org> on 2012/07/26 03:04:35 UTC
[jira] [Created] (MAPREDUCE-4485) container-executor should deal with file descriptors better
Colin Patrick McCabe created MAPREDUCE-4485:
-----------------------------------------------
Summary: container-executor should deal with file descriptors better
Key: MAPREDUCE-4485
URL: https://issues.apache.org/jira/browse/MAPREDUCE-4485
Project: Hadoop Map/Reduce
Issue Type: Bug
Components: nodemanager
Reporter: Colin Patrick McCabe
Priority: Minor
container-executor.c contains the following code:
{code}
fclose(stdin);
fflush(LOGFILE);
if (LOGFILE != stdout) {
  fclose(stdout);
}
if (ERRORFILE != stderr) {
  fclose(stderr);
}
if (chdir(primary_app_dir) != 0) {
  fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
  return -1;
}
execvp(args[0], args);
{code}
Whenever you open a new file descriptor, it is assigned the lowest available number. So if {{stdout}} (fd number 1) has been closed, and you call open("/my/important/file"), you'll get back file descriptor 1. This means that any printf statements in the program will now be printing to /my/important/file. Oops!
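The lowest-free-descriptor behavior is easy to demonstrate in a standalone sketch (the helper name here is illustrative, not anything in container-executor):

```c
#include <fcntl.h>
#include <unistd.h>

/* Close the given descriptor, then open /dev/null. POSIX guarantees
 * open() returns the lowest-numbered unused descriptor, so the number
 * we just freed is handed straight back to the new file. */
int lowest_fd_after_close(int fd) {
    close(fd);
    return open("/dev/null", O_WRONLY);
}
```

If fd 1 is the one closed, the very next open() returns 1, and all subsequent stdio output silently lands in the newly opened file.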
The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null. {{dup2}} can be used for this purpose.
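A minimal sketch of that approach (the function name is hypothetical, not the actual patch):

```c
#include <fcntl.h>
#include <unistd.h>

/* Point an existing descriptor at /dev/null rather than closing it,
 * so its number stays occupied and no later open() can land on it.
 * Returns 0 on success, -1 on failure. */
int point_fd_at_devnull(int fd) {
    int devnull = open("/dev/null", O_RDWR);
    if (devnull < 0) {
        return -1;
    }
    if (dup2(devnull, fd) < 0) {
        close(devnull);
        return -1;
    }
    if (devnull != fd) {
        close(devnull);  /* fd now refers to /dev/null on its own */
    }
    return 0;
}
```

container-executor could then call this for descriptors 0 through 2 before the execvp, in place of the fclose() calls.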
Another thing we should be doing in container-executor.c is closing any file descriptors we don't need. Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us. These FDs will continue to be open even after the {{execve}}, unless we close them manually. This could be both a resource leak and a security breach.
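One way to do such a sweep, sketched under the assumption that nothing above stderr is needed (the helper name is hypothetical):

```c
#include <unistd.h>

/* Close every descriptor at or above lowfd, bounded by the system's
 * per-process descriptor limit. Descriptors inherited across fork()
 * stay open across execve() unless marked close-on-exec or closed
 * explicitly, so a sweep like this would run before the exec. */
void close_fds_from(int lowfd) {
    long max_fd = sysconf(_SC_OPEN_MAX);
    if (max_fd < 0) {
        max_fd = 1024;  /* limit unknown; fall back to a common default */
    }
    for (int fd = lowfd; fd < max_fd; fd++) {
        close(fd);  /* closing an unopened fd just fails with EBADF */
    }
}
```

container-executor would call close_fds_from(3) before execvp; marking descriptors O_CLOEXEC as they are opened is the other common way to get the same effect.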
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better
Posted by "Andy Isaacson (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422801#comment-13422801 ]
Andy Isaacson commented on MAPREDUCE-4485:
------------------------------------------
"Luckily" there does not appear to be any way to get LOGFILE != stdout or ERRORFILE != stderr in container-executor. So this is just a latent bug rather than a pending data corruption bug.
[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better
Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422821#comment-13422821 ]
Colin Patrick McCabe commented on MAPREDUCE-4485:
-------------------------------------------------
@Todd: good catch. I'll update the JIRA description.
@Andy: Yeah, it appears that LOGFILE and ERRORFILE can't be changed. So it's a latent bug.
[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with file descriptors better
Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated MAPREDUCE-4485:
--------------------------------------------
Description:
container-executor.c contains the following code:
{code}
fclose(stdin);
fflush(LOGFILE);
if (LOGFILE != stdout) {
  fclose(stdout);
}
if (ERRORFILE != stderr) {
  fclose(stderr);
}
if (chdir(primary_app_dir) != 0) {
  fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
  return -1;
}
execvp(args[0], args);
{code}
Whenever you open a new file descriptor, it is assigned the lowest available number. So if {{stdout}} (fd number 1) has been closed, and you call open("/my/important/file"), you'll get back file descriptor 1. This means that any printf statements in the program will now be printing to /my/important/file. Oops!
The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null. {{dup2}} can be used for this purpose.
Another thing we should be doing in container-executor.c is closing any file descriptors we don't need. Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us. These FDs will continue to be open even after the {{execve}}, unless we close them manually. This could be both a resource leak and a security breach.
was:
container-executor.c contains the following code:
{code}
fclose(stdin);
fflush(LOGFILE);
if (LOGFILE != stdout) {
fclose(stdout);
}
if (ERRORFILE != stderr) {
fclose(stderr);
}
if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
}
execvp(args[0], args);
{code}
Whenever you open a new file descriptor, its number is the lowest available number. So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1. This means that any printf statements in the program will be now printing to /my/important/file. Oops!
The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null. {{dup2}} can be used for this purpose.
Another thing we should be doing in container-executor.c is closing any file descriptors we don't need. Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us. These FDs will contain to be open even after the {{execve}}, unless we close them manually. This could be both a resource leak and a security breach.
[jira] [Commented] (MAPREDUCE-4485) container-executor should deal with file descriptors better
Posted by "Todd Lipcon (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13422797#comment-13422797 ]
Todd Lipcon commented on MAPREDUCE-4485:
----------------------------------------
When you fork, Java already takes care of closing all the open file descriptors. See UNIXProcess_md.c
[jira] [Updated] (MAPREDUCE-4485) container-executor should deal with stdout, stderr better
Posted by "Colin Patrick McCabe (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAPREDUCE-4485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Colin Patrick McCabe updated MAPREDUCE-4485:
--------------------------------------------
Description:
container-executor.c contains the following code:
{code}
fclose(stdin);
fflush(LOGFILE);
if (LOGFILE != stdout) {
  fclose(stdout);
}
if (ERRORFILE != stderr) {
  fclose(stderr);
}
if (chdir(primary_app_dir) != 0) {
  fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
  return -1;
}
execvp(args[0], args);
{code}
Whenever you open a new file descriptor, it is assigned the lowest available number. So if {{stdout}} (fd number 1) has been closed, and you call open("/my/important/file"), you'll get back file descriptor 1. This means that any printf statements in the program will now be printing to /my/important/file. Oops!
The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null. {{dup2}} can be used for this purpose.
It looks like LOGFILE and ERRORFILE are always set to stdout and stderr at the moment. However, this is a latent bug that should be fixed in case these are ever made configurable (which seems to have been the intent).
was:
container-executor.c contains the following code:
{code}
fclose(stdin);
fflush(LOGFILE);
if (LOGFILE != stdout) {
fclose(stdout);
}
if (ERRORFILE != stderr) {
fclose(stderr);
}
if (chdir(primary_app_dir) != 0) {
fprintf(LOGFILE, "Failed to chdir to app dir - %s\n", strerror(errno));
return -1;
}
execvp(args[0], args);
{code}
Whenever you open a new file descriptor, its number is the lowest available number. So if {{stdout}} (fd number 1) has been closed, and you do open("/my/important/file"), you'll get assigned file descriptor 1. This means that any printf statements in the program will be now printing to /my/important/file. Oops!
The correct way to get rid of stdin, stdout, or stderr is not to close them, but to make them point to /dev/null. {{dup2}} can be used for this purpose.
Another thing we should be doing in container-executor.c is closing any file descriptors we don't need. Because container-executor was forked off of the JVM, any file that was open at the time the JVM called fork() will also be open for us. These FDs will continue to be open even after the {{execve}}, unless we close them manually. This could be both a resource leak and a security breach.
Target Version/s: 2.0.1-alpha
Affects Version/s: 2.0.1-alpha
Summary: container-executor should deal with stdout, stderr better (was: container-executor should deal with file descriptors better)