Posted to mapreduce-issues@hadoop.apache.org by "Ivan Mitic (JIRA)" <ji...@apache.org> on 2012/06/07 02:07:23 UTC

[jira] [Updated] (MAPREDUCE-4322) Fix command-line length abort issues on Windows

     [ https://issues.apache.org/jira/browse/MAPREDUCE-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ivan Mitic updated MAPREDUCE-4322:
----------------------------------

    Attachment: MAPREDUCE-4322-branch-1-win.patch

Attaching the patch.

The fix is to separate the classpath into an environment variable instead of passing it via "java -classpath".

In some test cases we've seen the command-line length go slightly above 8192 characters, which is the Windows command-line limit. About 4k of that is the classpath, and the rest is taken up by the other command-line arguments. By separating out the classpath we now have plenty of room for the other args.
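
As a rough illustration of the idea (not the code in the attached patch), the sketch below launches a child JVM with the classpath carried in the CLASSPATH environment variable, which the java launcher reads when no -classpath option is given. It uses ProcessBuilder rather than the batch file the tasktracker writes, and the class and method names (ClasspathEnvSketch, buildChild) are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class ClasspathEnvSketch {

    /**
     * Builds (but does not start) a child JVM launch whose classpath travels
     * in the environment instead of on the command line.
     * Hypothetical helper for illustration; not taken from the patch.
     */
    static ProcessBuilder buildChild(String classpath, String mainClass, List<String> args) {
        List<String> command = new ArrayList<>();
        command.add("java");      // note: no "-classpath <long value>" here
        command.add(mainClass);
        command.addAll(args);

        ProcessBuilder pb = new ProcessBuilder(command);
        // The java launcher falls back to CLASSPATH when -classpath/-cp is absent.
        pb.environment().put("CLASSPATH", classpath);
        return pb;
    }
}
```

With this shape the command line only holds the JVM options, the main class and its arguments, so its length no longer grows with the number of jars on the classpath.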

The patch also introduces a check on the command length before the command is executed, and surfaces a clear error message if the length exceeds the limit. Otherwise, we would only see that the child task exited with a non-zero exit code, and we would not have any context on the reason for the failure.
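
A minimal sketch of such a pre-execution check is shown below. It is an assumption about the general shape, not the patch itself: the names (CommandLengthCheck, checkLength) are hypothetical, the command is measured as its arguments joined by single spaces, and the limit constant uses the 8191-character maximum that Microsoft documents for cmd.exe, which is in line with the roughly 8192 figure mentioned above.

```java
import java.util.List;

public class CommandLengthCheck {

    // Assumed limit: cmd.exe accepts at most 8191 characters per command line.
    static final int WINDOWS_MAX_COMMAND_LENGTH = 8191;

    /**
     * Returns the length of the joined command, or throws with a descriptive
     * message if it would exceed the limit. Hypothetical helper for illustration.
     */
    static int checkLength(List<String> command) {
        int length = String.join(" ", command).length();
        if (length > WINDOWS_MAX_COMMAND_LENGTH) {
            throw new IllegalArgumentException(
                "Command line is " + length + " characters, which exceeds the Windows limit of "
                + WINDOWS_MAX_COMMAND_LENGTH + " characters; the task was not launched.");
        }
        return length;
    }
}
```

Failing before the launch means the error can name the actual cause, instead of leaving only the child's exit code to go on.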
                
> Fix command-line length abort issues on Windows
> -----------------------------------------------
>
>                 Key: MAPREDUCE-4322
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4322
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>         Environment: Windows, downstream applications with long aggregate classpaths
>            Reporter: John Gordon
>            Assignee: Ivan Mitic
>         Attachments: MAPREDUCE-4322-branch-1-win.patch
>
>   Original Estimate: 12h
>  Remaining Estimate: 12h
>
> When a task is started on the tasktracker, the tasktracker creates a small batch file to invoke Java and runs that batch file.  Within the batch file, the invocation of Java currently has -classpath ${CLASSPATH} inline in the command.  That line often exceeds 8000 characters.  This is OK for most Linux distributions because the line limit env variable is often set much higher than this.  However, on Windows this causes cmd to abort execution.  This surfaces in Hadoop as an unknown failure mode for the task.
> I think the easiest and most natural way to fix this is to push the -classpath option into a config file to take the longest variable part of the line and put it somewhere that scales better.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira