Posted to common-dev@hadoop.apache.org by "Sameer Paranjpye (JIRA)" <ji...@apache.org> on 2008/04/22 11:26:25 UTC

[jira] Issue Comment Edited: (HADOOP-3280) virtual address space limits break streaming apps

    [ https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591217#action_12591217 ] 

sameerp edited comment on HADOOP-3280 at 4/22/08 2:25 AM:
-------------------------------------------------------------------

The prologue setting, if implemented, should IMO apply in all task contexts: streaming, pipes, or plain old Java. I don't think that it should default to setting a memory limit based on -Xmx. The -Xmx switch can still be used in the JVM args, if needed, to specify a threshold that triggers garbage collection. It can, for example, be set to a value lower than a memory limit specified through a ulimit.

> PipeMapRed uses a shell post-HADOOP-2765 to set the ulimits, but doesn't need to assume a particular shell otherwise.

The ulimit was certainly introduced in HADOOP-2765, but the _TaskLog.captureOutAndErr_ method has existed and assumed bash for quite a while now. I believe the consensus at the time was that assuming bash was pretty benign. I'm not entirely sure about this; Owen or Devaraj likely know better.
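
To make the prologue idea concrete, here is a rough sketch of wrapping an arbitrary task command in a bash prologue that sets a virtual address space ulimit. It is only an illustration: the helper name wrap_with_ulimit, the limit values, and the "Task" class name are all made up for the example and are not taken from the Hadoop code or the attached patch.

```python
import shlex


def wrap_with_ulimit(cmd, vmem_kb):
    """Build a bash invocation that applies `ulimit -v` before the task.

    cmd      -- the task command as a list of arguments
    vmem_kb  -- virtual address space limit in kilobytes (ulimit -v units)

    Hypothetical helper for illustration; not part of Hadoop.
    """
    # Quote each argument so the shell sees it as a single word.
    quoted = " ".join(shlex.quote(arg) for arg in cmd)
    # `exec` replaces the shell with the task, so the limit applies to the
    # task process itself rather than to a lingering parent shell.
    return ["bash", "-c", "ulimit -v %d; exec %s" % (vmem_kb, quoted)]


# Example: a 512 MB Java heap (-Xmx) kept below a 1 GiB address space limit.
wrapped = wrap_with_ulimit(["java", "-Xmx512m", "Task"], 1024 * 1024)
print(wrapped)
```

The same wrapper would work unchanged for a streaming or pipes executable, which is the point of applying the prologue in all task contexts. Note that it assumes bash, just as the existing code does.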

> virtual address space limits break streaming apps
> -------------------------------------------------
>
>                 Key: HADOOP-3280
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3280
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Rick Cox
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3280_0_20080418.patch
>
>
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming apps based on the Java process's -Xmx setting.
> This makes it impossible to run a 64-bit streaming app that needs large address spaces under a 32-bit JVM, even if one is otherwise willing to dramatically increase the -Xmx setting without cause. Also, unlike Java's -Xmx limit, the virtual address space limit for an arbitrary UNIX process does not necessarily correspond to RAM usage, so it's likely to be a relatively difficult limit to configure.
> HADOOP-2765 was originally opened to allow an optional wrapper script around streaming tasks, one use case for which was setting a ulimit. That approach seems much less intrusive and more flexible than the final implementation. The ulimit can also be trivially set by the streaming task itself without any support from Hadoop.
> Marking this as a 0.17 blocker because it will break deployed apps and there is no workaround available.
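
As an illustration of the point that a streaming task can set the limit itself, here is a minimal sketch of a Python streaming mapper that lowers its own soft virtual address space limit (RLIMIT_AS) before processing input. The function names and the 2 GiB figure are assumptions chosen for the example, and the mapper is a trivial identity mapper; none of this comes from the patch.

```python
import resource
import sys


def clamp_soft_limit(requested, hard):
    """Return the soft limit to request, never above the current hard limit."""
    if hard == resource.RLIM_INFINITY:
        return requested
    return min(requested, hard)


def limit_address_space(max_bytes):
    """Lower this process's soft virtual address space limit (RLIMIT_AS)."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_AS)
    new_soft = clamp_soft_limit(max_bytes, hard)
    # Only the soft limit changes; the hard limit is left as it was.
    resource.setrlimit(resource.RLIMIT_AS, (new_soft, hard))
    return new_soft


def main():
    # 2 GiB is an arbitrary illustrative value, not a recommendation.
    limit_address_space(2 * 1024 ** 3)
    # Identity mapper: copy stdin to stdout, as a streaming task would.
    for line in sys.stdin:
        sys.stdout.write(line)


if __name__ == "__main__":
    main()
```

Because the task chooses the value, a job that needs a large address space can simply pick a larger limit or skip the call, independent of the JVM's -Xmx setting.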
