Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/03/02 19:09:00 UTC

[jira] [Commented] (FLINK-8826) In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored

    [ https://issues.apache.org/jira/browse/FLINK-8826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16384000#comment-16384000 ] 

ASF GitHub Bot commented on FLINK-8826:
---------------------------------------

GitHub user tillrohrmann opened a pull request:

    https://github.com/apache/flink/pull/5625

    [FLINK-8826] [flip6] Start Yarn TaskExecutor with proper slots and memory

    ## What is the purpose of the change
    
    Read the default TaskManager memory and number of slots from the configuration
    when the YarnResourceManager is started.
    
    ## Verifying this change
    
    - Added `YarnConfigurationITCase`
    
    ## Does this pull request potentially affect one of the following parts:
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
      - The serializers: (no)
      - The runtime per-record code paths (performance sensitive): (no)
      - Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes)
      - The S3 file system connector: (no)
    
    ## Documentation
    
      - Does this pull request introduce a new feature? (no)
      - If yes, how is the feature documented? (not applicable)


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tillrohrmann/flink fixCheckpointCoordinator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/5625.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #5625
    
----
commit ef080927aba095a72929cd8be390b46afa4dcab8
Author: Till Rohrmann <tr...@...>
Date:   2018-03-02T14:27:13Z

    [FLINK-8840] [yarn] Pull YarnClient and YarnConfiguration instantiation out of AbstractYarnClusterClient
    
    For better testability, this commit moves the YarnClient and YarnConfiguration out of
    the AbstractYarnClusterDescriptor.

commit 173e272c1a0250ecdc4a4f975a2f7991b9dd53c1
Author: Till Rohrmann <tr...@...>
Date:   2018-03-01T19:09:55Z

    [hotfix] [flip6] Harden JobMaster#triggerSavepoint
    
    Check first whether the CheckpointCoordinator has been set before triggering
    a savepoint. If it has not been set, then return a failure message.

commit 2d2a6d5a84a586bf8ef59656aa7754e94b6e034b
Author: Till Rohrmann <tr...@...>
Date:   2018-03-01T22:35:25Z

    [FLINK-8826] [flip6] Start Yarn TaskExecutor with proper slots and memory
    
    Read the default TaskManager memory and number of slots from the configuration
    when the YarnResourceManager is started.

commit d8f3cfa0c89f465ad995ebab7470f37b37b18678
Author: Till Rohrmann <tr...@...>
Date:   2018-03-02T11:18:05Z

    [hotfix] Set default number of TaskManagers in FlinkYarnSessionCli for Flip6

commit 0475dc343f0fa703bcf82a585482bdd15ae168ea
Author: Till Rohrmann <tr...@...>
Date:   2018-03-02T11:42:43Z

    [hotfix] Print correct web monitor URL in FlinkYarnSessionCli

----


> In Flip6 mode, when starting yarn cluster, configured taskmanager.heap.mb is ignored
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-8826
>                 URL: https://issues.apache.org/jira/browse/FLINK-8826
>             Project: Flink
>          Issue Type: Bug
>          Components: ResourceManager, YARN
>    Affects Versions: 1.5.0
>            Reporter: Piotr Nowojski
>            Assignee: Till Rohrmann
>            Priority: Blocker
>
> When I tried running a job on the cluster, despite setting
> taskmanager.heap.mb: 3072
> taskmanager.network.memory.fraction: 0.4
> and the console reporting
> {code:java}
> Cluster specification: ClusterSpecification{masterMemoryMB=768, taskManagerMemoryMB=3072, numberTaskManagers=92, slotsPerTaskManager=1}{code}
> The actual settings were:
> {noformat}
>  
> 2018-03-01 14:53:18,918 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  - --------------------------------------------------------------------------------
> 2018-03-01 14:53:18,921 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  Starting YARN TaskExecutor runner (Version: 1.5-SNAPSHOT, Rev:e92eb39, Date:28.02.2018 @ 17:43:39 UTC)
> 2018-03-01 14:53:18,921 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  OS current user: yarn
> 2018-03-01 14:53:19,780 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  Current Hadoop/Kerberos user: hadoop
> 2018-03-01 14:53:19,781 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b14
> 2018-03-01 14:53:19,781 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  Maximum heap size: 245 MiBytes
> 2018-03-01 14:53:19,781 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  JAVA_HOME: /usr/lib/jvm/java-openjdk
> 2018-03-01 14:53:19,783 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  Hadoop version: 2.4.1
> 2018-03-01 14:53:19,783 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  JVM Options:
> 2018-03-01 14:53:19,783 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -Xms255m
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -Xmx255m
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -XX:MaxDirectMemorySize=769m
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -Dlog.file=/var/log/hadoop-yarn/containers/application_1516373731080_1150/container_1516373731080_1150_01_000105/taskmanager.log
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -Dlogback.configurationFile=file:./logback.xml
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     -Dlog4j.configuration=file:./log4j.properties
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -  Program Arguments:
> 2018-03-01 14:53:19,784 INFO  org.apache.flink.yarn.YarnTaskExecutorRunner                  -     --configDir{noformat}
> The heap was set to 255 MB, whereas with the default cutoffs it should be 1383 MB. The 255 MB value seems to come from the default taskmanager.heap.mb value of 1024.
> When starting in non-Flip6 mode, everything works as expected:
> {noformat}
>  
> 2018-03-01 14:04:49,650 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            - --------------------------------------------------------------------------------
> 2018-03-01 14:04:49,700 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  Starting YARN TaskManager (Version: 1.5-SNAPSHOT, Rev:e92eb39, Date:28.02.2018 @ 17:43:39 UTC)
> 2018-03-01 14:04:49,700 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  OS current user: yarn
> 2018-03-01 14:04:53,277 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  Current Hadoop/Kerberos user: hadoop
> 2018-03-01 14:04:53,278 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  JVM: OpenJDK 64-Bit Server VM - Oracle Corporation - 1.8/25.161-b14
> 2018-03-01 14:04:53,279 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  Maximum heap size: 1326 MiBytes
> 2018-03-01 14:04:53,279 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  JAVA_HOME: /usr/lib/jvm/java-openjdk
> 2018-03-01 14:04:53,282 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  Hadoop version: 2.4.1
> 2018-03-01 14:04:53,284 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  JVM Options:
> 2018-03-01 14:04:53,284 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -Xms1383m
> 2018-03-01 14:04:53,284 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -Xmx1383m
> 2018-03-01 14:04:53,284 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -XX:MaxDirectMemorySize=1689m
> 2018-03-01 14:04:53,284 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -Dlog.file=/var/log/hadoop-yarn/containers/application_1516373731080_1138/container_1516373731080_1138_01_000063/taskmanager.log
> 2018-03-01 14:04:53,285 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -Dlogback.configurationFile=file:./logback.xml
> 2018-03-01 14:04:53,286 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     -Dlog4j.configuration=file:./log4j.properties
> 2018-03-01 14:04:53,287 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -  Program Arguments:
> 2018-03-01 14:04:53,287 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     --configDir
> 2018-03-01 14:04:53,287 INFO  org.apache.flink.yarn.YarnTaskManagerRunnerFactory            -     .{noformat}
>  
> CC [~till.rohrmann]
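
The heap figures quoted in the issue can be reproduced with a small calculation. The following is a minimal sketch, not Flink's actual implementation: the function name is hypothetical, and the defaults (a 0.25 container cutoff ratio with a 600 MB minimum, and network memory clamped between 64 MB and 1024 MB) are assumptions chosen because they agree with the -Xmx and -XX:MaxDirectMemorySize values in both logs above.

```python
def yarn_taskmanager_memory(container_mb,
                            network_fraction,
                            cutoff_ratio=0.25,     # assumed default cutoff ratio
                            cutoff_min_mb=600,     # assumed minimum cutoff
                            network_min_mb=64,     # assumed network memory floor
                            network_max_mb=1024):  # assumed network memory ceiling
    """Hypothetical helper: derive (heap_mb, max_direct_mb) from a container size."""
    # Memory reserved outside the JVM heap for the container itself.
    cutoff = max(cutoff_min_mb, int(container_mb * cutoff_ratio))
    remaining = container_mb - cutoff
    # Network buffers take a clamped fraction of what remains after the cutoff.
    network = min(network_max_mb, max(network_min_mb, int(remaining * network_fraction)))
    heap = remaining - network
    # Direct memory covers the cutoff plus the network buffers.
    max_direct = cutoff + network
    return heap, max_direct


# Configured container size (taskmanager.heap.mb = 3072), as in the non-Flip6 log.
print(yarn_taskmanager_memory(3072, network_fraction=0.4))
# Default container size (1024), as in the Flip6 log.
print(yarn_taskmanager_memory(1024, network_fraction=0.4))
```

Under these assumptions the first call yields a 1383 MB heap with 1689 MB of direct memory, and the second yields 255 MB and 769 MB, which is consistent with the reporter's reading that the Flip6 TaskExecutor was started from the default of 1024 rather than the configured 3072.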


