Posted to issues@flink.apache.org by greghogan <gi...@git.apache.org> on 2017/02/01 17:25:27 UTC

[GitHub] flink pull request #3249: [FLINK-3163] [scripts] Configure Flink for NUMA sy...

GitHub user greghogan opened a pull request:

    https://github.com/apache/flink/pull/3249

    [FLINK-3163] [scripts] Configure Flink for NUMA systems

    Start a TaskManager on each NUMA node on each worker when the new configuration option 'taskmanager.compute.numa' is enabled.
    
    This does not affect the runtime process for the JobManager (or future ResourceManager) as the startup scripts do not provide a simple means of disambiguating masters and slaves. I expect most large clusters to run these master processes on separate machines, and for small clusters the JobManager can run alongside a TaskManager.
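
As a rough illustration of the "one TaskManager per NUMA node" idea, the sketch below prints one `numactl`-wrapped launch command per node. It is not the code from this pull request: the function name `numa_launch_commands` is hypothetical, the node count is passed in rather than detected, and node IDs are assumed to be contiguous starting at 0. The launch command in the example call is only illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: print one numactl-wrapped launch command per NUMA node.
# Usage: numa_launch_commands NODE_COUNT COMMAND [ARGS...]
# Assumption: node IDs are 0 .. NODE_COUNT-1.
numa_launch_commands() {
    local node_count=$1
    shift
    local node=0
    while [ "$node" -lt "$node_count" ]; do
        # --cpunodebind restricts execution to the node's CPUs;
        # --membind restricts memory allocation to the node's memory.
        echo "numactl --cpunodebind=${node} --membind=${node} -- $*"
        node=$((node + 1))
    done
}

# Prints the commands for a two-node machine; it does not run them.
numa_launch_commands 2 ./bin/taskmanager.sh start
```

Printing rather than executing keeps the sketch safe to run on machines without `numactl`; a real startup script would run each command instead of echoing it.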

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/greghogan/flink 3163_configure_flink_for_numa_systems

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/3249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3249
    
----
commit 57767e67dc7306d18df07d5224c81a8d359df620
Author: Greg Hogan <co...@greghogan.com>
Date:   2017-02-01T17:13:49Z

    [FLINK-3163] [scripts] Configure Flink for NUMA systems
    
    Start a TaskManager on each NUMA node on each worker when the new
    configuration option 'taskmanager.compute.numa' is enabled.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request #3249: [FLINK-3163] [scripts] Configure Flink for NUMA sy...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/3249



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    I think this is a great idea. Can we also get this integrated with the Yarn / Mesos / Docker setup scripts and code? Keeping all these different deployment options on par would be nice.
    
    Minor comment: I think you can also name the parameter `taskmanager.numa`, rather than `taskmanager.compute.numa`, unless we plan to have further options under the `taskmanager.compute.` namespace.



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    Added a note specifying that NUMA support applies to standalone mode only.
    
    This is a much harder feature to support in a multi-application environment, which is likely why none of these cluster managers have added support.



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    @StephanEwen thanks for the review. I'll verify, test, and merge.



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by greghogan <gi...@git.apache.org>.
Github user greghogan commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    @StephanEwen from the discussion of FLINK-3163 I also had the idea of `taskmanager.compute.fraction`, where the number of slots would be a multiple of the number of cores / vcores. Since Flink processes these keys as opaque strings, the only purpose of the namespace is to help organize the [config page](https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html).
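
To make the `taskmanager.compute.fraction` idea concrete, here is a minimal arithmetic sketch. The option was only proposed in this thread, so the semantics are an assumption: the slot count is taken as cores times the fraction, rounded down, with a floor of one slot. The function name `slots_for_fraction` is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: derive a slot count from a core count and a multiplier.
# Assumed semantics: slots = floor(cores * fraction), but never less than 1.
slots_for_fraction() {
    local cores=$1
    local fraction=$2
    awk -v c="$cores" -v f="$fraction" \
        'BEGIN { s = int(c * f); if (s < 1) s = 1; print s }'
}

# Half as many slots as cores on an 8-core worker.
slots_for_fraction 8 0.5
```

On Linux the core count could be taken from `nproc`, for example `slots_for_fraction "$(nproc)" 0.5`.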
    
    I have found YARN-5764, MESOS-5342, and MESOS-314 discussing NUMA support for containers, but all are works in progress. I see that Docker supports `--cpuset-cpus` and `--cpuset-mems` in [docker run](https://docs.docker.com/engine/reference/run/) and in [docker compose](https://docs.docker.com/compose/compose-file) config version 2 (using `cpuset`). It's not clear how to dynamically bind Flink to NUMA nodes without scripting Flink's docker commands.
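
One way the "scripting Flink's docker commands" approach might look is sketched below: a helper that reads a node's CPU list from Linux sysfs and emits the matching `--cpuset-cpus` and `--cpuset-mems` flags. This is an illustration under assumptions, not part of the pull request. The function name `docker_numa_flags` and the `NUMA_SYSFS_ROOT` override are hypothetical, and the sysfs path applies to Linux hosts only.

```shell
#!/usr/bin/env bash
# Sketch only: derive docker cpuset flags for one NUMA node from Linux sysfs.
# NUMA_SYSFS_ROOT is a hypothetical override that makes the function testable.
docker_numa_flags() {
    local node=$1
    local root="${NUMA_SYSFS_ROOT:-/sys/devices/system/node}"
    local cpus
    # cpulist holds the node's CPUs in range form, e.g. "0-7".
    cpus=$(cat "${root}/node${node}/cpulist") || return 1
    echo "--cpuset-cpus=${cpus} --cpuset-mems=${node}"
}

# Intended use (the image name is a placeholder):
#   docker run $(docker_numa_flags 0) <image>
```

The function returns a non-zero status when the node directory does not exist, so a wrapper script can stop before starting an unpinned container.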



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    Looks good.
    
    +1 from my side!



[GitHub] flink issue #3249: [FLINK-3163] [scripts] Configure Flink for NUMA systems

Posted by StephanEwen <gi...@git.apache.org>.
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/3249
  
    We can have this in YARN at least as well, because YARN starts its TaskManagers in each container via a bash command. We can also merge this one first, but then it would be good to add to the docs that this applies only to standalone mode at the moment.
    


