Posted to issues@flink.apache.org by NicoK <gi...@git.apache.org> on 2017/08/04 14:00:45 UTC
[GitHub] flink pull request #4481: [FLINK-7316][network] always use off-heap network ...
GitHub user NicoK opened a pull request:
https://github.com/apache/flink/pull/4481
[FLINK-7316][network] always use off-heap network buffers
## What is the purpose of the change
Currently, network buffers are on-heap or off-heap depending on Flink's memory settings. As a step towards passing our own (off-heap) buffers through netty to avoid unnecessary buffer copies, we make network buffers always off-heap.
## Brief change log
- always use off-heap buffers for the `NetworkBufferPool`
- move `memoryType` from `NetworkEnvironmentConfiguration` to `TaskManagerServicesConfiguration`
- adapt heap size calculations in bash scripts and Java source code
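
To illustrate what "always off-heap" means for a buffer pool, here is a minimal sketch using plain `java.nio` direct buffers. It is not Flink's `NetworkBufferPool`: the class name `DirectBufferPoolSketch` and its methods are hypothetical and only show that every segment is allocated outside the Java heap, which is why the heap size calculations have to be adapted.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;

/** Hypothetical fixed-size pool of off-heap segments; NOT Flink's NetworkBufferPool. */
final class DirectBufferPoolSketch {

    private final ArrayDeque<ByteBuffer> available = new ArrayDeque<>();
    private final int segmentSize;

    DirectBufferPoolSketch(int numberOfSegments, int segmentSize) {
        this.segmentSize = segmentSize;
        for (int i = 0; i < numberOfSegments; i++) {
            // allocateDirect reserves memory outside the Java heap; this memory
            // counts against the JVM's direct memory limit, not against -Xmx.
            available.add(ByteBuffer.allocateDirect(segmentSize));
        }
    }

    /** Returns a free segment, or null if the pool is exhausted. */
    ByteBuffer request() {
        return available.poll();
    }

    /** Hands a segment back to the pool; only segments of this pool's kind are accepted. */
    void recycle(ByteBuffer segment) {
        if (!segment.isDirect() || segment.capacity() != segmentSize) {
            throw new IllegalArgumentException("not a segment of this pool");
        }
        segment.clear(); // reset position and limit; the contents are not zeroed
        available.add(segment);
    }

    int availableBuffers() {
        return available.size();
    }
}
```

With such a pool there is no on-heap code path left to configure, so a single set of tests covers the only remaining option.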
## Verifying this change
This change is already covered by existing tests, such as: `TaskManagerServicesTest` for the heap size calculations; tests under `flink/runtime/io/network` for most other aspects of the direct use of network buffers, especially `flink/runtime/io/network/buffer`; all integration tests with a full stack and non-local communication.
This actually increases the test coverage: most network buffer tests so far only covered on-heap buffers, which no longer exist. These tests now cover the only remaining option: off-heap network buffers.
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): (no)
- The public API, i.e., is any changed class annotated with `@Public(Evolving)`: (no)
- The serializers: (no)
- The runtime per-record code paths (performance sensitive): (yes - as in the network communication part)
- Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Yarn/Mesos, ZooKeeper: (yes - memory settings, but they effectively do not change except for the network buffers being off-heap now)
## Documentation
- Does this pull request introduce a new feature? (no)
- If yes, how is the feature documented? (docs, JavaDocs)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/NicoK/flink flink-7316
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/4481.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4481
----
commit d87206435cabf3bf29560083b639077b103708b8
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-07-31T10:06:14Z
[hotfix] fix some typos
commit 4a46e615f38c3dde41d465ec3e26426dbf14df80
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-02T09:34:54Z
[FLINK-7310][core] always use the HybridMemorySegment
Since we'd like to use our own off-heap buffers for network communication, we
cannot use HeapMemorySegment anymore and need to rely on HybridMemorySegment.
We thus drop any code that loads the HeapMemorySegment (it is still available
if needed) in favour of the HybridMemorySegment which is able to work on both
heap and off-heap memory.
For the performance penalty of this change compared to using HeapMemorySegment
alone, see this interesting blog article (from 2015):
https://flink.apache.org/news/2015/09/16/off-heap-memory.html
commit 1d838c82cef412a8ec143308e20a4d0d7882f3e8
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-02T09:35:16Z
[hotfix][tests] add missing test descriptions
commit 70b0985a62082766498e847f7a4f25e84b6c1f06
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-02T09:27:49Z
[hotfix][core] add additional final methods in final classes
This applies the scheme of HeapMemorySegment to HybridMemorySegment where core
methods are also marked "final" to be more future-proof.
commit bedf14708b7aba88761f05c19abaf7f26d16dd20
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:15:32Z
[FLINK-7312][checkstyle] remove trailing whitespace
commit 67e37971a4f8d5c40290e7a9c8ae2e6a2e1deb68
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:20:28Z
[FLINK-7312][checkstyle] organise imports
commit d8657c8f02ca16af1f9e08621a8f73adb5d26959
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:24:16Z
[FLINK-7312][checkstyle] add, adapt and improve comments
commit 654b599569e3a3e5ac063d253311b67652a33c1d
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:26:40Z
[FLINK-7312][checkstyle] remove redundant "public" keyword in interfaces
commit 7117de1adeefd624ae958370d9614162d18bd9ed
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:27:36Z
[FLINK-7312][checkstyle] ignore some spurious warnings
commit dd150af85551f64c7f3a260a013d59b7d773f94a
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:35:15Z
[FLINK-7312][checkstyle] enable checkstyle for `flink/core/memory/*`
We deliberately ignore redundant modifiers for now since we want `final`
modifiers on `final` classes for increased future-proofness.
commit adbfea59e0618c0212a820d724d677b97955c3c6
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-01T11:24:00Z
[FLINK-7316][network] always use off-heap network buffers
This is another step towards using our own (off-heap) buffers for network
communication that we pass through netty in order to avoid unnecessary buffer
copies.
commit 139fdfc166ee465ce61d8f582cbbbf8c890ecccc
Author: Nico Kruber <ni...@data-artisans.com>
Date: 2017-08-04T13:59:48Z
[FLINK-7316][docs] add a note of network buffers always being off-heap
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:
https://github.com/apache/flink/pull/4481
By cherry-picking the commits from #4506, plus some fixes for code that was previously changed incorrectly, the failing yarn tests should now be fixed.
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/4481
LGTM 👍
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:
https://github.com/apache/flink/pull/4481
FYI: I just rebased this PR onto current `master` to make it mergeable and to support further extensions.
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:
https://github.com/apache/flink/pull/4481
actually, I need to fix the test failures in `ContaineredTaskManagerParametersTest` and some failure in the `flink-yarn-tests` first...
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on the issue:
https://github.com/apache/flink/pull/4481
Will merge this now.
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:
https://github.com/apache/flink/pull/4481
ok, one test fixed, the other is not so simple but maybe @tillrohrmann can help with it:
Inside `ContaineredTaskManagerParameters#create()`, we calculate the amount of off-heap space that we need and, for yarn, we use exactly this amount for setting the `-XX:MaxDirectMemorySize` JVM property without leaving room for other components and libraries. This worked so far for the network buffers when memory as a whole was set to off-/on-heap and the flink-reserved memory was not completely used. Now, however, if set to on-heap, the `-XX:MaxDirectMemorySize` limit is too tight. I'm unsure about the solutions:
1) remove setting `-XX:MaxDirectMemorySize` and let the JVM adjust automatically, or
2) add some "sane" default to our off-heap usage?
The same may apply to Mesos if `ResourceProfile(cpuCores, heapMemoryInMB, directMemoryInMB, nativeMemoryInMB)` is used. At the moment, only the other constructors are used, which amounts to solution 1.
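
For illustration only, option 2 could look roughly like the sketch below. It is not the code in `ContaineredTaskManagerParameters`: the class `DirectMemoryBudgetSketch`, its method names and the headroom parameter are all hypothetical, and what a "sane" headroom would be is exactly the open question.

```java
/** Hypothetical helper showing option 2: direct memory limit = known usage + headroom. */
final class DirectMemoryBudgetSketch {

    private static final long MB = 1024L * 1024L;

    /**
     * Sums the off-heap memory Flink knows it will use (network buffers, optional
     * off-heap managed memory) and adds headroom for other components and libraries.
     * The headroom value is an assumption; the thread does not settle on one.
     */
    static long maxDirectMemoryBytes(long networkBufferBytes, long managedOffHeapBytes, long headroomBytes) {
        if (networkBufferBytes < 0 || managedOffHeapBytes < 0 || headroomBytes < 0) {
            throw new IllegalArgumentException("memory sizes must be non-negative");
        }
        // addExact throws ArithmeticException instead of silently overflowing
        return Math.addExact(Math.addExact(networkBufferBytes, managedOffHeapBytes), headroomBytes);
    }

    /** Formats the JVM option, rounding up to whole megabytes so the limit is never too small. */
    static String jvmOption(long maxDirectMemoryBytes) {
        long megabytes = (maxDirectMemoryBytes + MB - 1) / MB;
        return "-XX:MaxDirectMemorySize=" + megabytes + "m";
    }
}
```

With on-heap managed memory, the second argument would be zero, so the limit is made up of the network buffers plus the headroom only. Option 1 corresponds to not emitting this JVM option at all.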
---
[GitHub] flink issue #4481: [FLINK-7316][network] always use off-heap network buffers
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on the issue:
https://github.com/apache/flink/pull/4481
rebased again - this should be good to go. @StefanRRichter can you continue with this?
---
[GitHub] flink pull request #4481: [FLINK-7316][network] always use off-heap network ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/4481#discussion_r131673257
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/NetworkBufferPool.java ---
@@ -274,7 +260,7 @@ private void redistributeBuffers() throws IOException {
return;
}
- /**
--- End diff --
Ok, then just leave it :D
---
[GitHub] flink pull request #4481: [FLINK-7316][network] always use off-heap network ...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/4481
---
[GitHub] flink pull request #4481: [FLINK-7316][network] always use off-heap network ...
Posted by StefanRRichter <gi...@git.apache.org>.
Github user StefanRRichter commented on a diff in the pull request:
https://github.com/apache/flink/pull/4481#discussion_r131648828
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/NetworkBufferPool.java ---
@@ -274,7 +260,7 @@ private void redistributeBuffers() throws IOException {
return;
}
- /**
--- End diff --
Revert please.
---
[GitHub] flink pull request #4481: [FLINK-7316][network] always use off-heap network ...
Posted by NicoK <gi...@git.apache.org>.
Github user NicoK commented on a diff in the pull request:
https://github.com/apache/flink/pull/4481#discussion_r131672655
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/io/network/buffer/NetworkBufferPool.java ---
@@ -274,7 +260,7 @@ private void redistributeBuffers() throws IOException {
return;
}
- /**
--- End diff --
Actually, this was intentional since this was marked as a "dangling javadoc" by IntelliJ (it is just a longer inline-comment). I don't have too strong feelings about it though so we could either keep it or revert it.
---