You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Chris Thomson (Jira)" <ji...@apache.org> on 2022/11/28 22:06:00 UTC
[jira] [Created] (FLINK-30232) shading of netty epoll shared library does not account for ARM64 platform
Chris Thomson created FLINK-30232:
-------------------------------------
Summary: shading of netty epoll shared library does not account for ARM64 platform
Key: FLINK-30232
URL: https://issues.apache.org/jira/browse/FLINK-30232
Project: Flink
Issue Type: Bug
Components: BuildSystem / Shaded
Affects Versions: 1.15.2
Environment: Kubernetes 1.23 provided by AWS managed Kubernetes service (EKS) with Graviton 2 based EC2 instances (ARM64) using Flink 1.15.2, native epoll enabled (taskmanager.network.netty.transport: epoll)
Reporter: Chris Thomson
While evaluating migration of Flink application to Graviton 2 based EC2 instances in a AWS managed Kubernetes service (EKS) using Kubernetes 1.23, found that the shaded Netty library renames the AMD64 version of the shared library as part of relocation of the Netty library but does not rename the matching ARM64 shared library. This results in the following error when `taskmanager.network.netty.transport: epoll` is used:
{{Suppressed: java.lang.UnsatisfiedLinkError: no org_apache_flink_shaded_netty4_netty_transport_native_epoll in java.library.path}}
{{at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860) ~[?:1.8.0_352]}}
{{at java.lang.Runtime.loadLibrary0(Runtime.java:843) ~[?:1.8.0_352]}}
{{at java.lang.System.loadLibrary(System.java:1136) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_352]}}
{{at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_352]}}
{{at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_352]}}
{{at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:335) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:327) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:293) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:136) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:309) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.<clinit>(Native.java:85) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.<clinit>(EpollEventLoop.java:51) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:185) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:36) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:60) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:49) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:113) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:100) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:77) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyClient.initEpollBootstrap(NettyClient.java:164) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyClient.init(NettyClient.java:79) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.start(NettyConnectionManager.java:87) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.NettyShuffleEnvironment.start(NettyShuffleEnvironment.java:329) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerServices.fromConfiguration(TaskManagerServices.java:293) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:623) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.createTaskExecutorService(TaskManagerRunner.java:559) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:245) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerProcessSecurely$5(TaskManagerRunner.java:525) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:525) [flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:505) [flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner.main(KubernetesTaskExecutorRunner.java:39) [flink-dist-1.15.2.jar:1.15.2]}}
{{Caused by: java.io.FileNotFoundException: META-INF/native/liborg_apache_flink_shaded_netty4_netty_transport_native_epoll_aarch_64.so}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:170) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:306) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.<clinit>(Native.java:85) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{... 25 more}}
[https://github.com/apache/flink-shaded/blob/3082afc952e68366e9fefe4d1181c4666969ee67/flink-shaded-netty-4/pom.xml#L97] appears to be where the problem is, it only renames the x86_64 shared library, it doesn’t account for aarch_64 shared library.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)