You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by "Chris Thomson (Jira)" <ji...@apache.org> on 2022/11/28 22:06:00 UTC

[jira] [Created] (FLINK-30232) shading of netty epoll shared library does not account for ARM64 platform

Chris Thomson created FLINK-30232:
-------------------------------------

             Summary: shading of netty epoll shared library does not account for ARM64 platform
                 Key: FLINK-30232
                 URL: https://issues.apache.org/jira/browse/FLINK-30232
             Project: Flink
          Issue Type: Bug
          Components: BuildSystem / Shaded
    Affects Versions: 1.15.2
         Environment: Kubernetes 1.23 provided by AWS managed Kubernetes service (EKS) with Graviton 2 based EC2 instances (ARM64) using Flink 1.15.2, native epoll enabled (taskmanager.network.netty.transport: epoll)
            Reporter: Chris Thomson


While evaluating migration of Flink application to Graviton 2 based EC2 instances in a AWS managed Kubernetes service (EKS) using Kubernetes 1.23, found that the shaded Netty library renames the AMD64 version of the shared library as part of relocation of the Netty library but does not rename the matching ARM64 shared library. This results in the following error when `taskmanager.network.netty.transport: epoll` is used:



 

 

{{Suppressed: java.lang.UnsatisfiedLinkError: no org_apache_flink_shaded_netty4_netty_transport_native_epoll in java.library.path}}
{{at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1860) ~[?:1.8.0_352]}}
{{at java.lang.Runtime.loadLibrary0(Runtime.java:843) ~[?:1.8.0_352]}}
{{at java.lang.System.loadLibrary(System.java:1136) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryUtil.loadLibrary(NativeLibraryUtil.java:38) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_352]}}
{{at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_352]}}
{{at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_352]}}
{{at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader$1.run(NativeLibraryLoader.java:335) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_352]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibraryByHelper(NativeLibraryLoader.java:327) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.loadLibrary(NativeLibraryLoader.java:293) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:136) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:309) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.<clinit>(Native.java:85) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoop.<clinit>(EpollEventLoop.java:51) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:185) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.newChild(EpollEventLoopGroup.java:36) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:84) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:60) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.util.concurrent.MultithreadEventExecutorGroup.<init>(MultithreadEventExecutorGroup.java:49) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.MultithreadEventLoopGroup.<init>(MultithreadEventLoopGroup.java:59) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:113) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:100) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.EpollEventLoopGroup.<init>(EpollEventLoopGroup.java:77) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyClient.initEpollBootstrap(NettyClient.java:164) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyClient.init(NettyClient.java:79) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.netty.NettyConnectionManager.start(NettyConnectionManager.java:87) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.io.network.NettyShuffleEnvironment.start(NettyShuffleEnvironment.java:329) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerServices.fromConfiguration(TaskManagerServices.java:293) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManager(TaskManagerRunner.java:623) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.createTaskExecutorService(TaskManagerRunner.java:559) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.startTaskManagerRunnerServices(TaskManagerRunner.java:245) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.start(TaskManagerRunner.java:288) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.lambda$runTaskManagerProcessSecurely$5(TaskManagerRunner.java:525) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.security.contexts.NoOpSecurityContext.runSecured(NoOpSecurityContext.java:28) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:525) [flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManagerProcessSecurely(TaskManagerRunner.java:505) [flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.kubernetes.taskmanager.KubernetesTaskExecutorRunner.main(KubernetesTaskExecutorRunner.java:39) [flink-dist-1.15.2.jar:1.15.2]}}
{{Caused by: java.io.FileNotFoundException: META-INF/native/liborg_apache_flink_shaded_netty4_netty_transport_native_epoll_aarch_64.so}}
{{at org.apache.flink.shaded.netty4.io.netty.util.internal.NativeLibraryLoader.load(NativeLibraryLoader.java:170) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.loadNativeLibrary(Native.java:306) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Native.<clinit>(Native.java:85) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{at org.apache.flink.shaded.netty4.io.netty.channel.epoll.Epoll.<clinit>(Epoll.java:40) ~[flink-dist-1.15.2.jar:1.15.2]}}
{{... 25 more}}

 

[https://github.com/apache/flink-shaded/blob/3082afc952e68366e9fefe4d1181c4666969ee67/flink-shaded-netty-4/pom.xml#L97] appears to be where the problem is, it only renames the x86_64 shared library, it doesn’t account for aarch_64 shared library.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)