Posted to issues@flink.apache.org by "Gyula Fora (Jira)" <ji...@apache.org> on 2022/10/07 14:31:00 UTC

[jira] [Closed] (FLINK-29535) Flink Operator Certificate renew issue

     [ https://issues.apache.org/jira/browse/FLINK-29535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gyula Fora closed FLINK-29535.
------------------------------
    Resolution: Duplicate

Please reopen this issue if the other fix does not work.

> Flink Operator Certificate renew issue
> --------------------------------------
>
>                 Key: FLINK-29535
>                 URL: https://issues.apache.org/jira/browse/FLINK-29535
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>            Reporter: Sebastian Struß
>            Priority: Major
>
> It seems that there is an issue with the Kubernetes Operator (at least in version 1.1.0) when it comes to certificates for the webhook.
> We've seen this error message pop up in the logs:
> An exceptionCaught() event was fired, and it reached at the tail of the pipeline. It usually means the last handler in the pipeline did not handle the exception.
> and
> javax.net.ssl.SSLHandshakeException: Received fatal alert: bad_certificate
>     at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>     at sun.security.ssl.Alert.createSSLException(Unknown Source) ~[?:?]
>     at sun.security.ssl.TransportContext.fatal(Unknown Source) ~[?:?]
>     at sun.security.ssl.Alert$AlertConsumer.consume(Unknown Source) ~[?:?]
>     at sun.security.ssl.TransportContext.dispatch(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLTransport.decode(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.decode(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.readRecord(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>     at sun.security.ssl.SSLEngineImpl.unwrap(Unknown Source) ~[?:?]
>     at javax.net.ssl.SSLEngine.unwrap(Unknown Source) ~[?:?]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler$SslEngineType$3.unwrap(SslHandler.java:296) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1342) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decodeJdkCompatible(SslHandler.java:1235) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1284) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:507) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
>     at org.apache.flink.shaded.netty4.io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:446) ~[flink-kubernetes-operator-1.1.0-shaded.jar:1.1.0]
> It happens when our Flux CD installation tries to update the FlinkDeployment resource.
> The update seems to trigger a webhook call to an endpoint in the operator, which at that point is serving an invalid certificate.
> We noticed this after the operator had been running for 18 days, so maybe something short-lived was not renewed correctly?
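
One way to test the "not renewed" guess above is to look at the validity window of the certificate the webhook presents. The following is only a rough sketch, not code from the operator: the class CertExpiryCheck and its methods are hypothetical names, and it assumes the webhook certificate has already been exported to a PEM or DER file by some other means.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.cert.CertificateFactory;
import java.security.cert.X509Certificate;
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration; not part of the Flink Kubernetes Operator.
final class CertExpiryCheck {

    // Time left until notAfter; negative once the certificate has expired.
    static Duration remainingValidity(Instant notAfter, Instant now) {
        return Duration.between(now, notAfter);
    }

    // True when "now" lies past the end of the validity window.
    static boolean isExpired(Instant notAfter, Instant now) {
        return now.isAfter(notAfter);
    }

    // Reads the first X.509 certificate from a PEM or DER encoded file.
    static X509Certificate load(Path certFile) throws Exception {
        try (InputStream in = Files.newInputStream(certFile)) {
            CertificateFactory factory = CertificateFactory.getInstance("X.509");
            return (X509Certificate) factory.generateCertificate(in);
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: CertExpiryCheck <certificate-file>");
            return;
        }
        // args[0] is assumed to be the exported webhook certificate.
        X509Certificate cert = load(Path.of(args[0]));
        Instant notAfter = cert.getNotAfter().toInstant();
        Instant now = Instant.now();
        System.out.println("notBefore: " + cert.getNotBefore().toInstant());
        System.out.println("notAfter:  " + notAfter);
        System.out.println("expired:   " + isExpired(notAfter, now));
        System.out.println("remaining: " + remainingValidity(notAfter, now));
    }
}
```

If notAfter lies in the past at the time the handshake fails, an expired certificate is a plausible cause of the bad_certificate alert; if it does not, the cause is more likely elsewhere, for example a CA bundle mismatch on the caller's side.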



--
This message was sent by Atlassian Jira
(v8.20.10#820010)