Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/08 10:28:41 UTC

[GitHub] [pulsar] kenbaev opened a new issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)

kenbaev opened a new issue #10171:
URL: https://github.com/apache/pulsar/issues/10171


   **Describe the bug**
   The Pulsar broker stops returning metrics to Prometheus for no apparent reason. The following entries appear in the broker log:
   `11:25:52.830 [pulsar-web-42-7] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:25:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
   11:44:07.847 [pulsar-web-42-20] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30001
   11:43:52.838 [pulsar-web-42-5] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
   11:43:37.841 [pulsar-web-42-17] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:07 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30004
   11:43:22.857 [pulsar-web-42-3] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:52 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30011
   11:43:07.842 [pulsar-web-42-14] INFO  org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002`
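
One way to read the entries above is to pull out the status code and the trailing number of each request-log line. The sketch below does that with a regular expression. Treating the trailing number as the request duration in milliseconds is an assumption; it is consistent with the roughly 30-second gap between the request time in brackets and the time the line was logged. The names `parse_request_line` and `slow_failures` are hypothetical helpers, not part of Pulsar or Jetty.

```python
import re

# Assumed layout of the tail of a request-log line as it appears above:
#   "METHOD PATH PROTOCOL" STATUS BYTES "REFERER" "USER-AGENT" LAST
# Interpreting LAST as the request duration in milliseconds is an assumption.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "[^"]*" (?P<last>\d+)\s*$'
)

SAMPLE_LINES = [
    '11:25:52.830 [pulsar-web-42-7] INFO  org.eclipse.jetty.server.RequestLog - '
    '10.247.144.56 - - [07/Apr/2021:11:25:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 '
    '"http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002',
    '11:43:22.857 [pulsar-web-42-3] INFO  org.eclipse.jetty.server.RequestLog - '
    '10.247.144.56 - - [07/Apr/2021:11:42:52 +0000] "GET /metrics/ HTTP/1.1" 500 553 '
    '"http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30011',
]


def parse_request_line(line):
    """Return (path, status, last_field) for a request-log line, or None."""
    # Drop surrounding whitespace and any Markdown backtick left from pasting.
    match = REQUEST_RE.search(line.strip().strip("`"))
    if match is None:
        return None
    return match.group("path"), int(match.group("status")), int(match.group("last"))


def slow_failures(lines, min_last=30000):
    """Keep parsed entries with a 5xx status and a trailing value >= min_last."""
    parsed = (parse_request_line(line) for line in lines)
    return [p for p in parsed if p and p[1] >= 500 and p[2] >= min_last]


if __name__ == "__main__":
    for path, status, last in slow_failures(SAMPLE_LINES):
        print(path, status, last)
```

Run against a full broker log, a filter like this would show whether every failed `/metrics/` request sits at the same ~30000 mark, which would point to a timeout rather than an immediate error.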
   
   At the same time, the proxy starts having problems connecting to the failed broker:
   `11:25:11.666 [pulsar-proxy-io-2-1] WARN org.apache.pulsar.proxy.server.LookupProxyHandler - [non-persistent://loadtest/pcm-reply/reply-unlock-project-lb-24] failed to get Partitioned metadata : Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
   org.apache.pulsar.client.api.PulsarClientException: Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
   		at org.apache.pulsar.client.impl.ClientCnx.channelInactive(ClientCnx.java:245) [org.apache.pulsar-pulsar-client-original-2.7.0.6.jar:2.7.0.6]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
   		at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   		at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
   		at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]`
   
   **To Reproduce**
   The problem has happened several times. We were unable to identify steps to reproduce it.
   
   **Expected behavior**
   The Pulsar broker runs stably.
   
   **Desktop (please complete the following information):**
    - Apache Pulsar 2.7.0.6
    - Kubernetes 1.18.3
    - Cluster with 3 ZK, 5 bookies, 5 brokers, 5 proxies


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] codelipenghui commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-1058891596


   The issue has had no activity for 30 days; marking it with the Stale label.





[GitHub] [pulsar] kenbaev commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)

Posted by GitBox <gi...@apache.org>.
kenbaev commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-821123968


   The broker crashed again.
   On the client side I see:
   `"ERROR":"Short read when reading frame size: read tcp 10.247.144.5:41024->10.96.137.199:6650: i/o timeout"`
   `"MESSAGE":"Failed to perform initial handshake"}`
   
   On the proxy side I see these lines:
   `09:57:00.425 [pulsar-proxy-io-2-1] INFO  org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] New connection opened
   09:57:00.427 [pulsar-proxy-io-2-1] INFO  org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] complete connection, init proxy handler. authenticated with none role null, hasProxyToBrokerUrl: true
   09:57:30.430 [pulsar-proxy-io-2-1] INFO  org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] Connection closed`
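
The proxy lines above can be checked for how long the connection stayed open. The sketch below subtracts the "New connection opened" timestamp from the "Connection closed" timestamp; `seconds_between` is a hypothetical helper and assumes both timestamps fall on the same day. The result is about 30 seconds. Whether this matches a specific client or proxy timeout setting is not established by the logs.

```python
from datetime import datetime


def seconds_between(opened, closed):
    """Elapsed seconds between two HH:MM:SS.mmm log timestamps (same day assumed)."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(closed, fmt) - datetime.strptime(opened, fmt)
    return delta.total_seconds()


# Timestamps copied from the proxy log lines above.
elapsed = seconds_between("09:57:00.425", "09:57:30.430")

if __name__ == "__main__":
    print(f"connection open for {elapsed:.3f} s")
```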
   
   I've set proxyLogLevel=1 on my proxies, but it did not help.
   
   On the broker side there are no log entries at all corresponding to this connection.
   





[GitHub] [pulsar] kenbaev commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)

Posted by GitBox <gi...@apache.org>.
kenbaev commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-815665891


   We also see problems at the service level. We use the Go client, and it fails with the error
   `Failed to perform initial handshake`

