Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/08 10:28:41 UTC
[GitHub] [pulsar] kenbaev opened a new issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)
kenbaev opened a new issue #10171:
URL: https://github.com/apache/pulsar/issues/10171
**Describe the bug**
The Pulsar broker stops returning metrics to Prometheus for no apparent reason. The following entries appear in the broker log:
`11:25:52.830 [pulsar-web-42-7] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:25:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
11:44:07.847 [pulsar-web-42-20] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30001
11:43:52.838 [pulsar-web-42-5] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002
11:43:37.841 [pulsar-web-42-17] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:43:07 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30004
11:43:22.857 [pulsar-web-42-3] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:52 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30011
11:43:07.842 [pulsar-web-42-14] INFO org.eclipse.jetty.server.RequestLog - 10.247.144.56 - - [07/Apr/2021:11:42:37 +0000] "GET /metrics/ HTTP/1.1" 500 553 "http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002`
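To read the entries above more easily, here is a minimal parsing sketch. It assumes the trailing number on each request-log line is the request latency in milliseconds; that reading is an assumption, although it is consistent with the roughly 30-second gap between the request timestamp in brackets and the log timestamp at the start of each line. The names `LINE_RE` and `parse_request_log` are hypothetical helpers, not part of Pulsar or Jetty.

```python
import re

# Assumed layout of the tail of each request-log line (not confirmed by the
# Jetty docs here): "METHOD PATH PROTO" STATUS BYTES "REFERER" "AGENT" LATENCY_MS
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+) '
    r'"[^"]*" "[^"]*" (?P<latency_ms>\d+)\s*$'
)


def parse_request_log(line):
    """Return (path, status, latency_ms) for one request-log line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return m.group("path"), int(m.group("status")), int(m.group("latency_ms"))


# First entry from the broker log quoted above.
sample = (
    '11:25:52.830 [pulsar-web-42-7] INFO org.eclipse.jetty.server.RequestLog - '
    '10.247.144.56 - - [07/Apr/2021:11:25:22 +0000] "GET /metrics/ HTTP/1.1" 500 553 '
    '"http://10.247.106.218:8080/metrics" "Prometheus/2.18.1" 30002'
)

path, status, latency_ms = parse_request_log(sample)
print(path, status, latency_ms)
```

Applied to all six entries, this would show every `/metrics/` request returning 500 after about 30 seconds, which may point to a timeout on the metrics endpoint rather than an immediate failure.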
At the same time, the proxy starts failing to connect to the failed broker:
`11:25:11.666 [pulsar-proxy-io-2-1] WARN org.apache.pulsar.proxy.server.LookupProxyHandler - [non-persistent://loadtest/pcm-reply/reply-unlock-project-lb-24] failed to get Partitioned metadata : Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
org.apache.pulsar.client.api.PulsarClientException: Disconnected from server at newpulsar-broker.newpulsar.svc.cluster.local/10.247.106.218:6650
at org.apache.pulsar.client.impl.ClientCnx.channelInactive(ClientCnx.java:245) [org.apache.pulsar-pulsar-client-original-2.7.0.6.jar:2.7.0.6]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:389) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:354) [io.netty-netty-codec-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:241) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1405) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:262) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:248) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:901) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:818) [io.netty-netty-transport-4.1.51.Final.jar:4.1.51.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:472) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384) [io.netty-netty-transport-native-epoll-4.1.51.Final-linux-x86_64.jar:4.1.51.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:989) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [io.netty-netty-common-4.1.51.Final.jar:4.1.51.Final]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_282]`
**To Reproduce**
The problem has happened several times, but we were unable to identify steps to reproduce it.
**Expected behavior**
Stable operation of the Pulsar broker.
**Desktop (please complete the following information):**
- Apache Pulsar 2.7.0.6
- Kubernetes 1.18.3
- Cluster with 3 ZK, 5 bookies, 5 brokers, 5 proxies
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [pulsar] codelipenghui commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)
Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-1058891596
The issue has had no activity for 30 days; marking it with the Stale label.
[GitHub] [pulsar] kenbaev commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)
Posted by GitBox <gi...@apache.org>.
kenbaev commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-821123968
The broker crashed again.
On the client side I see:
`"ERROR":"Short read when reading frame size: read tcp 10.247.144.5:41024->10.96.137.199:6650: i/o timeout"`
`"MESSAGE":"Failed to perform initial handshake"}`
On the proxy side I see these lines:
`09:57:00.425 [pulsar-proxy-io-2-1] INFO org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] New connection opened
09:57:00.427 [pulsar-proxy-io-2-1] INFO org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] complete connection, init proxy handler. authenticated with none role null, hasProxyToBrokerUrl: true
09:57:30.430 [pulsar-proxy-io-2-1] INFO org.apache.pulsar.proxy.server.ProxyConnection - [/10.247.144.5:41024] Connection closed`
I set proxyLogLevel=1 on my proxies, but it did not help.
On the broker side there are no log entries at all for this connection.
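To see how long the proxy held the connection before closing it, here is a minimal sketch that measures the time between the "New connection opened" and "Connection closed" lines quoted above. It assumes both lines start with an `HH:MM:SS.mmm` timestamp from the same day; `elapsed_seconds` is a hypothetical helper, not a Pulsar utility.

```python
from datetime import datetime


def elapsed_seconds(opened_line, closed_line):
    """Seconds between two log lines that start with an HH:MM:SS.mmm timestamp.

    Assumes both lines were written on the same day (no midnight rollover).
    """
    fmt = "%H:%M:%S.%f"
    opened = datetime.strptime(opened_line.split(" ", 1)[0], fmt)
    closed = datetime.strptime(closed_line.split(" ", 1)[0], fmt)
    return (closed - opened).total_seconds()


# Lines taken from the proxy log quoted above.
opened = (
    "09:57:00.425 [pulsar-proxy-io-2-1] INFO org.apache.pulsar.proxy.server.ProxyConnection"
    " - [/10.247.144.5:41024] New connection opened"
)
closed = (
    "09:57:30.430 [pulsar-proxy-io-2-1] INFO org.apache.pulsar.proxy.server.ProxyConnection"
    " - [/10.247.144.5:41024] Connection closed"
)

print(elapsed_seconds(opened, closed))
```

The connection stays open for almost exactly 30 seconds, which lines up with the client's `i/o timeout` and suggests the broker never answered the handshake, though the logs alone do not confirm the cause.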
[GitHub] [pulsar] kenbaev commented on issue #10171: Broker fails. Log contains only failed metrics requests (500 error code)
Posted by GitBox <gi...@apache.org>.
kenbaev commented on issue #10171:
URL: https://github.com/apache/pulsar/issues/10171#issuecomment-815665891
We also see problems at the service level. We use the Go client, and it fails with the error:
`Failed to perform initial handshake`