Posted to issues@storm.apache.org by "Binh Le (Jira)" <ji...@apache.org> on 2022/04/27 14:58:00 UTC

[jira] [Comment Edited] (STORM-3751) NPE in WorkerState.transferLocalBatch

    [ https://issues.apache.org/jira/browse/STORM-3751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528839#comment-17528839 ] 

Binh Le edited comment on STORM-3751 at 4/27/22 2:57 PM:
---------------------------------------------------------

We are also running into this same error, with the same stack trace and same line number. Has anyone been able to figure out what's going on yet?

I see that [STORM-3141 NPE in WorkerState.transferLocalBatch when receiving messages for a task that isn't the first task assigned to the executor|https://issues.apache.org/jira/browse/STORM-3141] is marked as fixed. However, that issue has a different stack trace, with the error occurring on a different line number. I believe the original issue was fixed, but either the fix now leads to this new error or this is a different issue altogether.
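
For anyone comparing the two tickets, here is a rough sketch of the failure mode described in the STORM-3141 title: a receive queue is looked up by task id, and the lookup returns null for a task that is not the first one assigned to its executor. This is an illustration only and not Storm's actual code. The class LocalTransferSketch, its methods, and the task-id-to-queue map are hypothetical names, and it is an assumption that STORM-3751 involves a similar lookup at all.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

// Hypothetical illustration; none of these names come from Storm.
final class LocalTransferSketch {
    // Stand-in for a per-task receive queue map inside a worker.
    private final Map<Integer, Queue<String>> taskToQueue = new HashMap<>();

    // Registers the executor's queue under its first task id only,
    // so every other task id of that executor has no map entry.
    public void registerFirstTaskOnly(List<Integer> taskIds) {
        taskToQueue.put(taskIds.get(0), new ArrayDeque<>());
    }

    // Registers the same shared queue under every task id of the executor.
    public void registerAllTasks(List<Integer> taskIds) {
        Queue<String> shared = new ArrayDeque<>();
        for (Integer id : taskIds) {
            taskToQueue.put(id, shared);
        }
    }

    // Unguarded delivery: throws NullPointerException when the task id
    // has no registered queue, because get() returns null.
    public void transferUnguarded(int taskId, String tuple) {
        taskToQueue.get(taskId).add(tuple);
    }

    // Guarded delivery: reports failure instead of throwing, so the
    // caller can log or drop the tuple without killing the thread.
    public boolean transferGuarded(int taskId, String tuple) {
        Queue<String> queue = taskToQueue.get(taskId);
        if (queue == null) {
            return false;
        }
        queue.add(tuple);
        return true;
    }

    public static void main(String[] args) {
        LocalTransferSketch sketch = new LocalTransferSketch();
        sketch.registerFirstTaskOnly(Arrays.asList(1, 2));
        try {
            sketch.transferUnguarded(2, "tuple");
        } catch (NullPointerException e) {
            System.out.println("NPE for task 2: no queue registered");
        }
        System.out.println("guarded delivery to task 2: "
                + sketch.transferGuarded(2, "tuple"));
    }
}
```

If something like this were the cause here, the destination task id of the failing tuple would be the most useful thing to log alongside the exception.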



> NPE in WorkerState.transferLocalBatch
> -------------------------------------
>
>                 Key: STORM-3751
>                 URL: https://issues.apache.org/jira/browse/STORM-3751
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-client
>    Affects Versions: 2.2.0
>            Reporter: Arwin S Tio
>            Priority: Major
>
> Hello,
>  
> I've recently upgraded to Storm 2.2.0 and have been getting this error:
>  
> {code:java}
> 2021-03-07 04:36:51.061 o.a.s.m.n.StormServerHandler Netty-server-localhost-6700-worker-1 [ERROR] server errors in handling the request
> java.lang.NullPointerException: null
>         at org.apache.storm.daemon.worker.WorkerState.transferLocalBatch(WorkerState.java:543) ~[storm-client-2.2.0.jar:2.2.0]
>         at org.apache.storm.messaging.DeserializingConnectionCallback.recv(DeserializingConnectionCallback.java:71) ~[storm-client-2.2.0.jar:2.2.0]
>         at org.apache.storm.messaging.netty.Server.enqueue(Server.java:146) ~[storm-client-2.2.0.jar:2.2.0]
>         at org.apache.storm.messaging.netty.Server.received(Server.java:264) ~[storm-client-2.2.0.jar:2.2.0]
>         at org.apache.storm.messaging.netty.StormServerHandler.channelRead(StormServerHandler.java:51) ~[storm-client-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:323) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:297) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:340) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1434) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:362) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:348) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:965) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:163) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:644) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:579) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:496) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:458) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at org.apache.storm.shade.io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:897) [storm-shaded-deps-2.2.0.jar:2.2.0]
>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_272]
> 2021-03-07 04:36:51.061 o.a.s.m.n.StormServerHandler Netty-server-localhost-6700-worker-1 [INFO] Received error in netty thread.. terminating server... {code}
>  
> This issue happens every 20-30 minutes and causes the workers to die/restart.
> It seems related to https://issues.apache.org/jira/browse/STORM-3141, but that issue appears to have been fixed in 2.0.
> I am happy to provide more information but at the moment am unsure of what is relevant.
> I have a suspicion that this is related to load-aware localOrShuffleGrouping ("LoadAwareShuffleGrouping") because this issue seems to have started when I switched the Grouping, but again, not sure if it's actually related.


