Posted to notifications@iotdb.apache.org by "刘珍 (Jira)" <ji...@apache.org> on 2022/12/19 06:41:00 UTC

[jira] [Commented] (IOTDB-5034) [ratis] ERROR o.a.i.d.s.t.i.DataNodeInternalRPCServiceImpl:1209 - [ChangeRegionLeader] Failed to change the leader of RegionGroup: SchemaRegion[8]

    [ https://issues.apache.org/jira/browse/IOTDB-5034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17649172#comment-17649172 ] 

刘珍 commented on IOTDB-5034:
---------------------------

Verified on master_1216_d426f7a: the fix passes, and the data count check passes as well.

> [ratis]  ERROR o.a.i.d.s.t.i.DataNodeInternalRPCServiceImpl:1209 - [ChangeRegionLeader] Failed to change the leader of RegionGroup: SchemaRegion[8]
> ---------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IOTDB-5034
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5034
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>    Affects Versions: 0.14.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Song Ziyang
>            Priority: Minor
>              Labels: pull-request-available
>         Attachments: iotdb_4851.conf
>
>
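> Test version: 1124_cd839a4
> Path on the machine: /data/liuzhen_test/master_1123_32e2f98  (with the lib from 1124_cd839a4)
> 1. Start a 3C5D cluster (3 ConfigNodes, 5 DataNodes) with 1 replica
> 2. Write data until the write completes (see the attachment for the configuration)
> 3. Scale in by removing the DataNode on ip76. The scale-in eventually succeeds, but ip62 reports the following errors: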
> 2022-11-24 11:24:39,955 [pool-69-IoTDB-DataNodeInternalRPC-Processor-12] ERROR o.a.i.c.r.RatisConsensus:676 - org.apache.iotdb.consensus.ratis.RatisConsensus@6c19f01a request failed with exception {}
> org.apache.iotdb.consensus.exception.RatisRequestFailedException: Ratis request failed
>         at org.apache.iotdb.consensus.ratis.RatisConsensus.transferLeader(RatisConsensus.java:545)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.transferLeader(DataNodeInternalRPCServiceImpl.java:1201)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.changeRegionLeader(DataNodeInternalRPCServiceImpl.java:1192)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3932)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3912)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.protocol.exceptions.GroupMismatchException: 6: group-000200000008 not found.
>         at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:150)
>         at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:351)
>         at org.apache.ratis.server.impl.RaftServerProxy.setConfigurationAsync(RaftServerProxy.java:606)
>         at org.apache.ratis.grpc.server.GrpcAdminProtocolService.lambda$setConfiguration$3(GrpcAdminProtocolService.java:70)
>         at org.apache.ratis.grpc.GrpcUtil.asyncCall(GrpcUtil.java:164)
>         at org.apache.ratis.grpc.server.GrpcAdminProtocolService.setConfiguration(GrpcAdminProtocolService.java:70)
>         at org.apache.ratis.proto.grpc.AdminProtocolServiceGrpc$MethodHandlers.invoke(AdminProtocolServiceGrpc.java:643)
>         at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
>         at org.apache.ratis.thirdparty.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
>         at org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>         at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.lang.Thread.run(Thread.java:834)
> 2022-11-24 11:24:39,956 [pool-69-IoTDB-DataNodeInternalRPC-Processor-12] ERROR o.a.i.d.s.t.i.DataNodeInternalRPCServiceImpl:1209 - [ChangeRegionLeader] Failed to change the leader of RegionGroup: SchemaRegion[8]
> org.apache.iotdb.consensus.exception.RatisRequestFailedException: Ratis request failed
>         at org.apache.iotdb.consensus.ratis.RatisConsensus.transferLeader(RatisConsensus.java:545)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.transferLeader(DataNodeInternalRPCServiceImpl.java:1201)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.changeRegionLeader(DataNodeInternalRPCServiceImpl.java:1192)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3932)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$changeRegionLeader.getResult(IDataNodeRPCService.java:3912)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Caused by: org.apache.ratis.protocol.exceptions.GroupMismatchException: 6: group-000200000008 not found.
>         at org.apache.ratis.server.impl.RaftServerProxy$ImplMap.get(RaftServerProxy.java:150)
>         at org.apache.ratis.server.impl.RaftServerProxy.getImplFuture(RaftServerProxy.java:351)
>         at org.apache.ratis.server.impl.RaftServerProxy.setConfigurationAsync(RaftServerProxy.java:606)
>         at org.apache.ratis.grpc.server.GrpcAdminProtocolService.lambda$setConfiguration$3(GrpcAdminProtocolService.java:70)
>         at org.apache.ratis.grpc.GrpcUtil.asyncCall(GrpcUtil.java:164)
>         at org.apache.ratis.grpc.server.GrpcAdminProtocolService.setConfiguration(GrpcAdminProtocolService.java:70)
>         at org.apache.ratis.proto.grpc.AdminProtocolServiceGrpc$MethodHandlers.invoke(AdminProtocolServiceGrpc.java:643)
>         at org.apache.ratis.thirdparty.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:182)
>         at org.apache.ratis.thirdparty.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
>         at org.apache.ratis.thirdparty.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:354)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:866)
>         at org.apache.ratis.thirdparty.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
>         at org.apache.ratis.thirdparty.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:133)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.lang.Thread.run(Thread.java:834)
> Test environment
> 1. 192.168.10.62/64/66/68  72 CPU, 256 GB
> 192.168.10.76    48 CPU, 384 GB
> ConfigNode
> MAX_HEAP_SIZE="8G"
> DataNode
> MAX_HEAP_SIZE="192G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> Common
>  schema_replication_factor=1
>  schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
>  data_replication_factor=1
>  data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
> query_timeout_threshold=3600000
> max_connection_for_internal_service=300
> 2. For the benchmark configuration, see the attachment iotdb_4851.conf
> 3. Scale in by removing the DataNode on ip76, then check the logs on all nodes



--
This message was sent by Atlassian Jira
(v8.20.10#820010)