Posted to notifications@iotdb.apache.org by "Jinrui Zhang (Jira)" <ji...@apache.org> on 2022/11/24 05:10:00 UTC

[jira] [Commented] (IOTDB-5030) java.lang.IllegalArgumentException: all replicas for region[TConsensusGroupId(type:SchemaRegion, id:6)] are not available in these DataNodes

    [ https://issues.apache.org/jira/browse/IOTDB-5030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638091#comment-17638091 ] 

Jinrui Zhang commented on IOTDB-5030:
-------------------------------------

The issue is caused by a schema fetching failure during the write. I have two questions here:
 # Why is the SchemaRegion's replica distributed only on 66? It should have 3 replicas, but only 1 is returned from the ConfigNode's partition info.
 # It seems that 66 cannot process the FI (FragmentInstance). We need to investigate the error log from 66.
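
To make the failure mode behind these two questions concrete, here is a minimal sketch of replica selection. It is an illustration only and not the code in SimpleFragmentParallelPlanner.selectTargetDataNode: the class ReplicaSelectionSketch, the RegionReplicaSet stand-in, and the idea of passing in a set of available DataNode ids are all assumptions made for the example. It shows that with three replicas a single unreachable DataNode still leaves a target, while with one replica the same situation ends in an IllegalArgumentException like the one in the log.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ReplicaSelectionSketch {

  /** Hypothetical stand-in for a region's replica set; not the IoTDB class. */
  static final class RegionReplicaSet {
    final String regionId;
    final List<Integer> dataNodeIds;

    RegionReplicaSet(String regionId, List<Integer> dataNodeIds) {
      this.regionId = regionId;
      this.dataNodeIds = dataNodeIds;
    }
  }

  /**
   * Returns the id of the first replica whose DataNode is in the available set.
   * Throws IllegalArgumentException when no replica qualifies (assumed behaviour,
   * modelled on the message seen in the reported log).
   */
  static int selectTargetDataNode(RegionReplicaSet replicaSet, Set<Integer> availableDataNodeIds) {
    for (int id : replicaSet.dataNodeIds) {
      if (availableDataNodeIds.contains(id)) {
        return id;
      }
    }
    throw new IllegalArgumentException(
        "all replicas for region[" + replicaSet.regionId
            + "] are not available in these DataNodes" + replicaSet.dataNodeIds);
  }

  public static void main(String[] args) {
    // Assume DataNode 4 is currently unreachable; 1, 2 and 3 are healthy.
    Set<Integer> available = new HashSet<>(Arrays.asList(1, 2, 3));

    // Three replicas: the planner can fall back to another DataNode.
    RegionReplicaSet three = new RegionReplicaSet("SchemaRegion-1", Arrays.asList(4, 2, 3));
    System.out.println("selected DataNode: " + selectTargetDataNode(three, available));

    // One replica on DataNode 4 only: there is nothing to fall back to.
    RegionReplicaSet one = new RegionReplicaSet("SchemaRegion-1", Collections.singletonList(4));
    try {
      selectTargetDataNode(one, available);
    } catch (IllegalArgumentException e) {
      System.out.println("failed: " + e.getMessage());
    }
  }
}
```

Under these assumptions, the partition info returned by the ConfigNode (how many replicas exist) and the state of 66 (whether the only replica is usable) are separate causes to check, matching the two questions above.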

>  java.lang.IllegalArgumentException: all replicas for region[TConsensusGroupId(type:SchemaRegion, id:6)] are not available in these DataNodes
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IOTDB-5030
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5030
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>    Affects Versions: 0.14.0-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Jinrui Zhang
>            Priority: Major
>         Attachments: iotdb_4851.conf
>
>
> master_1123_32e2f98
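> 1. Start a cluster with 1 replica, 3 ConfigNodes and 5 DataNodes (3C5D).
> 2. Write data with the benchmark (BM); after 50 minutes, the node at ip 68 reports an error: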
> 2022-11-23 15:32:46,820 [pool-24-IoTDB-DataNodeInternalRPC-Processor-122] ERROR o.a.t.ProcessFunction:47 - Internal error processing sendPlanNode
> java.lang.IllegalArgumentException: all replicas for region[TConsensusGroupId(type:SchemaRegion, id:1)] are not available in these DataNodes[[TDataNodeLocation(dataNodeId:4, clientRpcEndPoint:TEndPoint(ip:192.168.10.66, port:6667), internalEndPoint:TEndPoint(ip:192.168.10.66, port:9003), mPPDataExchangeEndPoint:TEndPoint(ip:192.168.10.66, port:8777), dataRegionConsensusEndPoint:TEndPoint(ip:192.168.10.66, port:40010), schemaRegionConsensusEndPoint:TEndPoint(ip:192.168.10.66, port:50010))]]
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.SimpleFragmentParallelPlanner.selectTargetDataNode(SimpleFragmentParallelPlanner.java:146)
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.SimpleFragmentParallelPlanner.produceFragmentInstance(SimpleFragmentParallelPlanner.java:115)
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.SimpleFragmentParallelPlanner.prepare(SimpleFragmentParallelPlanner.java:87)
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.SimpleFragmentParallelPlanner.parallelPlan(SimpleFragmentParallelPlanner.java:78)
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.DistributionPlanner.planFragmentInstances(DistributionPlanner.java:94)
>         at org.apache.iotdb.db.mpp.plan.planner.distribution.DistributionPlanner.planFragments(DistributionPlanner.java:78)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.doDistributedPlan(QueryExecution.java:304)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.start(QueryExecution.java:201)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.retry(QueryExecution.java:235)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.getStatus(QueryExecution.java:500)
>         at org.apache.iotdb.db.mpp.plan.Coordinator.execute(Coordinator.java:152)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:178)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:156)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:98)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchemaWithAutoCreate(ClusterSchemaFetcher.java:265)
>         at org.apache.iotdb.db.mpp.plan.analyze.SchemaValidator.validate(SchemaValidator.java:56)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executeDataInsert(RegionWriteExecutor.java:193)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:165)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:119)
>         at org.apache.iotdb.db.mpp.plan.planner.plan.node.write.InsertTabletNode.accept(InsertTabletNode.java:1086)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:85)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.sendPlanNode(DataNodeInternalRPCServiceImpl.java:283)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3607)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3587)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> Test environment
> 1. 192.168.10.62/66/64/68: 72 CPUs, 256 GB
> 192.168.10.76: 48 CPUs, 384 GB
> 3 ConfigNodes (3C): 62, 66, 68
> ConfigNode
> MAX_HEAP_SIZE="8G"
> DataNode
> MAX_HEAP_SIZE="192G"
> MAX_DIRECT_MEMORY_SIZE="32G"
> Common
> max_connection_for_internal_service=300
> query_timeout_threshold=3600000
> schema_region_consensus_protocol_class=org.apache.iotdb.consensus.ratis.RatisConsensus
> data_region_consensus_protocol_class=org.apache.iotdb.consensus.multileader.MultiLeaderConsensus
> schema_replication_factor=1
> data_replication_factor=1
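> 2. Write data with the benchmark; see the attachment for the configuration.
> After about 50 minutes, the error shown in the log above is reported.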



--
This message was sent by Atlassian Jira
(v8.20.10#820010)