Posted to notifications@iotdb.apache.org by "刘珍 (Jira)" <ji...@apache.org> on 2023/02/13 01:13:00 UTC

[jira] [Reopened] (IOTDB-5467) Execute query : ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - Fetch Schema failed

     [ https://issues.apache.org/jira/browse/IOTDB-5467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

刘珍 reopened IOTDB-5467:
-----------------------

However, this exception still needs to be analyzed:
java.lang.RuntimeException: cannot fetch schema, status is: 301, msg is: There is not enough memory to execute current fragment instance, current remaining free memory is 10409, estimated memory usage for current fragment instance is 131072
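
As a starting point for that analysis, the two numbers in the message can be pulled out and compared. The sketch below is only a log-parsing helper written for this report, not IoTDB code: the class name FragmentMemoryShortfall and its parse method are hypothetical, and the regular expression assumes the message wording stays exactly as logged above. The message does not state units (presumably bytes); note that 131072 equals 128 * 1024.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of IoTDB): extracts the memory figures from
// the "not enough memory" message and computes how far short the pool was.
public class FragmentMemoryShortfall {
    // Assumes the message text matches the wording seen in this report.
    private static final Pattern MEMORY_PATTERN = Pattern.compile(
        "current remaining free memory is (\\d+), "
            + "estimated memory usage for current fragment instance is (\\d+)");

    /** Returns {free, estimated, estimated - free}, or null if the line does not match. */
    public static long[] parse(String logLine) {
        Matcher m = MEMORY_PATTERN.matcher(logLine);
        if (!m.find()) {
            return null;
        }
        long free = Long.parseLong(m.group(1));
        long estimated = Long.parseLong(m.group(2));
        return new long[] {free, estimated, estimated - free};
    }

    public static void main(String[] args) {
        String line = "cannot fetch schema, status is: 301, msg is: There is not enough "
            + "memory to execute current fragment instance, current remaining free memory "
            + "is 10409, estimated memory usage for current fragment instance is 131072";
        long[] r = parse(line);
        if (r != null) {
            System.out.printf("free=%d estimated=%d shortfall=%d%n", r[0], r[1], r[2]);
        }
    }
}
```

For the logged line the shortfall is 131072 - 10409 = 120663, so the schema fetch was rejected because the remaining free memory was under a tenth of the estimate. Running the same helper over the whole DataNode log would show whether free memory shrank gradually or dropped suddenly.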

> Execute query : ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - Fetch Schema failed
> --------------------------------------------------------------------------------
>
>                 Key: IOTDB-5467
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5467
>             Project: Apache IoTDB
>          Issue Type: Bug
>          Components: mpp-cluster
>    Affects Versions: 1.0.1-SNAPSHOT
>            Reporter: 刘珍
>            Assignee: Minghui Liu
>            Priority: Major
>         Attachments: auto_set_ttl_per_sg.sh, confignode_ip23_leader_logs.tar.gz, datanode_ip2_fetch_schema_failed.tar.gz, image-2023-02-03-14-39-21-109.png, image-2023-02-03-14-39-33-195.png, lt.conf
>
>
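> Test version: rc/1.0.1  20230129 573097a
> Problem description:
> Start a 3-replica cluster with 3 ConfigNodes and 21 DataNodes (3C21D).
> 2023-01-31 16:40:00: start one Benchmark instance connected to ip2 (operation interval OP_INTERVAL=1000) to run reads and writes.
> Set the TTL to 1 hour (script attached).
> 2023-02-02 02:21:08: the ConfigNode leader (ip23) reports an error: it cannot connect to the DataNode on ip7 (Unknown).
> ip2 DataNode log: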
> 2023-02-02 02:51:04,625 [pool-26-IoTDB-DataNodeInternalRPC-Processor-9]{color:red}* ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - Fetch Schema failed.
> java.lang.RuntimeException: Fetch Schema failed.*{color}
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:202)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:156)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:98)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchemaWithAutoCreate(ClusterSchemaFetcher.java:265)
>         at org.apache.iotdb.db.mpp.plan.analyze.SchemaValidator.validate(SchemaValidator.java:56)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executeDataInsert(RegionWriteExecutor.java:202)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:174)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:128)
>         at org.apache.iotdb.db.mpp.plan.planner.plan.node.write.InsertTabletNode.accept(InsertTabletNode.java:1086)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:86)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.sendPlanNode(DataNodeInternalRPCServiceImpl.java:288)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3607)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3587)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> Caused by: org.apache.iotdb.commons.exception.IoTDBException: org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceFailureInfo$FailureException
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.dealWithException(QueryExecution.java:428)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.getResult(QueryExecution.java:411)
>         at org.apache.iotdb.db.mpp.plan.execution.QueryExecution.getBatchResult(QueryExecution.java:437)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:200)
>         ... 18 common frames omitted
> Caused by: org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceFailureInfo$FailureException: null
>         at org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceManager.lambda$cancelTimeoutFlushingInstances$8(FragmentInstanceManager.java:288)
>         at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
>         at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
>         at java.base/java.util.concurrent.ConcurrentHashMap$EntrySpliterator.forEachRemaining(ConcurrentHashMap.java:3645)
>         at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
>         at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
>         at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
>         at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
>         at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>         at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
>         at org.apache.iotdb.db.mpp.execution.fragment.FragmentInstanceManager.cancelTimeoutFlushingInstances(FragmentInstanceManager.java:288)
>         at org.apache.iotdb.commons.concurrent.threadpool.ScheduledExecutorUtil.lambda$scheduleWithFixedDelay$1(ScheduledExecutorUtil.java:177)
>         at org.apache.iotdb.commons.concurrent.WrappedRunnable$1.runMayThrow(WrappedRunnable.java:44)
>         at org.apache.iotdb.commons.concurrent.WrappedRunnable.run(WrappedRunnable.java:29)
>         at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
>         at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
>         at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> 2023-02-02 04:00:52,056 [pool-26-IoTDB-DataNodeInternalRPC-Processor-16] ERROR o.a.i.d.m.e.e.RegionWriteExecutor:88 - cannot fetch schema, status is: 301, msg is: {color:red}*There is not enough memory to execute current fragment instance, current remaining free memory is 10409, estimated memory usage for current fragment instance is 131072*{color}
> java.lang.RuntimeException: cannot fetch schema, status is: 301, msg is: There is not enough memory to execute current fragment instance, current remaining free memory is 10409, estimated memory usage for current fragment instance is 131072
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.executeSchemaFetchQuery(ClusterSchemaFetcher.java:188)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:156)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchema(ClusterSchemaFetcher.java:98)
>         at org.apache.iotdb.db.mpp.plan.analyze.ClusterSchemaFetcher.fetchSchemaWithAutoCreate(ClusterSchemaFetcher.java:265)
>         at org.apache.iotdb.db.mpp.plan.analyze.SchemaValidator.validate(SchemaValidator.java:56)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.executeDataInsert(RegionWriteExecutor.java:202)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:174)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor$WritePlanNodeExecutionVisitor.visitInsertTablet(RegionWriteExecutor.java:128)
>         at org.apache.iotdb.db.mpp.plan.planner.plan.node.write.InsertTabletNode.accept(InsertTabletNode.java:1086)
>         at org.apache.iotdb.db.mpp.execution.executor.RegionWriteExecutor.execute(RegionWriteExecutor.java:86)
>         at org.apache.iotdb.db.service.thrift.impl.DataNodeInternalRPCServiceImpl.sendPlanNode(DataNodeInternalRPCServiceImpl.java:288)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3607)
>         at org.apache.iotdb.mpp.rpc.thrift.IDataNodeRPCService$Processor$sendPlanNode.getResult(IDataNodeRPCService.java:3587)
>         at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:38)
>         at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:38)
>         at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:248)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>         at java.base/java.lang.Thread.run(Thread.java:834)
> Test details:
> 1. Start the 3C21D cluster.
> 3C: 172.16.2.23/24/25  /data/iotdb/r_0129_573097a/logs
> 21D: 172.16.2.2 ~ 172.16.2.22     /data1/iotdb/r_0129_573097a
> Configuration parameters:
> ConfigNode configuration:
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> cn_target_config_node_list=172.16.2.23:10710
> DataNode configuration:
> MAX_HEAP_SIZE="20G"
> MAX_DIRECT_MEMORY_SIZE="6G"
> dn_target_config_node_list=172.16.2.23:10710,172.16.2.24:10710,172.16.2.25:10710
> Common configuration:
> schema_replication_factor=3
> data_replication_factor=3
> 2. Start the Benchmark; its configuration is attached, and the main parameters are:
> DEVICE_NUMBER=4200
> SENSOR_NUMBER=600
> CLIENT_NUMBER=210
> GROUP_NUMBER=1
> OPERATION_PROPORTION=91:1:1:1:1:0:1:1:1:1:1
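> 3. After the Benchmark starts writing data, start the TTL-setting script (attached).
> The script is located on 172.16.2.205 at
> /data/iotdb/deploy_mpp_scripts_0110
> 4. Check the logs.
> 2023-02-02 02:21:08: the ConfigNode leader (ip23) reports an error: it cannot connect to the DataNode on ip7 (Unknown); ip7 does not respond to ping.
> For the DataNode error log, see the problem description.
> Cluster status: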
>  !image-2023-02-03-14-39-21-109.png! 
> Region status:
>  !image-2023-02-03-14-39-33-195.png! 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)