Posted to notifications@iotdb.apache.org by "Gaofei Cao (Jira)" <ji...@apache.org> on 2023/01/10 09:22:00 UTC

[jira] [Commented] (IOTDB-5343) Verify the trigger_info.bin error and "you need to increase dn_max_connection_for_internal_service" when remove DataNode

    [ https://issues.apache.org/jira/browse/IOTDB-5343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17656531#comment-17656531 ] 

Gaofei Cao commented on IOTDB-5343:
-----------------------------------

All issues are resolved; please reproduce to verify, [~刘珍].

> Verify the trigger_info.bin error and "you need to increase dn_max_connection_for_internal_service" when remove DataNode
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: IOTDB-5343
>                 URL: https://issues.apache.org/jira/browse/IOTDB-5343
>             Project: Apache IoTDB
>          Issue Type: Improvement
>            Reporter: Gaofei Cao
>            Assignee: Gaofei Cao
>            Priority: Major
>
> This issue is a mirror of https://issues.apache.org/jira/browse/IOTDB-4830.
>  
> rel/1.0 2022-11-29_a7a1738: {color:#de350b}the following two kinds of problems need to be confirmed.{color}
> The rel/1.0 2022-12-01_84c01ae build also shows Problem 1.
> {*}Problem 1{*}: {color:#de350b}[ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> This error needs to be confirmed.{color}
> {*}Problem 2{*}: SET_SYSTEM_STATUS failed on DataNode TEndPoint(ip:172.20.70.3, port:9003)
> java.io.IOException: Borrow client from pool for node TEndPoint(ip:172.20.70.3, port:9003) failed, you need to increase dn_max_connection_for_internal_service.
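> Because ip3 is already offline and the ConfigNode sets ip3's status during scale-in, the request fails; {color:#de350b}however, the hint "you need to increase dn_max_connection_for_internal_service" in the error message is inappropriate.{color}
> Environment: private cloud, 3 replicas, 3C5D
> 1. Start a 3-replica 3C5D cluster.
> 2. Stop the DataNode on ip3.
> 3. Write data with BM; the write completes.
> 4. Scale in (remove) the DataNode on ip3; the removal succeeds.
> Check the ConfigNode Leader's log: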
> 2022-11-30 10:52:40,849 [ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 10:54:41,161 [ForkJoinPool.commonPool-worker-1] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 10:56:41,474 [ForkJoinPool.commonPool-worker-1] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 10:58:41,789 [ForkJoinPool.commonPool-worker-0] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 11:00:42,105 [ForkJoinPool.commonPool-worker-6] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 11:02:42,401 [ForkJoinPool.commonPool-worker-0] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 11:04:42,686 [0@group-000000000000-StateMachineUpdater] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 11:06:42,972 [ForkJoinPool.commonPool-worker-5] ERROR o.a.i.c.p.TriggerInfo:246 - Failed to take snapshot, because snapshot file [/data/iotdb/r_1130_40de3ad/sbin/../data/confignode/consensus/47474747-4747-4747-4747-000000000000/sm/.tmp.1_20583/trigger_info.bin] is already exist.
> 2022-11-30 11:11:48,561 [ProcExecWorker-2] ERROR o.a.i.c.c.s.SyncDataNodeClientPool:97 - {color:#de350b}SET_SYSTEM_STATUS failed on DataNode TEndPoint(ip:172.20.70.3, port:9003)
> java.io.IOException: Borrow client from pool for node TEndPoint(ip:172.20.70.3, port:9003) failed, you need to increase dn_max_connection_for_internal_service.{color}
> at org.apache.iotdb.commons.client.ClientManager.borrowClient(ClientManager.java:64)
> at org.apache.iotdb.confignode.client.sync.SyncDataNodeClientPool.sendSyncRequestToDataNodeWithGivenRetry(SyncDataNodeClientPool.java:87)
> at org.apache.iotdb.confignode.procedure.env.ConfigNodeProcedureEnv.markDataNodeAsRemovingAndBroadcast(ConfigNodeProcedureEnv.java:373)
> at org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:86)
> at org.apache.iotdb.confignode.procedure.impl.node.RemoveDataNodeProcedure.executeFromState(RemoveDataNodeProcedure.java:47)
> at org.apache.iotdb.confignode.procedure.impl.statemachine.StateMachineProcedure.execute(StateMachineProcedure.java:186)
> at org.apache.iotdb.confignode.procedure.Procedure.doExecute(Procedure.java:365)
> at org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:414)
> at org.apache.iotdb.confignode.procedure.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:373)
> at org.apache.iotdb.confignode.procedure.ProcedureExecutor.access$300(ProcedureExecutor.java:50)
> at org.apache.iotdb.confignode.procedure.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:741)
> Caused by: net.sf.cglib.core.CodeGenerationException: org.apache.thrift.transport.TTransportException-->java.net.ConnectException: Connection refused (Connection refused)
> at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:235)
> at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:220)
> at net.sf.cglib.proxy.Enhancer.createUsingReflection(Enhancer.java:639)
> at net.sf.cglib.proxy.Enhancer.firstInstance(Enhancer.java:538)
> at net.sf.cglib.core.AbstractClassGenerator.create(AbstractClassGenerator.java:225)
> at net.sf.cglib.proxy.Enhancer.createHelper(Enhancer.java:377)
> at net.sf.cglib.proxy.Enhancer.create(Enhancer.java:304)
> at org.apache.iotdb.commons.client.sync.SyncThriftClientWithErrorHandler.newErrorHandler(SyncThriftClientWithErrorHandler.java:48)
> at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$Factory.makeObject(SyncDataNodeInternalServiceClient.java:127)
> at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$Factory.makeObject(SyncDataNodeInternalServiceClient.java:105)
> at org.apache.commons.pool2.impl.GenericKeyedObjectPool.create(GenericKeyedObjectPool.java:780)
> at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:439)
> at org.apache.commons.pool2.impl.GenericKeyedObjectPool.borrowObject(GenericKeyedObjectPool.java:350)
> at org.apache.iotdb.commons.client.ClientManager.borrowClient(ClientManager.java:50)
> ... 10 common frames omitted
> Caused by: org.apache.thrift.transport.TTransportException: java.net.ConnectException: Connection refused (Connection refused)
> at org.apache.thrift.transport.TSocket.open(TSocket.java:243)
> at org.apache.iotdb.rpc.TElasticFramedTransport.open(TElasticFramedTransport.java:91)
> at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient.<init>(SyncDataNodeInternalServiceClient.java:63)
> at org.apache.iotdb.commons.client.sync.SyncDataNodeInternalServiceClient$$EnhancerByCGLIB$$b73d1a05.<init>(<generated>)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at net.sf.cglib.core.ReflectUtils.newInstance(ReflectUtils.java:228)
> ... 23 common frames omitted
> Caused by: java.net.ConnectException: Connection refused (Connection refused)
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.thrift.transport.TSocket.open(TSocket.java:238)
> ... 31 common frames omitted
>  
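
Problem 2 comes down to the error text not matching the root cause: the stack trace ends in java.net.ConnectException (connection refused), yet the message advises increasing dn_max_connection_for_internal_service. A minimal sketch of one way to pick the message from the cause chain follows. The class and method names (BorrowFailureMessage, causedByConnectFailure, describeBorrowFailure) are hypothetical, RuntimeException stands in for the cglib and Thrift wrapper exceptions seen in the trace, and this is not necessarily the fix that was applied in IoTDB.

```java
import java.io.IOException;
import java.net.ConnectException;

// Hypothetical helper, not part of IoTDB: chooses an error message that
// matches the root cause of a failed client borrow.
public class BorrowFailureMessage {

    /** Returns true if any exception in the cause chain is a ConnectException. */
    static boolean causedByConnectFailure(Throwable t) {
        // Bound the walk so a malformed, cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++) {
            if (t instanceof ConnectException) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }

    /** Builds the message text; only non-connect failures keep the pool-size hint. */
    static String describeBorrowFailure(String endPoint, Throwable cause) {
        String prefix = "Borrow client from pool for node " + endPoint + " failed";
        if (causedByConnectFailure(cause)) {
            return prefix + ", the node is unreachable (connection refused).";
        }
        // Simplification: every other failure keeps the original hint.
        return prefix + ", you need to increase dn_max_connection_for_internal_service.";
    }

    public static void main(String[] args) {
        String endPoint = "TEndPoint(ip:172.20.70.3, port:9003)";
        // Mimics the nesting in the trace: wrapper -> wrapper -> ConnectException.
        Throwable refused = new RuntimeException(
                new RuntimeException(new ConnectException("Connection refused")));
        IOException wrapped = new IOException(describeBorrowFailure(endPoint, refused), refused);
        System.out.println(wrapped.getMessage());
    }
}
```

With a check like this, removing a DataNode that is already offline would report the node as unreachable instead of suggesting a configuration change that cannot help.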


