Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2021/09/18 01:52:35 UTC

[GitHub] [incubator-pegasus] ZhongChaoqiang opened a new issue #815: potential replica may not be closed when dropped app

ZhongChaoqiang opened a new issue #815:
URL: https://github.com/apache/incubator-pegasus/issues/815


   ## Bug Report
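   When a table is dropped while some of its partitions are in the PS_POTENTIAL_SECONDARY state, there is a chance that the replica-server is left with replicas in the potential state that can never be closed.
   
   Version: 2.0.0
   
   Below is the replica information queried via remote_command after the app (appid=3) was dropped: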
   D2021-09-17 06:40:44.606 (1631832044606185642 4017) replica.rep_long5.04010000000000b5: replica_stub.cpp:1707:on_gc(): start to garbage collection, replica_count = 4
   D2021-09-17 06:40:44.606 (1631832044606199774 4017) replica.rep_long5.04010000000000b5: replica_stub.cpp:1746:on_gc(): gc_shared: gc condition for 3.97@10.97.174.182:54801, status = replication::partition_status::PS_POTENTIAL_SECONDARY, garbage_max_decree = 37803, last_durable_decree= 37804, plog_max_commit_on_disk = 37803
   D2021-09-17 06:40:44.606 (1631832044606206186 4017) replica.rep_long5.04010000000000b5: replica_stub.cpp:1746:on_gc(): gc_shared: gc condition for 3.13@10.97.174.182:54801, status = replication::partition_status::PS_POTENTIAL_SECONDARY, garbage_max_decree = 37894, last_durable_decree= 37895, plog_max_commit_on_disk = 37894
   D2021-09-17 06:40:44.606 (1631832044606229313 4017) replica.rep_long5.04010000000000b5: replica_stub.cpp:1746:on_gc(): gc_shared: gc condition for 3.49@10.97.174.182:54801, status = replication::partition_status::PS_POTENTIAL_SECONDARY, garbage_max_decree = 37838, last_durable_decree= 37839, plog_max_commit_on_disk = 37838
   D2021-09-17 06:40:44.606 (1631832044606232808 4017) replica.rep_long5.04010000000000b5: replica_stub.cpp:1746:on_gc(): gc_shared: gc condition for 3.61@10.97.174.182:54801, status = replication::partition_status::PS_POTENTIAL_SECONDARY, garbage_max_decree = 37882, last_durable_decree= 37883, plog_max_commit_on_disk = 37882


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org


[GitHub] [incubator-pegasus] ZhongChaoqiang edited a comment on issue #815: potential replica may not be closed when dropped app

Posted by GitBox <gi...@apache.org>.
ZhongChaoqiang edited a comment on issue #815:
URL: https://github.com/apache/incubator-pegasus/issues/815#issuecomment-922160227


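   Preliminary analysis:
   The problem is triggered if the table is dropped after a replica in the potential state has finished learning but before its status has switched to secondary: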
   `D2021-09-16 20:21:42.890 (1631794902890873823 3fba) replica.replica10.0404000a00000ca5: replica_learn.cpp:1430:on_learn_completion_notification_reply(): 3.13@xxxxxxxxx:54801: on_learn_completion_notification_reply[0000000c00000002]: learnee = xxxxxxxxx:54801, learn_duration = 2358 ms, response_err = ERR_OK`
   
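   After the app is dropped, the replica server syncs its state with meta, but on_node_query_reply_scatter2 does not remove replicas in the potential state, so these replicas stay around indefinitely. This may prevent slog from ever being garbage collected.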
   




[GitHub] [incubator-pegasus] ZhongChaoqiang commented on issue #815: potential replica may not be closed when dropped app

Posted by GitBox <gi...@apache.org>.
ZhongChaoqiang commented on issue #815:
URL: https://github.com/apache/incubator-pegasus/issues/815#issuecomment-922160227


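   Preliminary analysis:
   The problem is triggered if the table is dropped after a replica in the potential state has finished learning but before its status has switched to secondary: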
   `D2021-09-16 20:21:42.890 (1631794902890873823 3fba) replica.replica10.0404000a00000ca5: replica_learn.cpp:1430:on_learn_completion_notification_reply(): 3.13@10.97.174.182:54801: on_learn_completion_notification_reply[0000000c00000002]: learnee = 10.97.174.240:54801, learn_duration = 2358 ms, response_err = ERR_OK`
   
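   After the app is dropped, the replica server syncs its state with meta, but on_node_query_reply_scatter2 does not remove replicas in the potential state, so these replicas stay around indefinitely. This may prevent slog from ever being garbage collected.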
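
To make the suspected mechanism concrete, here is a toy C++ model of it. This is only a sketch under assumptions and is not the Pegasus source: `toy_status`, `removed_on_meta_sync` and `leftover_after_drop` are hypothetical names, and the only rule modelled is the one described above, namely that a local replica which meta no longer lists is removed unless it is in the potential state.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Toy model only: these are NOT the Pegasus definitions.
enum class toy_status { PRIMARY, SECONDARY, POTENTIAL_SECONDARY, INACTIVE };

// Hypothetical stand-in for the check made while reconciling with meta:
// a replica that meta no longer knows about is removed, except when it
// is still in the potential (learning) state.
constexpr bool removed_on_meta_sync(toy_status s) {
    return s != toy_status::POTENTIAL_SECONDARY;
}

// Given the local replicas of an app that has just been dropped
// (partition index -> status), return the partition indexes that
// survive the sync. Meta lists none of them any more.
std::vector<int> leftover_after_drop(const std::map<int, toy_status>& local) {
    std::vector<int> left;
    for (const auto& kv : local) {
        assert(kv.first >= 0); // partition indexes are non-negative
        if (!removed_on_meta_sync(kv.second)) {
            left.push_back(kv.first);
        }
    }
    return left;
}
```

In this model, a node holding partitions 13 and 97 as POTENTIAL_SECONDARY and partition 20 as SECONDARY keeps 13 and 97 after the drop, which matches the shape of the on_gc() output in the report. Whether the right fix is to special-case dropped apps in this path or to close the learner elsewhere is left open here.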
   

