You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@pegasus.apache.org by GitBox <gi...@apache.org> on 2021/04/13 02:28:11 UTC

[GitHub] [incubator-pegasus] ZhongChaoqiang opened a new issue #719: data loss

ZhongChaoqiang opened a new issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719


   ## Bug Report
   
   Please answer these questions before submitting your issue. Thanks!
   
   1. What did you do?
   If possible, provide a recipe for reproducing the error.
   
   2. What did you expect to see?
   
   3. What did you see instead?
   
   4. What version of Pegasus are you using?
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 edited a comment on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 edited a comment on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`是从manifest中读到的last_flushed_decree的值，随后`_last_durable_decree`被更新成和`_last_committed_decree`一样的值，如果出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush，导致没有更新last_flushed_decree：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行flush导致的这个问题？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下）。
   
   update_init_info 这块有个注释是说在learn之后需要再一次打checkpoint否则app::open_internal会出错（好像就是你说的这个问题）:
   https://github.com/XiaoMi/rdsn/blob/master/src/replica/replica_learn.cpp#L769-L779
   所以我觉得可能本身就是这么设计的，init_info里面需要写入_last_committed_decree，至于为什么可能需要其他committer解答下:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] ZhongChaoqiang commented on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

ZhongChaoqiang commented on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-825351849


   @zhangyifan27 
   谢谢你的恢复！不好意思啊，没有及时看到你的信息！
   这个问题比较久远了，我们是在现网碰到的这个问题，具体怎么操作导致的这个问题当时也没有具体说明，但是应该是有多次的重启操作的。
   由于我们把err的清理时间（gc_disk_error_replica_interval_seconds）设置过短了，当时的确是丢数据了。（primary的机器下线了，secondary的节点open的时候把数据移到了err目录），所以用户读不到数据了。
   
   还有一个关键日志：在open replica的时候，打印了如下日志。
   ```
   E2020-12-03 21:27:57.929 (1607002077929018092 3b14) replica.replica0.04010000000000d9: replication_app_base.cpp:347:open_internal(): 35.24@10.32.82.225:34801: replica data is not complete coz last_durable_decree(16368) < init_durable_decree(16371)
   E2020-12-03 21:27:57.929 (1607002077929048291 3b14) replica.replica0.04010000000000d9: replication_app_base.cpp:353:open_internal(): 35.24@10.32.82.225:34801: open replica app return ERR_INCOMPLETE_DATA
   ```
   这里表明，open replica的时候，我们记录的last_durable_decree是不能少于从.init-info读取出来的init_durable_decree的。因为这样有可能是以为丢失了数据，此时的replica数据应该是不完整的了，所以需要把数据移到err目录下。
   
   怎么样保证last_durable_decree不少于init_durable_decree？由于我们对degree的机制了解的不是很深，目前我们想到的办法是last_committed_decree改成last_durable_decree，应该可以优化这个问题。
   
   怎么重现这个问题我好像没有想到好的方法，所以也是从代码上去分析这个问题的。麻烦你们再帮忙看看是不是会存在这样的问题？谢谢！


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 edited a comment on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 edited a comment on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`是从manifest中读到的last_flush_decree的值，随后`_last_durable_decree`被更新成和`_last_committed_decree`一样的值，如果出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush，导致last_flush_decree没有持久化到manifest中：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行flush导致的这个问题？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下）。
   
   update_init_info 这块有个注释是说在learn之后需要再一次打checkpoint否则app::open_internal会出错（好像就是你说的这个问题）:
   https://github.com/XiaoMi/rdsn/blob/master/src/replica/replica_learn.cpp#L769-L779
   所以我觉得可能本身就是这么设计的，init_info里面需要写入_last_committed_decree，至于为什么可能需要其他committer解答下:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] ZhongChaoqiang commented on issue #719: data loss

Posted by GitBox <gi...@apache.org>.

ZhongChaoqiang commented on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-818388685


   下面是分析过程：
   
   1. last_durable_decree为上一次持久化的点，打开replica时从rocksdb的数据目录中读取出来，而init_durable_decree是从.init-info文件中读取出来。.init-info文件一般是在learn的时候生成，我们继续分析日志，发现replica在发生问题前是有发生learn的操作的，并且有如下日志：
   `D2020-12-03 21:27:09.499 (1607002029499473085 7754) replica.rep_long3.0404000d000021bd: replica_learn.cpp:1075:on_copy_remote_state_completed(): 35.24@10.32.82.225:34801: on_copy_remote_state_completed[000001a600000002]: learnee = 10.32.82.16:34801, learn_duration = 5 ms, apply checkpoint/log done, err = ERR_OK, last_prepared_decree = (16369 => 16371), last_committed_decree = (16368 => 16371), app_committed_decree = (16368 => 16371), app_durable_decree = (16368 => 16368), remote_committed_decree = 16371, prepare_start_decree = -1, current_learning_status = replication::learner_status::LearningWithoutPrepare`
   
   可以发现经过learn后，last_committed_decree变成了16371，而app_durable_decree还是16368。我们继续分析代码，发现代码中learn后写入.init-info文件中的init_durable_decree的值并不是last_durable_decree的值，而是last_committed_decree的值。
   
   ```
   err = _app->update_init_info(
         this,
         _stub->_log->on_partition_reset(get_gpid(), _app->last_committed_decree()),
         _private_log->on_partition_reset(get_gpid(), _app->last_committed_decree()),
         _app->last_committed_decree());
   ```
   
   init_durable_decree应该是记录last_durable_decree的值，而不是last_committed_decree的值；last_committed_decree表示上一次已经提交的修改，但不一定已经持久化了。所以last_committed_decree总是大于或等于last_durable_decree的值的。


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 edited a comment on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 edited a comment on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`是从manifest中读到的last_flush_decree的值，随后`_last_durable_decree`被更新成和`_last_committed_decree`一样的值，如果出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush，导致last_flush_decree没有持久化到manifest中：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行flush导致的这个问题？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下）。
   
   update_init_info 这块有个注释是说在copy checkpoint之后需要再一次打checkpoint否则open_internal会出错:
   https://github.com/XiaoMi/rdsn/blob/master/src/replica/replica_learn.cpp#L769-L779
   我觉得可能本身就是这么设计的，至于为什么可能需要其他committer解答下


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 commented on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 commented on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下）, init info里面记录last_committed_decree的值应该没什么问题，因为数据已经成功写入了，open replica时是需要replay这部分数据的，如果把它改成last_durable_decree 反而是有问题的。
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`被初始化为last_flush_decree，`_last_committed_decree`被更新成和`_last_committed_decree`一样的值，如果后面出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush导致last_flush_decree没有写入manifest（2.0.0之后的版本是写到meta column family）：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行这个flush导致的这个问题？


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 edited a comment on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 edited a comment on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下）, init info里面记录last_committed_decree的值应该没什么问题，因为数据已经成功写入了，open replica时是需要replay这部分数据的，如果把它改成last_durable_decree 反而是有问题的。
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`被初始化为last_flush_decree，随后`_last_durable_decree`被更新成和`_last_committed_decree`一样的值，如果出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush，导致last_flush_decree没有持久化到manifest中：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行flush导致的这个问题？


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org

[GitHub] [incubator-pegasus] zhangyifan27 edited a comment on issue #719: data loss after restarting

Posted by GitBox <gi...@apache.org>.

zhangyifan27 edited a comment on issue #719:
URL: https://github.com/apache/incubator-pegasus/issues/719#issuecomment-822382925


   @ZhongChaoqiang 
   能具体描述一下这个问题要怎么复现吗？我测试了下learn之后重启，并没有复现出`ERR_INCOMPLETE_DATA`相关的log。
   另外出现这个错误之后，open replica失败，replica及相关数据会被删除，replica server会重新learn一份数据，应该是不会丢数据的，实际中你们发现丢数据的现象了吗？
   
   看了下replica start这块的逻辑：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1706-L1752 `_last_committed_decree`被初始化为rocksdb的last_flushed_decree，随后`_last_durable_decree`被更新成和`_last_committed_decree`一样的值，如果出现`last_durable_decree() < _info.init_durable_decree`，说明replica重启之前可能没有正常flush，导致没有更新last_flushed_decree：https://github.com/apache/incubator-pegasus/blob/a948e89b180b6a5c82d298d0dcc65f7bb770a8be/src/server/pegasus_server_impl.cpp#L1805-L1812
   我觉得你可以看下是不是replica server进程退出时没有执行flush导致的这个问题？
   
   `_last_committed_decree`在数据成功写入log和memtable之后会更新，而`_last_durable_decree`好像是打checkpoint时才会更新（这部分不太确定, @neverchanje 可以解答下），不太确定.init_info里面记录_last_durable_decree有没有问题。
   
   另外update_init_info 这块有个注释是说在learn之后需要再一次打checkpoint否则app::open_internal会出错（好像就是你提的这个问题）:
   https://github.com/XiaoMi/rdsn/blob/master/src/replica/replica_learn.cpp#L769-L779
   所以我觉得可能本身就是这么设计的，init_info里面需要写入_last_committed_decree，至于为什么可能需要其他committer解答下:)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pegasus.apache.org
For additional commands, e-mail: dev-help@pegasus.apache.org