Posted to issues@kudu.apache.org by "Grant Henke (Jira)" <ji...@apache.org> on 2020/06/03 15:45:00 UTC
[jira] [Updated] (KUDU-2942) A rare flaky test for the aggregated live row count
[ https://issues.apache.org/jira/browse/KUDU-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Grant Henke updated KUDU-2942:
------------------------------
Component/s: test
> A rare flaky test for the aggregated live row count
> ---------------------------------------------------
>
> Key: KUDU-2942
> URL: https://issues.apache.org/jira/browse/KUDU-2942
> Project: Kudu
> Issue Type: Bug
> Components: test
> Reporter: LiFu He
> Priority: Major
> Attachments: ts_tablet_manager-itest.txt
>
>
> A few days ago, Adar hit a rare flaky test failure for the live row count in TSAN mode.
>
> {code:java}
> /home/jenkins-slave/workspace/kudu-master/3/src/kudu/integration-tests/ts_tablet_manager-itest.cc:642
> Expected: live_row_count
> Which is: 327
> To be equal to: table_info->GetMetrics()->live_row_count->value()
> Which is: 654
> {code}
> The metric value appears to be doubled (654 instead of the expected 327). His full test output is in the attachment.
>
> I reviewed the previous patches and came up with some unusual guesses. I think one of them could explain the issue:
> When a master has just become the leader and two heartbeat messages from the same tserver are processed in parallel at [line 4239|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/master/catalog_manager.cc#L4239], the metric value will be doubled because both can access the old tablet stats concurrently.
> So the question becomes: how can two heartbeat messages from the same tserver be generated at the same time? A possible answer is the [first heartbeat message|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/integration-tests/ts_tablet_manager-itest.cc#L741] and the [second heartbeat message|https://github.com/apache/kudu/blob/1bdae88faefe9b0d43b6897d96cd853bc5dd7353/src/kudu/integration-tests/ts_tablet_manager-itest.cc#L635].
> Please keep in mind that the above case arises in the integration test environment, not in production.
>
>
>
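The suspected race can be illustrated with a small, deterministic C++ sketch. It is a simplified model written for this message, not the actual catalog manager code: the types TabletStats and TableMetrics and the functions ComputeDelta, ApplyDelta and ProcessTwoReports are all hypothetical names. The sketch assumes the table metric is updated by adding the difference between the reported and the previously stored tablet stats, and it reproduces the interleaving by hand instead of using threads.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the suspected race.
// None of these names come from the Kudu source tree.
struct TabletStats {
  int64_t live_row_count = 0;
};

struct TableMetrics {
  int64_t live_row_count = 0;
};

// Step 1 of handling a report: read the previously stored stats
// and compute how much the table-level metric should change.
int64_t ComputeDelta(const TabletStats& stored, const TabletStats& reported) {
  return reported.live_row_count - stored.live_row_count;
}

// Step 2 of handling a report: apply the delta to the table metric
// and remember the reported stats for the next report.
void ApplyDelta(TableMetrics* metrics, TabletStats* stored,
                const TabletStats& reported, int64_t delta) {
  assert(metrics != nullptr && stored != nullptr);
  metrics->live_row_count += delta;
  *stored = reported;
}

// Processes two identical reports from the same tserver and returns the
// resulting table metric. With interleaved == true, both reports read the
// old stats before either one stores the new stats, which models two
// heartbeats being handled in parallel without proper serialization.
int64_t ProcessTwoReports(bool interleaved) {
  TableMetrics metrics;
  TabletStats stored;  // A new leader master has no stats yet (assumption).
  TabletStats reported;
  reported.live_row_count = 327;

  if (interleaved) {
    const int64_t d1 = ComputeDelta(stored, reported);  // reads old stats
    const int64_t d2 = ComputeDelta(stored, reported);  // also reads old stats
    ApplyDelta(&metrics, &stored, reported, d1);
    ApplyDelta(&metrics, &stored, reported, d2);
  } else {
    const int64_t d1 = ComputeDelta(stored, reported);
    ApplyDelta(&metrics, &stored, reported, d1);
    const int64_t d2 = ComputeDelta(stored, reported);  // sees updated stats
    ApplyDelta(&metrics, &stored, reported, d2);
  }
  return metrics.live_row_count;
}
```

Under these assumptions, ProcessTwoReports(false) yields 327 because the second report computes a delta of zero, while ProcessTwoReports(true) yields 654, which matches the doubled value in the test failure. Whether the real code path behaves this way still needs to be confirmed against catalog_manager.cc.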
--
This message was sent by Atlassian Jira
(v8.3.4#803005)