Posted to dev@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/02/24 16:58:40 UTC

[kudu-CR] WIP: KUDU-1341 test

Hello David Ribeiro Alves,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/2300

to review the following change.

Change subject: WIP: KUDU-1341 test
......................................................................

WIP: KUDU-1341 test

This adds a new test which reproduces the following bootstrap error:
- a row is written and flushed to a DRS
- the row is updated and deleted, but the DMS is not flushed
- the row is inserted again into MRS and flushed again
- the TS restarts

When bootstrap begins, we have the following tablet state:

- DRS 1: contains the row (original inserted version)
- DRS 2: contains the row (second inserted version)

and the following log:

  REPLICATE 1.1: INSERT orig version
  COMMIT 1.1: dms id = 1
    - bootstrap correctly decides that this is already flushed

  REPLICATE 1.2: UPDATE
  COMMIT 1.2: drs_id = 1, dms_id = 1
    - bootstrap correctly decides that the update is not flushed
    - it applies the update to the tablet
      - the tablet MutateRow code checks the interval tree and finds
        that this row could be in either DRS 1 or DRS 2
      - it tries applying the update to them in whatever order the
        RowSetTree returns the DiskRowSets
      - if it happens to try DRS 2 first, then we incorrectly apply
        the update to the new version of the row instead of the old
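
To make the order dependence concrete, here is a minimal C++ sketch of the
"try each candidate until one accepts" pattern described above. It is not
Kudu's actual code: ToyRowSet, MakeRowSet and ApplyUpdate are hypothetical
stand-ins for DiskRowSet and the tablet MutateRow path, and a rowset is
modelled as a plain key/value map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a DiskRowSet (not Kudu's API).
struct ToyRowSet {
  int id = 0;
  std::map<std::string, std::string> rows;

  // Applies the update if this rowset holds the key; returns true on success.
  bool MutateRow(const std::string& key, const std::string& new_value) {
    auto it = rows.find(key);
    if (it == rows.end()) return false;
    it->second = new_value;
    return true;
  }
};

// Builds a rowset holding a single row.
ToyRowSet MakeRowSet(int id, const std::string& key, const std::string& value) {
  ToyRowSet rs;
  rs.id = id;
  rs.rows[key] = value;
  return rs;
}

// Tries each candidate in the order given and stops at the first one that
// accepts the update. Returns that rowset's id, or -1 if none accepted.
int ApplyUpdate(const std::vector<ToyRowSet*>& candidates,
                const std::string& key, const std::string& new_value) {
  for (ToyRowSet* rs : candidates) {
    if (rs->MutateRow(key, new_value)) return rs->id;
  }
  return -1;
}
```

With one rowset holding the original version of a key and another holding
the re-inserted version, ApplyUpdate({&drs1, &drs2}, ...) changes the
original version, while ApplyUpdate({&drs2, &drs1}, ...) changes the
re-inserted one. Nothing in the loop itself distinguishes the two cases.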

We haven't seen this bug in most of our testing because the RowSetTree
returns rowsets in a deterministic order. We only test the
insert-update-delete-insert sequence in a few test cases, and it appears
that those test cases either:
1) do not restart the tablet server, and thus don't see bootstrap issues, or
2) happen to result in RowSetTrees that return the DiskRowSets in the lucky
   order that doesn't trigger the bug.

This patch adds a test case for the problematic sequence and also adds
a temporary random shuffling of the results of the RowSetTree, to see if
non-determinism will expose this bug in more test cases.
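
As a rough illustration of the shuffling idea (not the code in this patch),
the sketch below randomizes a candidate list and checks whether the result
of some selection step changes with the ordering. Shuffled and
ResultDependsOnOrder are hypothetical helper names, and the seed and trial
count are arbitrary assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <random>
#include <set>
#include <vector>

// Returns a copy of `candidates` in a pseudo-random order derived from `seed`.
template <typename T>
std::vector<T> Shuffled(std::vector<T> candidates, unsigned seed) {
  std::mt19937 rng(seed);
  std::shuffle(candidates.begin(), candidates.end(), rng);
  return candidates;
}

// Runs `pick` against `trials` shuffled orderings and reports whether it
// ever produced more than one distinct result.
template <typename T>
bool ResultDependsOnOrder(
    const std::vector<T>& candidates,
    const std::function<T(const std::vector<T>&)>& pick,
    int trials = 64) {
  std::set<T> outcomes;
  for (int seed = 0; seed < trials; ++seed) {
    outcomes.insert(pick(Shuffled(candidates, static_cast<unsigned>(seed))));
  }
  return outcomes.size() > 1;
}
```

A "take the first candidate that matches" step is flagged as order
dependent by this check, whereas a step that picks by some property of the
candidates themselves (for example, the largest id) is not.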

Change-Id: I6017ef67ae236021f7e6bd19d21b89310b8e6894
---
M src/kudu/tablet/deltafile.cc
M src/kudu/tablet/deltafile.h
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/tablet_server-test.cc
4 files changed, 37 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/00/2300/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2300
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I6017ef67ae236021f7e6bd19d21b89310b8e6894
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@cloudera.com>