You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2017/05/10 18:04:12 UTC

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Will Berkeley has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6843

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................

KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Now that safe time works properly, this patch enables snapshot
checksum scans in ksck.

Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
---
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_remote-test.cc
2 files changed, 18 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/6843/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6843/2/src/kudu/tools/ksck_remote-test.cc
File src/kudu/tools/ksck_remote-test.cc:

PS2, Line 154: a chance establish a timestamp.
> Without this change the test fails regularly with a message like:
I see what you mean, my comment was more about verbage.
Are you sure that you might get that number from 2 replicas, not just 1?
That lagging amount is interesting though, seems like the replica hasn't updated the safe time ever. Might actually be a bug.

Don't want to delay this though, so regarding verbage how about:
"Wait for the first 100 writes so be committed so that there is a very high chance that all replicas have committed at least one message in each tablet, otherwise safe time might not have been and replicas might refuse snapshot scans because of lag"

os something like that


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6843/1//COMMIT_MSG
Commit Message:

PS1, Line 9: Now that safe time works properly, this patch enables snapshot
           : checksum scans in ksck.
> the commit message doesn't seem quite right, since no changes to the actual
You're right. It just turns the tests on to make sure it's working. The code in ksck was already there.


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6843/2/src/kudu/tools/ksck_remote-test.cc
File src/kudu/tools/ksck_remote-test.cc:

PS2, Line 154: a chance establish a timestamp.
missing a "to" but in general what do you mean by "establish a timestamp"?


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 3: Verified+1

-raft_consensus-itest failure is a flake I've seen a lot on master
-MultiThreadedTabletTest appears to be a rare known failure that is unrelated to the patch

-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6843/2/src/kudu/tools/ksck_remote-test.cc
File src/kudu/tools/ksck_remote-test.cc:

PS2, Line 154: a chance establish a timestamp.
> missing a "to" but in general what do you mean by "establish a timestamp"?
Without this change the test fails regularly with a message like:

Service unavailable: Timed out: could not wait for desired snapshot timestamp to be consistent: Tablet is lagging too much to be able to serve snapshot scan. Lagging by: 1494362699845 ms, (max is 30000 ms)

from 1-2 tablets. Sometimes a minicluster replica hasn't ever updated its timestamp by the time the snapshot scan request reaches it.


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6843/2/src/kudu/tools/ksck_remote-test.cc
File src/kudu/tools/ksck_remote-test.cc:

PS2, Line 154: a chance establish a timestamp.
> I see what you mean, my comment was more about verbage.
Ah ok. Your suggestion is much clearer so I will use something like it.

My comment about 1-2 tablets was misleading. More precisely, in the test table with 3 tablets, 9 replicas, I'd see 1-2 out of the 9 replicas have that problem-- not 2/3 replicas of the same tablet.


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has submitted this change and it was merged.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Now that safe time works properly, this patch enables
the snapshot checksum tests for ksck.

Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Reviewed-on: http://gerrit.cloudera.org:8080/6843
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
Tested-by: Will Berkeley <wd...@gmail.com>
---
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_remote-test.cc
2 files changed, 20 insertions(+), 28 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Will Berkeley: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6843/1//COMMIT_MSG
Commit Message:

PS1, Line 9: Now that safe time works properly, this patch enables snapshot
           : checksum scans in ksck.
the commit message doesn't seem quite right, since no changes to the actual ksck code are being made. This is just removing some workarounds from tests, no?


http://gerrit.cloudera.org:8080/#/c/6843/1/src/kudu/tools/ksck_remote-test.cc
File src/kudu/tools/ksck_remote-test.cc:

Line 154:       // Wait for the first 100 writes so each server should have a chance establish a timestamp.
hm, not quite following this. shouldn't servers always have a timestamp?


-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6843

to look at the new patch set (#3).

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................

KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Now that safe time works properly, this patch enables
the snapshot checksum tests for ksck.

Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
---
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_remote-test.cc
2 files changed, 20 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/6843/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6843

to look at the new patch set (#2).

Change subject: KUDU-1056 and KUDU-1020 Safe time for ksck checksum
......................................................................

KUDU-1056 and KUDU-1020 Safe time for ksck checksum

Now that safe time works properly, this patch enables
the snapshot checksum tests for ksck.

Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
---
M src/kudu/tools/ksck.cc
M src/kudu/tools/ksck_remote-test.cc
2 files changed, 18 insertions(+), 28 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/43/6843/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6843
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib45be20dcfa37fb85185302adf84d2c4a55f8c1e
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>