Posted to reviews@kudu.apache.org by "KeDeng (Code Review)" <ge...@cloudera.org> on 2023/04/19 06:30:12 UTC

[kudu-CR] [tool] add details show for not fully quiesced tserver

KeDeng has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19767 )


Change subject: [tool] add details show for not fully quiesced tserver
......................................................................

[tool] add details show for not fully quiesced tserver

We can run `kudu tserver quiesce start` with the
'--error_if_not_fully_quiesced' option to indicating
that all leaders have been moved away from the tablet
server and all on-going scans have completed for now.

However, in some cases, such as the presence of a
single replica table, it may result in quiesce operations
not being completed. If the total number of tables is
very large, we will not be able to easily locate the root
cause of quiesce jamming.

To solve this issue, I have added a parameter
'--show_details_if_not_fully_quiesced' to the quiesce
progress query, which in conjunction with parameter
'--error_if_not_fully_quiesced', can easily determine which
tablets or scanners are present, blocking the completion of
the query operation.

At the same time, I added a test to ensure that the new
parameter takes effect.

Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_server_quiescing-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_tserver.cc
M src/kudu/tserver/scanners-test.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver_admin.proto
13 files changed, 218 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/19767/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19767
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
Gerrit-Change-Number: 19767
Gerrit-PatchSet: 1
Gerrit-Owner: KeDeng <kd...@gmail.com>

[kudu-CR] [tool] add details show for not fully quiesced tserver

Posted by "Ashwani Raina (Code Review)" <ge...@cloudera.org>.
Ashwani Raina has posted comments on this change. ( http://gerrit.cloudera.org:8080/19767 )

Change subject: [tool] add details show for not fully quiesced tserver
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@10
PS2, Line 10: indicating
nit: indicate


http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@10
PS2, Line 10: error_if_not_fully_quiesced
Just curious - Today, if we don't use "error_if_not_fully_quiesced" option and there are some active scanners or tablet replica leaders present, what is the behavior?


http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@24
PS2, Line 24: blocking the completion of
            : the query operation.
Not sure what you mean by blocking the query operation. Do you mean what could have blocked quiesce command due to presence of any active scanners or tablet leaders on the ts if it was run without error_if_not_fully_quiesced option?




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
Gerrit-Change-Number: 19767
Gerrit-PatchSet: 2
Gerrit-Owner: KeDeng <kd...@gmail.com>
Gerrit-Reviewer: Ashwani Raina <ar...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 21 Apr 2023 12:36:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] [tool] add details show for not fully quiesced tserver

Posted by "KeDeng (Code Review)" <ge...@cloudera.org>.
Hello Ashwani Raina, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19767

to look at the new patch set (#3).

Change subject: [tool] add details show for not fully quiesced tserver
......................................................................

[tool] add details show for not fully quiesced tserver

We can run `kudu tserver quiesce start` with the
'--error_if_not_fully_quiesced' option to indicate
that all leaders have been moved away from the tablet
server and all on-going scans have completed for now.

However, in some cases, such as the presence of a
single replica table, it may result in quiesce operations
not being completed. If the total number of tables is
very large, we will not be able to easily locate the root
cause of quiesce jamming.

To solve this issue, I have added a parameter
'--show_details_if_not_fully_quiesced' to the quiesce
progress query, which in conjunction with parameter
'--error_if_not_fully_quiesced', can easily determine which
tablets or scanners are present, blocking the completion of
the query operation.

At the same time, I added a test to ensure that the new
parameter takes effect.

Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_server_quiescing-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_tserver.cc
M src/kudu/tserver/scanners-test.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver_admin.proto
13 files changed, 218 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/19767/3

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
Gerrit-Change-Number: 19767
Gerrit-PatchSet: 3
Gerrit-Owner: KeDeng <kd...@gmail.com>
Gerrit-Reviewer: Ashwani Raina <ar...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [tool] add details show for not fully quiesced tserver

Posted by "KeDeng (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19767

to look at the new patch set (#2).

Change subject: [tool] add details show for not fully quiesced tserver
......................................................................

[tool] add details show for not fully quiesced tserver

We can run `kudu tserver quiesce start` with the
'--error_if_not_fully_quiesced' option to indicating
that all leaders have been moved away from the tablet
server and all on-going scans have completed for now.

However, in some cases, such as the presence of a
single replica table, it may result in quiesce operations
not being completed. If the total number of tables is
very large, we will not be able to easily locate the root
cause of quiesce jamming.

To solve this issue, I have added a parameter
'--show_details_if_not_fully_quiesced' to the quiesce
progress query, which in conjunction with parameter
'--error_if_not_fully_quiesced', can easily determine which
tablets or scanners are present, blocking the completion of
the query operation.

At the same time, I added a test to ensure that the new
parameter takes effect.

Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
---
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/integration-tests/tablet_server_quiescing-itest.cc
M src/kudu/integration-tests/ts_tablet_manager-itest.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tools/tool_action_tserver.cc
M src/kudu/tserver/scanners-test.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
M src/kudu/tserver/tserver_admin.proto
13 files changed, 218 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/67/19767/2

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
Gerrit-Change-Number: 19767
Gerrit-PatchSet: 2
Gerrit-Owner: KeDeng <kd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [tool] add details show for not fully quiesced tserver

Posted by "KeDeng (Code Review)" <ge...@cloudera.org>.
KeDeng has posted comments on this change. ( http://gerrit.cloudera.org:8080/19767 )

Change subject: [tool] add details show for not fully quiesced tserver
......................................................................


Patch Set 3:

(3 comments)

Thanks for your reviews.

http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@10
PS2, Line 10: error_if_not_fully_quiesced
> Just curious - Today, if we don't use "error_if_not_fully_quiesced" option 
This will signal to Kudu to stop hosting leaders on the specified tablet server and to redirect new scan requests to other tablet servers.


http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@10
PS2, Line 10: indicate
> nit: indicate
Done


http://gerrit.cloudera.org:8080/#/c/19767/2//COMMIT_MSG@24
PS2, Line 24: blocking the completion of
            : the query operation.
> Not sure what you mean by blocking the query operation. Do you mean what co
The "error_if_not_fully_quiesced" option is used to query whether the quiescing operation has completed. If there are single-replica tables or incomplete scans, the quiescing operation will be blocked.
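
For readers following the thread, the interaction between the two options can be sketched roughly as below. This is only an illustrative Python model, not code from the patch: the names QuiesceSnapshot and check_quiesce, the keyword arguments, and the message wording are all assumptions made for the example. The only parts taken from the discussion are that a tablet server counts as fully quiesced once it hosts no tablet leaders and no active scanners, and that the details option is meant to name the tablets and scanners that remain.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class QuiesceSnapshot:
    """Hypothetical view of what a quiescing tablet server still hosts."""
    leader_tablet_ids: List[str] = field(default_factory=list)
    active_scanner_ids: List[str] = field(default_factory=list)

    def fully_quiesced(self) -> bool:
        # Fully quiesced: no leader replicas and no active scanners remain.
        return not self.leader_tablet_ids and not self.active_scanner_ids


def check_quiesce(snapshot: QuiesceSnapshot,
                  error_if_not_fully_quiesced: bool = False,
                  show_details: bool = False) -> Tuple[bool, str]:
    """Return (ok, message) for a quiesce progress query (illustrative only)."""
    if snapshot.fully_quiesced():
        return True, "fully quiesced"
    summary = "{} tablet leaders and {} active scanners remain".format(
        len(snapshot.leader_tablet_ids), len(snapshot.active_scanner_ids))
    if not error_if_not_fully_quiesced:
        # Without the error option, the query only reports the counts.
        return True, summary
    if show_details:
        # With details enabled, name the blockers instead of only counting them.
        summary += "; leader tablets: [{}]; scanners: [{}]".format(
            ", ".join(snapshot.leader_tablet_ids),
            ", ".join(snapshot.active_scanner_ids))
    return False, "not fully quiesced: " + summary


if __name__ == "__main__":
    # A single-replica tablet cannot hand off leadership, so it stays listed.
    snap = QuiesceSnapshot(["tablet-a"], ["scanner-1", "scanner-2"])
    print(check_quiesce(snap, error_if_not_fully_quiesced=True, show_details=True))
```

In this model the details option has no effect unless the error option is also set, which mirrors the commit message's description of the two parameters being used together.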




Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dde33f1c36831b62c321eeea2040bc4a5ac21d0
Gerrit-Change-Number: 19767
Gerrit-PatchSet: 3
Gerrit-Owner: KeDeng <kd...@gmail.com>
Gerrit-Reviewer: Ashwani Raina <ar...@cloudera.com>
Gerrit-Reviewer: KeDeng <kd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 25 Apr 2023 02:34:24 +0000
Gerrit-HasComments: Yes