Posted to issues@kudu.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/02/10 03:57:00 UTC

[jira] [Commented] (KUDU-3212) Location assignment improvements

    [ https://issues.apache.org/jira/browse/KUDU-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17282210#comment-17282210 ] 

ASF subversion and git services commented on KUDU-3212:
-------------------------------------------------------

Commit 32c5b9c60bc923e27f0a4c78508fc3e2d1276e28 in kudu's branch refs/heads/master from Alexey Serbin
[ https://gitbox.apache.org/repos/asf?p=kudu.git;h=32c5b9c ]

[master] turn off client location assignment by default

This patch turns off location assignment by default for clients
connecting to a Kudu cluster.  The assigned locations are cached,
but the way location assignment is performed is resource-consuming;
see [1] for details.

There aren't many benefits in assigning locations to clients so far:
the only nice property of a client with an assigned location vs a client
with no location assigned is that, in the former case, a client run from a
particular location would choose tablet servers in the same location
when performing a scan in ReplicaSelection::CLOSEST_REPLICA mode.

This is a first step towards addressing KUDU-3212.

[1] https://issues.apache.org/jira/browse/KUDU-3212

Change-Id: I78474ced0a0129b3f2b1add55f6f908a136106d0
Reviewed-on: http://gerrit.cloudera.org:8080/17024
Tested-by: Kudu Jenkins
Reviewed-by: Hao Hao <ha...@cloudera.com>
Reviewed-by: Grant Henke <gr...@apache.org>


> Location assignment improvements
> --------------------------------
>
>                 Key: KUDU-3212
>                 URL: https://issues.apache.org/jira/browse/KUDU-3212
>             Project: Kudu
>          Issue Type: Improvement
>          Components: client, master, tserver
>    Affects Versions: 1.10.1, 1.12.0, 1.11.1, 1.13.0
>            Reporter: Alexey Serbin
>            Priority: Major
>              Labels: performance, scalability
>
> Current implementation of location assignment has some room for improvement.  As of now, the following is understood:
> # Implementation-wise, Kudu masters could use the newly introduced [Subprocess|https://github.com/apache/kudu/tree/master/src/kudu/subprocess] functionality to run the location assignment script.  That would be more robust than the current fork/exec approach to running the script, especially for larger deployments where Kudu masters might have a high request-per-second rate (many active threads running, a lot of memory allocated, etc.).
> # Conceptually, Kudu tablet servers could have all the necessary information regarding their location at startup, and that information isn't going to change while the tablet server is running.  The server/machine they are running on is provisioned to be in some rack, availability zone, data center, etc., and that assignment doesn't change while the server is up and running.  So, a Kudu tablet server can be provided with information about its location upon startup; there is no need to consult the Kudu master about this.
> # Conceptually, Kudu clients might be aware of their location as well.
> To address item 1, it's necessary to update the current implementation of location assignment so that the script is run by a dedicated subprocess forked off early during the master's startup.  Ideally, to make it more robust, the subprocess server can run the location assignment script as a small server that takes an IP address or DNS name as input and provides a location label as output, maybe line-by-line.  The latter assumes changing the requirements for a location assignment script, and we should probably introduce a separate flag to specify the path to a script that runs in such a mode.  However, even with the current location assignment approach, where it's necessary to run a script for every location assignment request, using the {{Subprocess}} functionality would benefit larger deployments where the fork/exec sequence for a {{kudu-master}} process is slow and inefficient.
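
The "small server" variant of the script described for item 1 could look roughly like the sketch below: a long-lived process that reads one IP address per line and writes one location label per line. This is only an illustration under stated assumptions. The subnet table, the location labels, the "/unknown" fallback, and the function names are all hypothetical, and the issue does not specify the actual line protocol. DNS names are not resolved here and simply get the fallback label.

```python
import ipaddress

# Hypothetical subnet-to-location table; a real deployment would load its own.
SUBNET_LOCATIONS = [
    (ipaddress.ip_network("10.1.0.0/16"), "/dc1/rack1"),
    (ipaddress.ip_network("10.2.0.0/16"), "/dc1/rack2"),
]
# Hypothetical label returned when nothing matches.
DEFAULT_LOCATION = "/unknown"


def lookup_location(host: str) -> str:
    """Map an IP address to a location label; non-IP input gets the default."""
    try:
        addr = ipaddress.ip_address(host.strip())
    except ValueError:
        return DEFAULT_LOCATION
    for network, location in SUBNET_LOCATIONS:
        if addr in network:
            return location
    return DEFAULT_LOCATION


def serve(in_stream, out_stream) -> None:
    """Answer one request per input line until the input stream is closed."""
    for line in in_stream:
        out_stream.write(lookup_location(line) + "\n")
        # Flush after every reply so the caller is not left waiting on a buffer.
        out_stream.flush()
```

Calling serve(sys.stdin, sys.stdout) would turn this into a filter that a parent process can keep open across many requests, which is what avoids a fork/exec per location assignment.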
> To address item 2, it's necessary to introduce a new tablet server flag that is set to the assigned location for the tablet server.  The systemd/init.d startup script for kudu-tserver should populate the flag with the proper value.  It's also necessary to introduce a new field in the {{TSHeartbeatRequestPB}} message to pass the location from the tablet server to the master.  If the master sees the field populated, it should not run the location assignment script, even if the location assignment script is specified (i.e. the {{\-\-location_mapping_cmd}} flag is set).  This way it would be possible to perform rolling upgrades from older versions, which use a centrally managed location assignment script, to the version that implements the new approach.
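
The precedence rule proposed for item 2 can be sketched as follows: a location reported in the heartbeat wins, and the mapping command is consulted only when the field is empty. This is a Python model of the decision, not Kudu's C++ code. The function and parameter names are hypothetical, and passing the host as the command's single argument is an assumption about how the mapping command is invoked.

```python
import subprocess
from typing import Callable, Optional, Sequence


def run_mapping_cmd(cmd: Sequence[str], host: str) -> str:
    """Run the mapping command with the host appended as its last argument."""
    result = subprocess.run(
        [*cmd, host], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()


def resolve_tserver_location(
    reported_location: Optional[str],
    host: str,
    mapping_cmd: Optional[Sequence[str]],
    run_cmd: Callable[[Sequence[str], str], str] = run_mapping_cmd,
) -> Optional[str]:
    """Pick the location for a tablet server that has just sent a heartbeat."""
    # New-style tablet server: the heartbeat already carries the location,
    # so the script is skipped even when a mapping command is configured.
    if reported_location:
        return reported_location
    # Old-style tablet server during a rolling upgrade: fall back to the script.
    if mapping_cmd:
        return run_cmd(mapping_cmd, host)
    # No location information is available at all.
    return None
```

Keeping the script as a fallback is what makes the rolling upgrade workable: tablet servers that have not yet been restarted with the new flag keep getting their location from the master.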
> To address item 3, it's necessary to find a means to specify the location for a Kudu client.  Probably, an environment variable can be used for that.  The {{ConnectToMasterRequestPB}} can be extended to include an optional {{client_location}} field.  In addition, if {{\-\-master_client_location_assignment_enabled}} is set to {{true}}, the master could run the location assignment script to assign a location to a client which doesn't populate the newly introduced {{ConnectToMasterRequestPB::client_location}} field.
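
For item 3, the client and master halves of the proposal might be modelled as below. The environment variable name KUDU_CLIENT_LOCATION is purely hypothetical (the issue only says an environment variable could probably be used), as are the function names. The assignment_enabled parameter stands in for the master's client location assignment flag.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical variable name; the issue does not propose a specific one.
CLIENT_LOCATION_ENV = "KUDU_CLIENT_LOCATION"


def client_location_from_env(
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Client side: read the location to send in the connect request, if any."""
    value = env.get(CLIENT_LOCATION_ENV, "").strip()
    return value or None


def assign_client_location(
    client_location: Optional[str],
    client_host: str,
    assignment_enabled: bool,
    run_script: Callable[[str], str],
) -> Optional[str]:
    """Master side: decide which location a connecting client ends up with."""
    # A client that supplies its own location is taken at its word.
    if client_location:
        return client_location
    # Otherwise the script runs only when assignment is enabled on the master.
    if assignment_enabled:
        return run_script(client_host)
    # Assignment is off and the client sent nothing: no location.
    return None
```

With assignment disabled, a client that sets nothing simply has no location, which is the low-cost behavior for a master that would otherwise run the script for every connecting client.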



--
This message was sent by Atlassian Jira
(v8.3.4#803005)