Posted to issues@kudu.apache.org by "Todd Lipcon (JIRA)" <ji...@apache.org> on 2017/03/19 20:54:41 UTC
[jira] [Commented] (KUDU-1948) Client-side configuration of cluster details

    [ https://issues.apache.org/jira/browse/KUDU-1948?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15931961#comment-15931961 ] 

Todd Lipcon commented on KUDU-1948:
-----------------------------------

I chatted offline with [~danburkert] about this for a few minutes last week. Our proposal was something like the following:

- the client builder API would continue to have no "default" behavior, but it would gain a new call, something like:
{code}
new KuduClientBuilder().loadConfigurationForCluster("my-cluster")
{code}

This would have the effect of looking in various locations for a configured cluster called 'my-cluster':
- $KUDUCONFIG
- $HOME/.kudurc
- /etc/kudu/kudurc
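
As a rough illustration of the lookup described above, here is a minimal Java sketch that builds the candidate list in the order given. The class and method names (KuduConfigLocator, candidatePaths, existingConfigs) are hypothetical and not part of the Kudu client. The sketch also assumes that the listed order is the precedence order, and it leaves open whether the first matching file wins or the files are merged, since the proposal does not say.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the Kudu client API.
final class KuduConfigLocator {

    // Returns the candidate config files in the order listed in the proposal:
    // $KUDUCONFIG, then $HOME/.kudurc, then /etc/kudu/kudurc.
    // The environment map and home directory are parameters so the logic can be
    // exercised without touching the real environment.
    static List<Path> candidatePaths(Map<String, String> env, String home) {
        List<Path> candidates = new ArrayList<>();
        String explicit = env.get("KUDUCONFIG");
        if (explicit != null && !explicit.isEmpty()) {
            candidates.add(Paths.get(explicit));
        }
        if (home != null && !home.isEmpty()) {
            candidates.add(Paths.get(home, ".kudurc"));
        }
        candidates.add(Paths.get("/etc/kudu/kudurc"));
        return candidates;
    }

    // Keeps only the candidates that exist as regular files, preserving order.
    static List<Path> existingConfigs(List<Path> candidates) {
        List<Path> found = new ArrayList<>();
        for (Path p : candidates) {
            if (Files.isRegularFile(p)) {
                found.add(p);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        List<Path> candidates =
            candidatePaths(System.getenv(), System.getProperty("user.home"));
        System.out.println("Candidates: " + candidates);
        System.out.println("Present:    " + existingConfigs(candidates));
    }
}
```

Passing the environment in explicitly also keeps the lookup out of any static initializer, which fits the idea that library code should only touch these files when asked to.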

These would be some simple files (perhaps YAML) that look like:

{code}
clusters:
  my-cluster:
    masters:
      - foo1.example.com
      - foo2.example.com
      - foo3.example.com
    require_authentication: true
    require_encryption: true
    master_kerberos_principal: "my-custom-master-principal/_HOST@MY_REALM"
    tserver_kerberos_principal: "my-custom-tserver-principal/_HOST@MY_REALM"
  other-cluster:
    masters:
      - other.example.com
{code}
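
To make the per-cluster scoping of the file above concrete, here is a small Java sketch of how its parsed contents might be held in memory. Everything in it is an assumption for illustration: the names ClientConfigSketch, ClusterConfig, add and forCluster are hypothetical, YAML parsing is left out entirely, and the values used for 'other-cluster' are placeholders because the draft does not say what the defaults would be when a key is omitted.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory form of the config file: cluster name -> settings.
// Not part of the Kudu client API.
final class ClientConfigSketch {

    // Settings for one entry under "clusters:".
    static final class ClusterConfig {
        final List<String> masters;
        final boolean requireAuthentication;
        final boolean requireEncryption;

        ClusterConfig(List<String> masters,
                      boolean requireAuthentication,
                      boolean requireEncryption) {
            this.masters = Collections.unmodifiableList(masters);
            this.requireAuthentication = requireAuthentication;
            this.requireEncryption = requireEncryption;
        }
    }

    private final Map<String, ClusterConfig> clusters = new LinkedHashMap<>();

    ClientConfigSketch add(String name, ClusterConfig config) {
        clusters.put(name, config);
        return this;
    }

    // Every lookup names a cluster explicitly; there is no implicit default.
    ClusterConfig forCluster(String name) {
        ClusterConfig config = clusters.get(name);
        if (config == null) {
            throw new IllegalArgumentException(
                "no cluster named '" + name + "' in client configuration");
        }
        return config;
    }

    public static void main(String[] args) {
        ClientConfigSketch file = new ClientConfigSketch()
            .add("my-cluster", new ClusterConfig(
                Arrays.asList("foo1.example.com", "foo2.example.com", "foo3.example.com"),
                true, true))
            // Assumption: 'false' stands in for whatever the real defaults would be.
            .add("other-cluster", new ClusterConfig(
                Arrays.asList("other.example.com"), false, false));

        System.out.println(file.forCluster("my-cluster").masters);
    }
}
```

Failing loudly on an unknown cluster name, rather than falling back to some default cluster, keeps the behavior explicit.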

We also established some guiding principles:

- we should use these files only for configurations that we'd expect the _operator_ to set (e.g. security policies), and not for anything that different applications would want to configure differently (e.g. timeouts)
- all configs should be clearly scoped per-cluster (to preserve the ability to write cross-cluster applications without gymnastics)
- these files should _only_ be read by the client, and not by servers
- these files should be read only when an API explicitly references them (e.g. the "loadConfigurationForCluster()" API). We should avoid implicit behavior in library code.
-- Command line tools like 'kudu table list' could potentially be more implicit, or they could take a cluster identifier.


All of the above is just a brainstorm/draft, subject to change of course. When we get to actually implementing this, we should transfer everything into a Google doc, go through the normal design/review process, etc.

> Client-side configuration of cluster details
> --------------------------------------------
>
>                 Key: KUDU-1948
>                 URL: https://issues.apache.org/jira/browse/KUDU-1948
>             Project: Kudu
>          Issue Type: New Feature
>          Components: client, security
>    Affects Versions: 1.3.0
>            Reporter: Todd Lipcon
>
> In the beginning, Kudu clients were configured with only the address of the single Kudu master. This was nice and simple, and there was no need for a client "configuration file".
> Then, we added multi-masters, and the client API had to take a list of master addresses. This wasn't awful, but started to be a bit aggravating when trying to use tools on a multi-master cluster (who wants to type out three long hostnames in a 'ksck' command line every time?).
> Now with security, we have a couple more bits of configuration for the client. Namely:
> - "require SSL" and "require authentication" booleans -- necessary to prevent MITM downgrade attacks
> - custom Kerberos principal -- if the server wants to use a principal other than 'kudu/<HOST>@REALM' then the client needs to know to expect it and fetch the appropriate service ticket. (Note this isn't yet supported but would like to be!)
> In the future, there are other items that might be best specified as part of a client configuration as well (e.g. CA cert for BYO PKI, wire compression options, etc).
> For the above use cases it would be nicer to allow the various options to be specified in a configuration file rather than adding specific APIs for all options.


