Posted to dev@metron.apache.org by Casey Stella <ce...@gmail.com> on 2017/01/12 22:08:51 UTC

[DISCUSS] Ambari Metron Configuration Management consequences and call to action

In the course of discussion on the PR for METRON-652
<https://github.com/apache/incubator-metron/pull/415>, something that I
should definitely have understood better came to light, and I thought it
was worth bringing to the attention of the community for clarification and
discussion: just how we manage configs.

Currently (assuming the management UI that Ryan Merriman submitted) configs
are managed/adjusted via a few different mechanisms:

   - zk_load_configs.sh: pushes configs from disk to zookeeper and pulls
   them from zookeeper back to disk
   - Stellar REPL: pushed and pulled via the CONFIG_GET/CONFIG_PUT functions
   - Ambari: initialized via the zk_load_configs script and then some of
   them are managed directly (global config) and some indirectly
   (sensor-specific configs).
      - NOTE: Upon service restart, it may or may not overwrite changes on
      disk or on zookeeper.  *Can someone more knowledgeable than me about
      this describe precisely the semantics that we can expect on service
      restart for Ambari? What gets overwritten on disk and what gets
      updated in ambari?*
   - The Management UI: manages some of the configs. *RYAN: Which configs
   do we support here and which don't we support here?*

As you can see, we have a mishmash of mechanisms to update and manage the
configuration for Metron in zookeeper.  In the beginning the approach was
just to edit configs on disk and push/pull them via zk_load_configs.  The
history of the configs could be managed using source control, etc.  As we
got more and more components managing the configs, we haven't taken care
that they all work with each other in an expected way (I believe these are
true; correct me if I'm wrong):

   - If configs are modified in the management UI or the Stellar REPL and
   someone forgets to pull the configs from zookeeper to disk before doing
   a push via zk_load_configs, they will clobber the configs in zookeeper
   with old configs.
   - If the global config is changed on disk and the ambari service
   restarts, it'll get reset with the original global config.
   - *Ryan, in the management UI, if someone changes the zookeeper configs
   from outside, are those configs reflected immediately in the UI?*
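
To make the first failure mode above concrete, here is a minimal Python
sketch. It is not Metron code: VersionedStore is a hypothetical in-memory
stand-in for zookeeper, and "example.setting" is a made-up key. It shows a
blind push from a stale disk copy clobbering a newer config, and how a
version check on write (the kind of check zookeeper offers on updates)
would reject that push instead.

```python
class VersionedStore:
    """Hypothetical in-memory stand-in for zookeeper: path -> (data, version)."""

    def __init__(self):
        self._nodes = {}

    def get(self, path):
        # Missing nodes report version -1 so the first write becomes version 0.
        return self._nodes.get(path, (None, -1))

    def put(self, path, data, expected_version=None):
        _, current = self.get(path)
        # A guarded write only succeeds if nobody wrote since we last read.
        if expected_version is not None and expected_version != current:
            raise RuntimeError(
                f"stale write to {path}: expected v{expected_version}, "
                f"found v{current}"
            )
        self._nodes[path] = (dict(data), current + 1)
        return current + 1


store = VersionedStore()
path = "global"

# Initial push from disk; the operator keeps this copy around.
disk_copy = {"example.setting": "old"}
disk_version = store.put(path, disk_copy)

# Someone changes the config through the REPL or the management UI.
store.put(path, {"example.setting": "new"})

# A guarded push from the stale disk copy is rejected...
try:
    store.put(path, disk_copy, expected_version=disk_version)
    guarded_push_rejected = False
except RuntimeError:
    guarded_push_rejected = True

# ...whereas a blind push silently replaces the newer config.
store.put(path, disk_copy)
clobbered = store.get(path)[0] == disk_copy
```

The sketch only models the race; whether each of the existing mechanisms
could carry a version through its read/edit/write cycle is an open question.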


It seems to me that we have a couple of options here:

   - A service that mediates config update/retrieval and tracks historical
   changes, so that these different mechanisms share a common component for
   config management/tracking; the existing mechanisms would be refactored
   to use that service
   - Standardize on exactly one component to manage the configs and regress
   the others (that's a verb, right?   nicer than delete.)

I happen to like the service approach, myself, but I wanted to put it up
for discussion and hopefully someone will volunteer to design such a thing.
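
As a starting point for that design discussion, a rough sketch of the shape
such a service might take is below. Everything in it is an assumption made
for illustration: the ConfigService class, its method names, the "source"
labels and the "example.setting" key are hypothetical, and a real service
would sit in front of zookeeper rather than a Python dict.

```python
import copy
import time


class ConfigService:
    """Hypothetical single entry point for config reads and writes."""

    def __init__(self):
        self._current = {}   # config name -> latest config dict
        self._history = []   # append-only list of change records

    def get(self, name):
        # Hand out a copy so callers cannot mutate the stored config.
        return copy.deepcopy(self._current.get(name))

    def put(self, name, config, source):
        record = {
            "name": name,
            "source": source,             # e.g. "ambari", "repl", "ui", "cli"
            "timestamp": time.time(),
            "config": copy.deepcopy(config),
        }
        self._history.append(record)
        self._current[name] = record["config"]

    def history(self, name):
        # Every change to one config, oldest first.
        return [r for r in self._history if r["name"] == name]


service = ConfigService()
service.put("global", {"example.setting": "a"}, source="cli")
service.put("global", {"example.setting": "b"}, source="ui")
sources = [r["source"] for r in service.history("global")]
```

Because every writer goes through put(), the history records who changed
what and when, regardless of which front end made the change.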

To frame the debate, I want us to keep in mind a couple of things that may
or may not be relevant to the discussion:

   - We will eventually be moving to support Kerberos, so there should at
   least be a path to use Kerberos for any solution IMO
   - There is value in each of the different mechanisms in place now.  If
   there weren't, then they wouldn't have been created.  Before we try to make
   this a "there can be only one" argument, I'd like to hear very good
   arguments.

Finally, I'd appreciate it if some people would answer the questions I have
in bold above.  Hopefully this discussion, if nothing else happens, will
result in fodder for proper documentation of the ins and outs of each of
the components bulleted above.

Best,

Casey

Re: [DISCUSS] Ambari Metron Configuration Management consequences and call to action

Posted by Michael Miklavcic <mi...@gmail.com>.

Hi Casey,

Thanks for starting this thread. I believe you are correct in your
assessment of the 4 options for updating configs in Metron. When using more
than one of these options we can get into a split-brain scenario. A basic
example is updating the global config on disk and pushing it with
zk_load_configs.sh. Later, if a user decides to restart Ambari, the cached
version stored by Ambari (it's in the MySQL or other database backing
Ambari) will be written out to disk in the defined config directory and
subsequently loaded using zk_load_configs.sh under the hood. Any global
configuration modified outside of Ambari will be lost at this point. This
is obviously undesirable, but I also like the purpose and utility exposed
by the multiple config management interfaces we currently have available. I
also agree that a service would be best.
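
The restart sequence described above can be modelled in a few lines of
Python. This is only a toy model of the behaviour as I understand it, not
Ambari code: the three dicts stand in for the Ambari database, the config
directory on disk and zookeeper, and "example.setting" is a made-up key.

```python
def ambari_restart(ambari_db, disk, zookeeper):
    """Toy model of a restart: Ambari's stored copy wins over everything."""
    # Ambari writes its stored global config out to the config directory...
    disk["global"] = dict(ambari_db["global"])
    # ...and then pushes what is on disk into zookeeper.
    zookeeper["global"] = dict(disk["global"])


# All three copies start out in agreement.
ambari_db = {"global": {"example.setting": "from-ambari"}}
disk = {"global": {"example.setting": "from-ambari"}}
zookeeper = {"global": {"example.setting": "from-ambari"}}

# Out-of-band change: edit the file on disk and push it to zookeeper.
disk["global"]["example.setting"] = "edited-on-disk"
zookeeper["global"] = dict(disk["global"])

# The restart restores Ambari's copy; the out-of-band edit is gone.
ambari_restart(ambari_db, disk, zookeeper)
```

After the restart both disk and zookeeper hold "from-ambari" again, which is
the split-brain loss described above.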

For reference, here's my understanding of the current configuration loading
mechanisms and their deps.

[image: Inline image 1]

Mike

