Posted to dev@metron.apache.org by mattf-horton <gi...@git.apache.org> on 2017/01/26 08:40:15 UTC

[GitHub] incubator-metron pull request #425: METRON 609 Enhance Mpack to handle singl...

GitHub user mattf-horton opened a pull request:

    https://github.com/apache/incubator-metron/pull/425

    METRON 609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch

    This PR is not ready for prime time, but is provided for ease of access to work-in-progress for:
    - METRON-609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch, and 
    - METRON-634 Mpack bug fixes and improvements (not related to singlenode install).  
    
    These are presented as two separate commits, so you can look at them separately if you wish.
    
    These are the included enhancements from METRON-609:
    - Enable 1-, 2-, and 3-node clusters to have a working Elasticsearch install via the Mpack.
    	- Change constraints from 1+ Masters and 3+ Slaves, to 1+ and 0+.
    	- Allow non-dedicated master/datanodes via boolean "masters_also_are_datanodes".
    	- Allow use of alternative single-node template via "single_node_elasticsearch" boolean.
    	- Only the 1- and 4-node clusters have been tested, last month.
    - Improve various mouse-over Description fields in the GUI.
    - Include an attempted validation check that (Storm) num_slots = slots_per_supervisor * num_supervisors. It doesn't currently work because of a pre-existing bug in other parts of the validation check, so I haven't been able to test it.
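
The Storm slot check in the last item above can be sketched as follows. This is a minimal illustration that assumes the three values are already available as integers; the function name and its arguments are hypothetical and are not the validation code in the Mpack.

```python
def check_storm_slots(num_slots, slots_per_supervisor, num_supervisors):
    """Return an error message if the slot counts disagree, else None.

    Hypothetical helper: the names mirror the PR text, not the Mpack code.
    """
    expected = slots_per_supervisor * num_supervisors
    if num_slots != expected:
        return ("num_slots is {0}, but slots_per_supervisor * num_supervisors "
                "= {1} * {2} = {3}".format(
                    num_slots, slots_per_supervisor, num_supervisors, expected))
    return None


# A consistent configuration produces no message; an inconsistent one does.
ok_result = check_storm_slots(4, 2, 2)
bad_result = check_storm_slots(4, 2, 3)
```

Returning a message rather than raising keeps the sketch easy to plug into whatever error-reporting shape the real validation layer expects.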
    
    These are the included enhancements and bug fixes from METRON-634:    
    NOT AFFECTING THE AMBARI DATABASE:
    - ES pid_dir specification and usage:
    	- Currently pid_dir is specified in both elastic-env.xml and params.py. The config parameter should not be overridden in params.py.
    	- PID_DIR failed to be included in /etc/sysconfig/elasticsearch. It needs to be added to the template in elastic-sysconfig, as it must be provided to ES at launch-time (else the default directory will be used).
    	- pid_file is specified in params.py, but is not used anywhere. (The ES internal launcher synthesizes it from PID_DIR, and this is appropriate.)
    - JAVA_HOME needs to be provided in /etc/sysconfig/elasticsearch (templated in elastic-sysconfig.xml). Its absence causes CentOS 7 systemctl to fail the ES launch unless /bin/java is defined (which it isn't necessarily).
    - Also in the /etc/sysconfig/elasticsearch template in elastic-sysconfig.xml, the value of ES_JAVA_OPTS incorrectly spans 3 lines. The lines must be terminated with backslashes to effectively become a single line. The current inclusion of newlines in the long string value is acceptable (although unusual) in a shell script, but not in a systemd EnvironmentFile. /etc/sysconfig/elasticsearch must function as both.
    - Also in ES_JAVA_OPTS, the two instances of log_dir need to be followed by a slash '/'.
    - In elastic.py, when directories are being pre-created and permissions set, the directory $CONF_DIR/scripts should also be pre-created. I intermittently hit permissions issues with this directory being created later by root and not properly assigned to elastic_user.
    - In several places in elastic.py, "params.elastic_user" is incorrectly used when "params.user_group" should be used.
    - An undefined "format()" method is used unnecessarily in elastic.py, in File(format("/etc/sysconfig/elasticsearch")...
    - The undefined "format()" method is similarly used unnecessarily in several places in elastic_master.py.
    - The comments and descriptions in elastic-site.xml have multiple suggested improvements.
    - Provide Quick Links in Ambari service page for Elasticsearch to the self-report pages for ES health and ES node list. (very useful for debugging)
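
The ES_JAVA_OPTS item above asks for a value that spans several physical lines but reads as one logical line in both a shell script and a systemd EnvironmentFile. A minimal sketch of rendering such a value, assuming the options are held as a list of strings; the helper name, the log_dir value, and the JVM options shown are illustrative only and are not taken from the Mpack template.

```python
def join_with_backslashes(name, parts):
    """Render NAME="..." over several lines, each but the last ending in a backslash.

    Hypothetical helper. The space before each backslash keeps the options
    separated once the continuation lines are joined.
    """
    body = " \\\n".join(parts)
    return '{0}="{1}"'.format(name, body)


log_dir = "/var/log/elasticsearch"  # assumed value, for illustration only
rendered = join_with_backslashes("ES_JAVA_OPTS", [
    "-verbose:gc",
    "-Xloggc:{0}/gc.log".format(log_dir),          # slash after log_dir
    "-XX:ErrorFile={0}/hs_err.log".format(log_dir),
])
print(rendered)
```

Every line except the last ends in a backslash, so nothing in the value relies on a bare newline inside the quoted string.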
    
    CHANGES THAT DO AFFECT THE AMBARI DATABASE:
    - pid_dir SHOULD be specified in elastic-sysconfig.xml, rather than elastic-env.xml, as it is a parameter that must be provided to ES at launch-time, but is not something there's any reason for the admin to change in usual circumstances.
    - conf_dir SHOULD be specified in elastic-env.xml or elastic-site.xml, not in elastic-sysconfig.xml. While it too is a parameter that must be provided to ES at launch-time, it is typically left to the installing admin where to put the config files.
    - The Ambari configuration parameter names in elastic-site.xml should be improved in several instances to make the semantics more obvious to the human reader (who may not be very familiar with Elasticsearch configuration). Mouse-over documentation will continue to provide the ES config parameter equivalents. In particular, suggest:
    	- cluster_name -> es_cluster_name  (to distinguish ES cluster from Stack cluster)
    	- zen_discovery_ping_unicast_hosts -> es_cluster_hosts
    	- network_host -> network_bindings  (these are in fact interface names, not host names)
    - There are at least two places in elasticsearch.master.yaml.j2 (zen_discovery_ping_unicast_hosts and network_host) where needed square brackets are either missing or included in the configuration string. To be consistent with other usages, and less prone to human error, the square brackets should not be in the configuration string but rather should be provided in the template text.
    - In METRON/0.3.0/configuration/metron-env.xml and METRON/0.3.0/package/scripts/params/params_linux.py, the value "metron_apps_indexed_hdfs_dir" does not need to be settable by admin; it is appropriate to require it to be subordinate to "metron_apps_hdfs_dir". Thus it can be removed from metron-env.xml and set to "{metron_apps_hdfs_dir}/indexing/indexed" in params_linux.py. This also eliminates a really unacceptable use of "double format".
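
Two of the items above lend themselves to a short sketch: supplying the square brackets in the template text rather than in the configuration string, and deriving metron_apps_indexed_hdfs_dir from metron_apps_hdfs_dir. Both helpers below are hypothetical and use plain Python string formatting instead of the Jinja2 template and Ambari params code; they assume the configured value is a bare comma-separated string.

```python
def render_list_setting(key, raw_value):
    """Render 'key: [a, b]' from a bare comma-separated configuration value.

    The brackets come from this rendering step, so the admin never types them.
    """
    items = [item.strip() for item in raw_value.split(",") if item.strip()]
    return "{0}: [{1}]".format(key, ", ".join(items))


def indexed_hdfs_dir(metron_apps_hdfs_dir):
    """Derive the indexed directory from its parent with a single format call."""
    # rstrip("/") is an added convenience to avoid a doubled slash.
    return "{0}/indexing/indexed".format(metron_apps_hdfs_dir.rstrip("/"))


# Host names and the HDFS path are made-up example values.
hosts_line = render_list_setting("discovery.zen.ping.unicast.hosts", "node1, node2")
indexed_dir = indexed_hdfs_dir("/apps/metron")
```

Keeping the brackets in one place means every list-valued setting is rendered the same way, which is the consistency the item argues for.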
    
    NOTE that these changes, because they affect the database, should properly be accompanied by a database update script and a version increment in the Mpack version number.  This is not currently implemented.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mattf-horton/incubator-metron METRON-609

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/425.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #425
    
----
commit 0fd12a5bab2745e7c496657ef92b792b60faf2bf
Author: mattf-horton <mf...@hortonworks.com>
Date:   2017-01-25T22:39:23Z

    METRON-609 Enhance Mpack to handle single-node and small-cluster installs of Elasticsearch.  Work in Progress, at request of David Lyle.

commit 1af5376d59fe4c1812bda519e9b960dc74fdb0d6
Author: mattf-horton <mf...@hortonworks.com>
Date:   2017-01-26T07:41:04Z

    METRON-634 Mpack bug fixes and improvements (not related to singlenode install). Partial: all improvements from METRON-634 already proved out in METRON-608.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-metron issue #425: METRON 609 Enhance Mpack to handle single-node ...

Posted by dlyle65535 <gi...@git.apache.org>.
Github user dlyle65535 commented on the issue:

    https://github.com/apache/incubator-metron/pull/425
  
    I think I can collapse both templates into 1, at least that's what I was thinking.
    
    Just so you're aware @mattf-horton (and other interested parties): I've got a [METRON-671](https://issues.apache.org/jira/browse/METRON-671) PR coming out in the next few days. It addresses a small subset of this changeset (METRON-641) and a few templating issues that will allow us to run on 1 node or 3+ nodes. It turns out those were the only working configurations under Ansible automation.
    
    Immediately after that, I want to get this change incorporated, so I'll be testing it on the branch that deploys with Ansible using the MPack.
    
    I'm using that order so that we can accelerate the path to using the MPack in Ansible. Otherwise, we'll keep having changes in Ansible that aren't reflected in the MPack.
    
    Please let me know if that approach makes sense or if there are any concerns.
    
    Thanks!



[GitHub] incubator-metron issue #425: METRON 609 Enhance Mpack to handle single-node ...

Posted by dlyle65535 <gi...@git.apache.org>.
Github user dlyle65535 commented on the issue:

    https://github.com/apache/incubator-metron/pull/425
  
    Hi @mattf-horton - thanks for putting this up! It's got my "have to haves" plus a bunch more good stuff. I'll get working with it today. If I discover any small changes, I'll put a PR on your branch.
    
    Thanks again, this is great stuff.



[GitHub] incubator-metron issue #425: METRON 609 Enhance Mpack to handle single-node ...

Posted by mattf-horton <gi...@git.apache.org>.
Github user mattf-horton commented on the issue:

    https://github.com/apache/incubator-metron/pull/425
  
    Thanks, @dlyle65535 .  Couple more comments:
    
    - Also included is the small change proposed in METRON-641 to use {0} instead of {} in python format strings in kibana_master.py.  It shouldn't be necessary for Python 2.7, but these are the only usages of the 2.7 behavior, and there's no harm in making it also work with Python 2.6.
    
    - The biggest deficiency, as far as I'm concerned, is the need to use a different, sparser template for a single-node "elasticsearch.master.yaml.j2".  Presumably only one or two of the removed parameters are actually a problem, and presumably even those don't really have to be removed but simply need a different value.  But it looked like a big job to figure out.  Perhaps someone more familiar with Elasticsearch would be able to solve it.
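
For the first comment above (the METRON-641 change), a minimal illustration of the two format-string styles. The host and port values are made up for the example and are not taken from kibana_master.py.

```python
host, port = "localhost", 5000  # illustrative values only

# Explicit positional indexes: accepted by Python 2.6 and later.
explicit = "http://{0}:{1}".format(host, port)

# Auto-numbered fields: accepted from Python 2.7 on. Python 2.6 raises
# ValueError ("zero length field name in format") for this form.
implicit = "http://{}:{}".format(host, port)

print(explicit)
```

On interpreters that accept both forms the two strings are identical, so switching to the explicit indexes changes nothing except compatibility with 2.6.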



[GitHub] incubator-metron issue #425: METRON 609 Enhance Mpack to handle single-node ...

Posted by mattf-horton <gi...@git.apache.org>.
Github user mattf-horton commented on the issue:

    https://github.com/apache/incubator-metron/pull/425
  
    @dlyle65535 , makes sense to me.  METRON-671 is very important for rationalizing our install scenarios, and clearly these fixes can be pipelined as you describe.  I'm super glad you're picking these up.
    
    FYI, the METRON-634 fixes included here have all been proven out in my single-node installer, which I've used quite a bit over the past month. Also, all but the last 2 of the 12 items "not affecting the Ambari database" are bug fixes, not enhancements, without which some piece or other of the ES installation doesn't work correctly in today's Mpack, at least with CentOS 7.



[GitHub] incubator-metron issue #425: METRON 609 Enhance Mpack to handle single-node ...

Posted by mattf-horton <gi...@git.apache.org>.
Github user mattf-horton commented on the issue:

    https://github.com/apache/incubator-metron/pull/425
  
    @dlyle65535 , I see that changes necessary to single-node, in slave.py and elastic_slave.py, were in my METRON-634 commit rather than the METRON-609 commit.  Sorry for the oversight.

