Posted to dev@metron.apache.org by merrimanr <gi...@git.apache.org> on 2018/02/09 15:39:03 UTC

[GitHub] metron pull request #934: METRON-1423: Ambari work to handle Solr configurat...

GitHub user merrimanr opened a pull request:

    https://github.com/apache/metron/pull/934

    METRON-1423: Ambari work to handle Solr configuration

    ## Contributor Comments
    This PR adds support for Metron-related Solr configuration and management through Ambari.  It does NOT include managing the Solr service itself; for now, that is done outside of Ambari.
    
    High level changes include:
    
    - Solr is now installed automatically in full dev, although it is not started.  This implementation was inspired by the `install_solr.sh` script.
    - Additional scripts for managing the Solr installation are provided in full dev and installed with `install_solr.sh`.
    - Exposes a single Solr Zookeeper configuration parameter (for now) through Ambari.
    - Replaced the random access index writer class Ambari parameter with a random access search engine parameter, a drop-down with the options "Elasticsearch" and "Solr".
    - Replaced "Elasticsearch" with "Random Access" in all the Ambari parameter descriptions.
    - Added Solr schema creation to Ambari on Indexing start.  This matches the ES template loading implementation, except that it is not yet exposed through Service actions.
    
    Testing instructions are as follows:
    1. Spin up full dev
    2. Shutdown ES and Kibana
    3. Start Solr from the command line with $METRON_HOME/bin/start_solr.sh
    4. Navigate to the Indexing tab in Ambari > Metron > Configs and change the "Random Access Search Engine" setting from "Elasticsearch" to "Solr"
    5. You can now start ingesting data into Solr in one of two ways.  The first: after you make the change, Ambari prompts you to restart the affected services.  A restart does not create collections automatically, so before restarting, create them outside of Ambari for bro, snort and error with $METRON_HOME/bin/create_collection.sh (the first argument is the sensor name).
    6. The second: simply stop Metron Indexing and then start it.  This triggers Ambari to load the collections automatically.
    7. Navigate to http://node1:8983/solr/#/snort/query or http://node1:8983/solr/#/bro/query and hit the "Execute Search" button.  You should see data.
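
    The manual collection creation in step 5 can be sketched as a small loop.  This is only an illustration: the script name and the sensor names come from the steps above, while the default `METRON_HOME` placeholder and the `DRY_RUN` switch are assumptions added for this example and are not part of the PR.

    ```shell
    #!/bin/bash
    # Assumption: METRON_HOME is already set on the node; the fallback is a placeholder path.
    METRON_HOME="${METRON_HOME:-/usr/metron/VERSION}"
    # Hypothetical switch: DRY_RUN=1 prints each command instead of running it.
    DRY_RUN="${DRY_RUN:-1}"

    # Call create_collection.sh once per sensor name passed as an argument.
    create_collections() {
      local sensor
      for sensor in "$@"; do
        if [ "$DRY_RUN" = "1" ]; then
          echo "$METRON_HOME/bin/create_collection.sh $sensor"
        else
          "$METRON_HOME/bin/create_collection.sh" "$sensor"
        fi
      done
    }

    create_collections bro snort error
    ```

    With `DRY_RUN=0` the loop runs the script for each sensor.  Step 7 can probably also be checked from a shell with a request such as `curl 'http://node1:8983/solr/bro/select?q=*:*'`, assuming the standard Solr select handler is enabled.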
    
    There is a lot to digest here and still more questions to answer but this should get us started.  Outstanding items include:
    
    - What Solr parameters do we expose in Ambari?  All of them?
    - Do we want to add Solr collection functions to Service actions?  I don't see a way to make this conditional on the selected search engine, so both options would always be shown.
    - Need to review and add documentation
    - Support for REST configuration (probably a separate PR)
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
      ```
    
    - [ ] Have you written or updated unit tests and or integration tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/merrimanr/incubator-metron METRON-1423

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/934.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #934
    
----
commit 6bb30af9d2005414e3ee44c0bdb0ea14540ce13c
Author: cstella <ce...@...>
Date:   2018-02-01T21:33:56Z

    METRON-1441: Create complementary Solr schemas for the main sensors

commit f4ff0c401eff23d9c1b2ca3b264bd9b0d4e8f381
Author: cstella <ce...@...>
Date:   2018-02-01T21:47:12Z

    Updating dao

commit 7e2ecb0f2f55ea16529128fec14920bc2a546b07
Author: cstella <ce...@...>
Date:   2018-02-02T21:43:38Z

    Migrated data to files, renamed test and added yaf and error.

commit 2aacd202ff1a2ebcbeb30300b30d080391cfe1cf
Author: cstella <ce...@...>
Date:   2018-02-02T21:45:08Z

    Merge branch 'feature/METRON-1416-upgrade-solr' into SOLR_METRON-1441

commit 2e32e7ea4ef8cace764394c1dec693d8385a6b9a
Author: cstella <ce...@...>
Date:   2018-02-02T21:50:06Z

    Added to readme.

commit e2901d4bd4b9787f668c2dccd2e4f8aa53a926d7
Author: cstella <ce...@...>
Date:   2018-02-05T14:39:31Z

    Updating error to have a guid and removed docValues=true for bytes type.

commit 3c4319ec4581fdb259a697b548a267225316874a
Author: cstella <ce...@...>
Date:   2018-02-05T16:52:17Z

    Missed spec file additions

commit 43e5ad2d4fb26ac8d6c4c623f427d6358b0c85fa
Author: cstella <ce...@...>
Date:   2018-02-05T21:53:23Z

    Updated schema to include guid, which I missed earlier

commit 261c28b1b594de8b1d7a1357e54e2367c32d0652
Author: cstella <ce...@...>
Date:   2018-02-06T14:33:32Z

    Blah, forgot guid field

commit 34e67cbb897938fd804286ecfcb5861e724c5886
Author: cstella <ce...@...>
Date:   2018-02-06T17:33:52Z

    Added context and grouping for schemata

commit 62a2eb28c8410ad08529eec74bdba0958e71f1f8
Author: cstella <ce...@...>
Date:   2018-02-06T23:03:54Z

    Updating solrwriter

commit bfbd65f3d18af14544673262d99f2c0840447009
Author: cstella <ce...@...>
Date:   2018-02-06T23:20:54Z

    Updating config.

commit 3faace9509903f5436dd8b9242bc3b2fc2343af0
Author: cstella <ce...@...>
Date:   2018-02-07T16:32:56Z

    Merge branch 'feature/METRON-1416-upgrade-solr' into SOLR_writer_mod

commit c9d842519c0fb48d26492265cd5ae7d3aa6768c9
Author: cstella <ce...@...>
Date:   2018-02-07T16:44:58Z

    Merge branch 'feature/METRON-1416-upgrade-solr' into SOLR_writer_mod

commit e8d0efd9113c8163f484dcf26ff66d5b6cbaf081
Author: cstella <ce...@...>
Date:   2018-02-07T16:51:34Z

    Updating should commit to be taken from global config.

commit 820dde3a03d1636aa82254f241e8fc422bc1d911
Author: cstella <ce...@...>
Date:   2018-02-07T16:54:40Z

    JonZeolla is right.

commit 4baed6a7197cbb91faafd17bad9fc1b7a8ddc158
Author: cstella <ce...@...>
Date:   2018-02-07T17:11:26Z

    Updated readme.

commit ed1f6b56484fca1262e605613cc9bbcc6db5096f
Author: cstella <ce...@...>
Date:   2018-02-07T22:58:47Z

    updating docs and making configuration more extensible.

commit 8a34e4b6de67ae3a0684d4ec638c94a59d6d717e
Author: cstella <ce...@...>
Date:   2018-02-07T23:21:34Z

    ../../..

commit f1637b187660fa71284b335e3c6bc1e3714e969c
Author: cstella <ce...@...>
Date:   2018-02-08T00:29:34Z

    Updating writer to not have star imports.

commit 967b84b69b56319dcaa6c1d6ca22da14b86a1e06
Author: cstella <ce...@...>
Date:   2018-02-08T15:54:33Z

    change flux file to be correct

commit 3817d41adb1293178394b1bea5b3e21de9e05e51
Author: merrimanr <me...@...>
Date:   2018-02-08T18:12:17Z

    initial commit

commit ea8e8a57ab72c1f5747a9bfd09de213963ce01ab
Author: merrimanr <me...@...>
Date:   2018-02-08T21:36:13Z

    added more scripts

commit 154438e5e88c7cdb80939aad222457b5a0c0337f
Author: merrimanr <me...@...>
Date:   2018-02-08T21:50:56Z

    added documentation

commit af642d48e574eeeac16e1b76a5fbdb8b8ebd36c8
Author: merrimanr <me...@...>
Date:   2018-02-08T21:54:08Z

    initial commit

commit 7a8ec8d6bc09954aedea4f91ae8468d37c9bd824
Author: merrimanr <me...@...>
Date:   2018-02-08T22:11:48Z

    added newline

commit 596349d382973bab0490f3126da7989fd1e4950f
Author: merrimanr <me...@...>
Date:   2018-02-08T22:12:31Z

    Merge branch 'solr-ansible' into METRON-1423

commit 49667e06e9847b082c2919bba68377d72a61d880
Author: merrimanr <me...@...>
Date:   2018-02-08T22:25:15Z

    Merge remote-tracking branch 'mirror/feature/METRON-1416-upgrade-solr' into METRON-1423

commit 74f038a69336cd0c56a763baf0e46513b4a4cb00
Author: merrimanr <me...@...>
Date:   2018-02-09T14:59:50Z

    Merge remote-tracking branch 'mirror/feature/METRON-1416-upgrade-solr' into METRON-1423
    
    # Conflicts:
    #	metron-platform/metron-solr/src/main/java/org/apache/metron/solr/writer/SolrWriter.java
    #	metron-platform/metron-solr/src/test/java/org/apache/metron/solr/writer/SolrWriterTest.java

commit 051afdcf3f20d5b6c572a93adb89c0fadbe86a6d
Author: merrimanr <me...@...>
Date:   2018-02-09T15:03:07Z

    removed log statement

----


---

[GitHub] metron pull request #934: METRON-1423: Ambari work to handle Solr configurat...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on a diff in the pull request:

    https://github.com/apache/metron/pull/934#discussion_r170365518
  
    --- Diff: metron-platform/metron-solr/src/main/scripts/create_collection.sh ---
    @@ -0,0 +1,27 @@
    +#!/bin/bash
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements.  See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership.  The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +METRON_VERSION=${project.version}
    +METRON_HOME=/usr/metron/$METRON_VERSION
    +SOLR_VERSION=6.6.2
    --- End diff --
    
    Latest commit should address this.


---

[GitHub] metron pull request #934: METRON-1423: Ambari work to handle Solr configurat...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr closed the pull request at:

    https://github.com/apache/metron/pull/934


---

[GitHub] metron pull request #934: METRON-1423: Ambari work to handle Solr configurat...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on a diff in the pull request:

    https://github.com/apache/metron/pull/934#discussion_r170272344
  
    --- Diff: metron-platform/metron-solr/src/main/scripts/create_collection.sh ---
    @@ -0,0 +1,27 @@
    +#!/bin/bash
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements.  See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership.  The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +METRON_VERSION=${project.version}
    +METRON_HOME=/usr/metron/$METRON_VERSION
    +SOLR_VERSION=6.6.2
    --- End diff --
    
    Good catch and I agree they should match.  I think the mistake is actually the maven version.  We initially explored using HDP Search (which is version 6.6.2) but it's not quite ready yet so we went ahead with a manual approach in the meantime.  The thinking was this would make it easy to switch to HDP Search in the future.


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by simonellistonball <gi...@git.apache.org>.
Github user simonellistonball commented on the issue:

    https://github.com/apache/metron/pull/934
  
    Given that we're happy to accept limiting ourselves to a Hadoop distribution, it doesn't seem unfair to limit ourselves to a Solr distribution. While sticking with pure raw Apache has some appeal, it may well make sense to stick to Ambari mpack based installs in the distro @ottobackwards suggests.


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by justinleet <gi...@git.apache.org>.
Github user justinleet commented on the issue:

    https://github.com/apache/metron/pull/934
  
    I wanted to chime in and make a distinction here for anyone looking in from the outside.  We're limiting ourselves to a Hadoop distribution for our dev environment, not in general.  Things primarily get tested on HDP for this reason, but we don't limit ourselves in general.
    
    The problem with that mpack is that it uses Solr 5.5.2, and it would be nice to use a more modern version, based on the discuss thread. Having said that, I'm not at all opposed to using an mpack if it fits our needs. There are definitely benefits to having an install approach that fits the general mpack install we already do.


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/metron/pull/934
  
    Can I ask a question: why aren't we using https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.4/bk_solr-search-installation/content/ch_hdp-search-install-ambari.html?
    Is there a reason?  If so we should document that reason.



---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/934
  
    +1 by inspection


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/metron/pull/934
  
    I would just like it documented as to why we are not using the HDP solr mpack.
    Although, I would think we would be using that mpack as the example for ours?


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/metron/pull/934
  
    Ran through test, everything worked fine. +1
    
    On the questions:
    
    - What Solr parameters do we expose in Ambari? All of them?
    
    People are going to want to tune indexing.  If Ambari is managing the configuration where those tuning parameters live, then they need to be exposed.
    
    - Do we want to add Solr collection functions to Service actions? 
    
    Does that mean that we can stop a certain collection?  We may want to have a custom view for this?
    
    



---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/934
  
    Chiming in late.  I agree that we should not have an explicit dependency on an indexing Mpack, even one not of our own construction.  I think people will have a lot of different ways to install Solr and Metron's mpack should just be configured to point to an existing Solr instance.
    
    I would generally be in favor of adding support for:
    * solr.commitPerBatch
    * solr.commit.waitSearcher
    * solr.commit.waitFlush
    * solr.commit.soft
    * solr.collection
    * solr.http.config
    
    but sensible defaults are chosen there and people can adjust them in the global config, so I think they can wait for a follow-on.  Some of them may be problematic to encode in a UI (solr.http.config is a map, for instance), but most of them would be pretty trivial.  Frankly, I'm a bit hesitant to give people the ability to screw with transaction details easily.


---

[GitHub] metron issue #934: METRON-1423: Ambari work to handle Solr configuration

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on the issue:

    https://github.com/apache/metron/pull/934
  
    I agree with @justinleet.  I would prefer we use Solr 6+.  At some point HDP Search will move to 6+ and we can easily switch to the Mpack since installing Solr is a manual step now.  Happy to document that once we are all in agreement.
    
    As for the parameters we expose in Ambari, I'm referring to the parameters stored in Zookeeper that the SolrWriter reads.  They are:
    
    - solr.zookeeper
    - solr.commitPerBatch
    - solr.commit.waitSearcher
    - solr.commit.waitFlush
    - solr.commit.soft
    - solr.collection
    - solr.http.config
    
    Currently Ambari only exposes the "solr.zookeeper" parameter which is necessary to get the indexing topology writing to Solr.  I think a reasonable solution would be to include all the others except "solr.http.config" since it's an advanced config.  Storm tuning parameters are independent of the writer implementation and already exposed in Ambari.  
    
    For the collection service actions, it would mirror what we expose for ES:  creating and deleting templates (schemas in Solr terms).
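
    To see which of the parameters listed above are already present in a copy of the global config, something like the following could be used.  It is a rough sketch: the key names are the ones listed in this comment, but the function name and the idea of checking a local JSON file with `grep` are assumptions for illustration, not something this PR provides.

    ```shell
    #!/bin/bash
    # Keys listed in this comment as the Zookeeper-stored settings the SolrWriter reads.
    SOLR_PARAMS="solr.zookeeper solr.commitPerBatch solr.commit.waitSearcher solr.commit.waitFlush solr.commit.soft solr.collection solr.http.config"

    # Hypothetical helper: print every key that does not appear as a quoted
    # JSON key in the given file (a plain text match, not a real JSON parse).
    missing_solr_params() {
      local config_file="$1" key
      for key in $SOLR_PARAMS; do
        grep -qF "\"$key\"" "$config_file" || echo "$key"
      done
    }
    ```

    For example, `missing_solr_params global.json` (where `global.json` is a hypothetical local copy of the global config) would list the keys that still rely on defaults.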


---

[GitHub] metron pull request #934: METRON-1423: Ambari work to handle Solr configurat...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on a diff in the pull request:

    https://github.com/apache/metron/pull/934#discussion_r170269639
  
    --- Diff: metron-platform/metron-solr/src/main/scripts/create_collection.sh ---
    @@ -0,0 +1,27 @@
    +#!/bin/bash
    +#
    +# Licensed to the Apache Software Foundation (ASF) under one
    +# or more contributor license agreements.  See the NOTICE file
    +# distributed with this work for additional information
    +# regarding copyright ownership.  The ASF licenses this file
    +# to you under the Apache License, Version 2.0 (the
    +# "License"); you may not use this file except in compliance
    +# with the License.  You may obtain a copy of the License at
    +#
    +#     http://www.apache.org/licenses/LICENSE-2.0
    +#
    +# Unless required by applicable law or agreed to in writing, software
    +# distributed under the License is distributed on an "AS IS" BASIS,
    +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    +# See the License for the specific language governing permissions and
    +# limitations under the License.
    +#
    +METRON_VERSION=${project.version}
    +METRON_HOME=/usr/metron/$METRON_VERSION
    +SOLR_VERSION=6.6.2
    --- End diff --
    
    I notice that we're depending on Maven artifacts of 6.6.0 but we set the Solr version to 6.6.2 here.  Is there a reason for that?  If we can align them, then we could replace this line with `SOLR_VERSION=${global_solr_version}`.
    
    This question applies for the rest of the shell scripts too
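
    A quick way to spot this kind of mismatch across the scripts might look like the sketch below.  The `SOLR_VERSION=` line format is taken from the diff above; the function name and its arguments are hypothetical, and the expected version has to be supplied by hand because this example does not read the Maven property.

    ```shell
    #!/bin/bash
    # Hypothetical helper: report scripts whose first SOLR_VERSION= line
    # differs from the expected version; returns non-zero on any mismatch.
    check_solr_version() {
      local expected="$1"; shift
      local script found status=0
      for script in "$@"; do
        found="$(sed -n 's/^SOLR_VERSION=//p' "$script" | head -n 1)"
        if [ "$found" != "$expected" ]; then
          echo "$script: SOLR_VERSION=$found (expected $expected)"
          status=1
        fi
      done
      return "$status"
    }
    ```

    For instance, `check_solr_version 6.6.0 metron-platform/metron-solr/src/main/scripts/*.sh` would flag the line shown in this diff.  If the scripts switch to a Maven-filtered property, the check would only make sense against the built output.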


---