You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@metron.apache.org by mmiklavc <gi...@git.apache.org> on 2018/11/04 00:45:37 UTC

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

GitHub user mmiklavc opened a pull request:

    https://github.com/apache/metron/pull/1252

    METRON-1855: Make unified enrichment topology the default and deprecate split-join

    ## Contributor Comments
    
    https://issues.apache.org/jira/browse/METRON-1855
    
    Deprecates the split-join topology in favor of the simpler, more performant unified enrichment topology. Encompasses the following changes:
    
    - The MPack is configured with the Unified topology as the new default
    - Unified topology properties appear first in the Ambari Enrichment config section and in the enrichment type dropdown.
    - Documentation changed to emphasize the unified topology
    - Performance docs make note of the split-join topology deprecation in favor of the unified topology.
    
    I'll make another note of it in the DISCUSS thread, but I think we should make a goal of marking split-join deprecated in this upcoming release and removing it altogether shortly thereafter.
    
    DISCUSS thread - https://lists.apache.org/thread.html/6cfc883de28a5cb41f26d0523522d4b93272ac954e5713c80a35675e@%3Cdev.metron.apache.org%3E
    
    **Testing**
    
    - Spin up full dev
    - Verify that the unified enrichment topology is now running by default. 
    - View the Storm UI topology details for enrichment and verify that there is no longer a split/join bolt.
    - Verify the unified enrichment properties show up first under the enrichments config section and that "unified" is active in the enrichment type dropdown.
    - Change enrichment type to Split Join and verify that the deprecated topology is still able to be run.
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and or integration tests to verify your changes?
    - n/a If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mmiklavc/metron deprecate-split-join

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/1252.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1252
    
----
commit 25e2b657b8f077006cabd1f79d1e7bb2aea28b13
Author: Michael Miklavcic <mi...@...>
Date:   2018-11-03T22:11:44Z

    Make unified enrichment topology new default. Deprecated split-join topology.

----


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1252#discussion_r230864542
  
    --- Diff: metron-platform/metron-enrichment/README.md ---
    @@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the respective join bolt
     parallelism.  When proper ambari support for this is added, we will add
     its own property.
     
    +### Split-Join Enrichment Topology
    +
    +The now-deprecated split/join topology is also available and performs enrichments in parallel.
    +This poses some issues in terms of ease of tuning and reasoning about performance.
    +
    +![Architecture](enrichment_arch.png)
    +
    +#### Using It
    +
    +In order to use the older, deprecated topology, you will need to
    +* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
    --- End diff --
    
    This actually wasn't net-new, it's the way we currently expose the choice. I would prefer to keep it as-is and manual because the desire is NOT to have customers use the deprecated topology.


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/metron/pull/1252


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1252#discussion_r230803329
  
    --- Diff: metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh ---
    @@ -20,7 +20,7 @@ METRON_VERSION=${project.version}
     METRON_HOME=/usr/metron/$METRON_VERSION
     TOPOLOGY_JAR=${project.artifactId}-$METRON_VERSION-uber.jar
     
    -# there are two enrichment topologies.  by default, the split-join enrichment topology is executed
    +# There are two enrichment topologies. By default, the unified enrichment topology is executed. Split-join is now deprecated.
     SPLIT_JOIN_ARGS="--remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties"
     UNIFIED_ARGS="--remote $METRON_HOME/flux/enrichment/remote-unified.yaml --filter $METRON_HOME/config/enrichment-unified.properties"
    --- End diff --
    
    We also need to change line 28 of this file so that it defaults to Unified.
    ```
    ARGS=${@:-$UNIFIED_ARGS}
    ```


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1252#discussion_r230802749
  
    --- Diff: metron-platform/metron-enrichment/README.md ---
    @@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the respective join bolt
     parallelism.  When proper ambari support for this is added, we will add
     its own property.
     
    +### Split-Join Enrichment Topology
    +
    +The now-deprecated split/join topology is also available and performs enrichments in parallel.
    +This poses some issues in terms of ease of tuning and reasoning about performance.
    +
    +![Architecture](enrichment_arch.png)
    +
    +#### Using It
    +
    +In order to use the older, deprecated topology, you will need to
    +* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
    --- End diff --
    
    Instead of saying you need to edit the script, we could just give them the following command.  You just need to pass in the right command line args to use Split/Join.
    ```
    $METRON_HOME/bin/start_enrichment_topology.sh --remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties
    ```


---

[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on the issue:

    https://github.com/apache/metron/pull/1252
  
    +1 by inspection.  Looks great.  Thanks!


---

[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on the issue:

    https://github.com/apache/metron/pull/1252
  
    Pertaining to the topology args, our elasticsearch_master.py script always passes in args based on the active topology. I also added a note for changing the setting in Ambari as well as the manual process, per your suggestion, to Upgrading.md.
    
    ```
     # which enrichment topology needs started?
                if self.__params.enrichment_topology == "Unified":
                    topology_flux = "{0}/flux/enrichment/remote-unified.yaml".format(self.__params.metron_home)
                    topology_props = "{0}/config/enrichment-unified.properties".format(self.__params.metron_home)
                elif self.__params.enrichment_topology == "Split-Join":
                    topology_flux = "{0}/flux/enrichment/remote-splitjoin.yaml".format(self.__params.metron_home)
                    topology_props = "{0}/config/enrichment-splitjoin.properties".format(self.__params.metron_home)
                else:
                    raise Fail("Unexpected enrichment topology; name=" + self.__params.enrichment_topology)
    
                # start the topology
                start_cmd_template = """{0}/bin/start_enrichment_topology.sh --remote {1} --filter {2}"""
                Logger.info('Starting ' + self.__enrichment_topology)
                start_cmd = start_cmd_template.format(self.__params.metron_home, topology_flux, topology_props)
                Execute(start_cmd, user=self.__params.metron_user, tries=3, try_sleep=5, logoutput=True)
    ```


---

[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on the issue:

    https://github.com/apache/metron/pull/1252
  
    @nickwallen Let me know what you think of the latest changes.
    
    Side note, I just noticed this new "Resolve Conversation" option.


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1252#discussion_r230867139
  
    --- Diff: metron-platform/metron-enrichment/src/main/scripts/start_enrichment_topology.sh ---
    @@ -20,7 +20,7 @@ METRON_VERSION=${project.version}
     METRON_HOME=/usr/metron/$METRON_VERSION
     TOPOLOGY_JAR=${project.artifactId}-$METRON_VERSION-uber.jar
     
    -# there are two enrichment topologies.  by default, the split-join enrichment topology is executed
    +# There are two enrichment topologies. By default, the unified enrichment topology is executed. Split-join is now deprecated.
     SPLIT_JOIN_ARGS="--remote $METRON_HOME/flux/enrichment/remote-splitjoin.yaml --filter $METRON_HOME/config/enrichment-splitjoin.properties"
     UNIFIED_ARGS="--remote $METRON_HOME/flux/enrichment/remote-unified.yaml --filter $METRON_HOME/config/enrichment-unified.properties"
    --- End diff --
    
    > @mmiklavc Should we add a small blurb in `Upgrading.md` to document how users who are upgrading can revert to the existing functionality?
    
    Hrm, our upgrading doc appears to list items on a per-release basis, and we don't have anything new since 0.5.0. Here's what I'll do. Add an upgrading from 0.6.0 to 0.6.1 notice and make we change the version accordingly in the next release.


---

[GitHub] metron issue #1252: METRON-1855: Make unified enrichment topology the defaul...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on the issue:

    https://github.com/apache/metron/pull/1252
  
    @mmiklavc Should we add a small blurb in `Upgrading.md` to document how users who are upgrading can revert to the existing functionality?


---

[GitHub] metron pull request #1252: METRON-1855: Make unified enrichment topology the...

Posted by mmiklavc <gi...@git.apache.org>.
Github user mmiklavc commented on a diff in the pull request:

    https://github.com/apache/metron/pull/1252#discussion_r230869589
  
    --- Diff: metron-platform/metron-enrichment/README.md ---
    @@ -76,6 +62,19 @@ intel bolt, the configurations will be taken from the respective join bolt
     parallelism.  When proper ambari support for this is added, we will add
     its own property.
     
    +### Split-Join Enrichment Topology
    +
    +The now-deprecated split/join topology is also available and performs enrichments in parallel.
    +This poses some issues in terms of ease of tuning and reasoning about performance.
    +
    +![Architecture](enrichment_arch.png)
    +
    +#### Using It
    +
    +In order to use the older, deprecated topology, you will need to
    +* Edit `$METRON_HOME/bin/start_enrichment_topology.sh` and adjust it to use `remote-splitjoin.yaml` instead of `remote-unified.yaml`
    --- End diff --
    
    Nevermind, just re-read that suggestion. Sounds good to me @nickwallen.


---