You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@metron.apache.org by cestella <gi...@git.apache.org> on 2018/03/02 23:52:47 UTC

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

GitHub user cestella opened a pull request:

    https://github.com/apache/metron/pull/947

    METRON-1467: Replace guava caches in places where the keyspace might be large

    ## Contributor Comments
    Based on the performance tuning exercise as part of METRON-1460, guava has difficulties with cache sizes over 10k.  We, unfortunately, are quite demanding of guava in this regard so we should transition a few uses of guava to Caffeine:
    
    * Stellar processor cache
    * The JoinBolt cache
    * The Enrichment Bolt Cache
    
    NOTE: This depends on METRON-1460 aka #940
    Test plan pending
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and or integration tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cestella/incubator-metron guava_cache_replacement

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/947.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #947
    
----
commit a4f618a3ad895d62772366e0e93e5b8b37c5c964
Author: cstella <ce...@...>
Date:   2018-02-21T23:59:16Z

    Adding parallel enrichment bolt.

commit 99fe0b86005fe04294b3851a17ae3d88f228c5d2
Author: cstella <ce...@...>
Date:   2018-02-22T00:21:06Z

    Updating to include trace statements.

commit 79736c6f3fab04d01dd1eb998b308f438003a0e1
Author: cstella <ce...@...>
Date:   2018-02-22T15:35:44Z

    Updating with some cleanup

commit cb4a527c9146865dafad1d597ba93032ef398d94
Author: cstella <ce...@...>
Date:   2018-02-22T15:48:11Z

    Updating spec.

commit fb4d4383f366776f446e33a422652c3ec1f56bfa
Author: cstella <ce...@...>
Date:   2018-02-22T18:00:36Z

    Updating threadpool creation

commit 87ef6a72827c31f8adee42ee71272a32c350bc1f
Author: cstella <ce...@...>
Date:   2018-02-22T18:04:37Z

    better docs

commit 6ae9594ee4ae2b4d33e0feca398b527077dac0d3
Author: cstella <ce...@...>
Date:   2018-02-22T18:41:20Z

    Updating readme.

commit 82ebc9550d759ea0bd06b48c586fd5e53c6e553a
Author: cstella <ce...@...>
Date:   2018-02-22T20:53:31Z

    Better documentation.

commit 235046d3d1fcda31690f4fc6b64cb38f050fc5af
Author: cstella <ce...@...>
Date:   2018-02-22T23:04:27Z

    Updating bolt to avoid an error.

commit 2e09e6e7c9d68514137a1da710fc4e0ffe2a8708
Author: cstella <ce...@...>
Date:   2018-02-27T14:14:27Z

    Updated shuffles to local or shuffle

commit cc3162d04683019838855119f0cc379cfb37a121
Author: cstella <ce...@...>
Date:   2018-02-28T20:58:02Z

    Updating bolt to not log so much.

commit 0601a9b491e6df055b0f078a101165cf851a67fb
Author: cstella <ce...@...>
Date:   2018-03-01T20:23:32Z

    Bug in the stellar adapter that is preventing caching.

commit 6b7161dbdd898c4ff2e07865c42d039e97fa35ed
Author: cstella <ce...@...>
Date:   2018-03-01T21:42:47Z

    Move the clone to the right place and make a test case for this.

commit 4706529fbab22530f86b610631dde86a18dd7b68
Author: cstella <ce...@...>
Date:   2018-03-01T22:37:22Z

    enricher test update

commit 09b3adff628bf356f490119b4fc9905210cb53bf
Author: cstella <ce...@...>
Date:   2018-03-02T17:07:20Z

    Adding the ability to monitor the cache

commit 167260ab554e519cb107bd5ab378653adfd7b306
Author: cstella <ce...@...>
Date:   2018-03-02T19:42:08Z

    parallel strategy should use a concurrency level set to the threadpool size

commit 934bdea0ffaa1ef14c1ab0c4952017a2cb0934aa
Author: cstella <ce...@...>
Date:   2018-03-02T21:37:20Z

    Moved to a different cache.

commit 571be7d7522f4796851d20fa1cef4d1863217910
Author: cstella <ce...@...>
Date:   2018-03-02T21:53:41Z

    updating test

commit d359746388b5caf87ae7f5f71435aaa991d9cc7b
Author: cstella <ce...@...>
Date:   2018-03-02T22:38:08Z

    dependencies for caffeine

commit 5786282539ca6a45b196867650b58f9d07fba007
Author: cstella <ce...@...>
Date:   2018-03-02T23:45:18Z

    Replacing guava caches in places where the cache sizes may be large.

----


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by cestella <gi...@git.apache.org>.
GitHub user cestella reopened a pull request:

    https://github.com/apache/metron/pull/947

    METRON-1467: Replace guava caches in places where the keyspace might be large (NOTE: Review after METRON-1460)

    ## Contributor Comments
    Based on the performance tuning exercise as part of METRON-1460, guava has difficulties with cache sizes over 10k.  We, unfortunately, are quite demanding of guava in this regard so we should transition a few uses of guava to Caffeine:
    
    * Stellar processor cache
    * The JoinBolt cache
    * The Enrichment Bolt Cache
    
    NOTE: This depends on METRON-1460 aka #940 and that PR is included in here.  It might be easier to review after #940 is merged.
    
    Test plan pending
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron.  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:
      ```
      mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and or integration tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      mvn site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cestella/incubator-metron guava_cache_replacement

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/metron/pull/947.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #947
    
----
commit a4f618a3ad895d62772366e0e93e5b8b37c5c964
Author: cstella <ce...@...>
Date:   2018-02-21T23:59:16Z

    Adding parallel enrichment bolt.

commit 99fe0b86005fe04294b3851a17ae3d88f228c5d2
Author: cstella <ce...@...>
Date:   2018-02-22T00:21:06Z

    Updating to include trace statements.

commit 79736c6f3fab04d01dd1eb998b308f438003a0e1
Author: cstella <ce...@...>
Date:   2018-02-22T15:35:44Z

    Updating with some cleanup

commit cb4a527c9146865dafad1d597ba93032ef398d94
Author: cstella <ce...@...>
Date:   2018-02-22T15:48:11Z

    Updating spec.

commit fb4d4383f366776f446e33a422652c3ec1f56bfa
Author: cstella <ce...@...>
Date:   2018-02-22T18:00:36Z

    Updating threadpool creation

commit 87ef6a72827c31f8adee42ee71272a32c350bc1f
Author: cstella <ce...@...>
Date:   2018-02-22T18:04:37Z

    better docs

commit 6ae9594ee4ae2b4d33e0feca398b527077dac0d3
Author: cstella <ce...@...>
Date:   2018-02-22T18:41:20Z

    Updating readme.

commit 82ebc9550d759ea0bd06b48c586fd5e53c6e553a
Author: cstella <ce...@...>
Date:   2018-02-22T20:53:31Z

    Better documentation.

commit 235046d3d1fcda31690f4fc6b64cb38f050fc5af
Author: cstella <ce...@...>
Date:   2018-02-22T23:04:27Z

    Updating bolt to avoid an error.

commit 2e09e6e7c9d68514137a1da710fc4e0ffe2a8708
Author: cstella <ce...@...>
Date:   2018-02-27T14:14:27Z

    Updated shuffles to local or shuffle

commit cc3162d04683019838855119f0cc379cfb37a121
Author: cstella <ce...@...>
Date:   2018-02-28T20:58:02Z

    Updating bolt to not log so much.

commit 0601a9b491e6df055b0f078a101165cf851a67fb
Author: cstella <ce...@...>
Date:   2018-03-01T20:23:32Z

    Bug in the stellar adapter that is preventing caching.

commit 6b7161dbdd898c4ff2e07865c42d039e97fa35ed
Author: cstella <ce...@...>
Date:   2018-03-01T21:42:47Z

    Move the clone to the right place and make a test case for this.

commit 4706529fbab22530f86b610631dde86a18dd7b68
Author: cstella <ce...@...>
Date:   2018-03-01T22:37:22Z

    enricher test update

commit 09b3adff628bf356f490119b4fc9905210cb53bf
Author: cstella <ce...@...>
Date:   2018-03-02T17:07:20Z

    Adding the ability to monitor the cache

commit 167260ab554e519cb107bd5ab378653adfd7b306
Author: cstella <ce...@...>
Date:   2018-03-02T19:42:08Z

    parallel strategy should use a concurrency level set to the threadpool size

commit 934bdea0ffaa1ef14c1ab0c4952017a2cb0934aa
Author: cstella <ce...@...>
Date:   2018-03-02T21:37:20Z

    Moved to a different cache.

commit 571be7d7522f4796851d20fa1cef4d1863217910
Author: cstella <ce...@...>
Date:   2018-03-02T21:53:41Z

    updating test

commit d359746388b5caf87ae7f5f71435aaa991d9cc7b
Author: cstella <ce...@...>
Date:   2018-03-02T22:38:08Z

    dependencies for caffeine

commit 5786282539ca6a45b196867650b58f9d07fba007
Author: cstella <ce...@...>
Date:   2018-03-02T23:45:18Z

    Replacing guava caches in places where the cache sizes may be large.

----


---

[GitHub] metron issue #947: METRON-1467: Replace guava caches in places where the key...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on the issue:

    https://github.com/apache/metron/pull/947
  
    +1 LGTM


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by cestella <gi...@git.apache.org>.
Github user cestella closed the pull request at:

    https://github.com/apache/metron/pull/947


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/947#discussion_r172866140
  
    --- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/JoinBolt.java ---
    @@ -89,29 +91,25 @@ public void prepare(Map map, TopologyContext topologyContext, OutputCollector ou
         if (this.maxTimeRetain == null) {
           throw new IllegalStateException("maxTimeRetain must be specified");
         }
    -    loader = new CacheLoader<String, Map<String, Tuple>>() {
    -      @Override
    -      public Map<String, Tuple> load(String key) throws Exception {
    -        return new HashMap<>();
    -      }
    -    };
    -    cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
    -            .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES).removalListener(new JoinRemoveListener())
    -            .build(loader);
    +    loader = s -> new HashMap<>();
    +    cache = Caffeine.newBuilder().maximumSize(maxCacheSize)
    +                         .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
    +                         .removalListener(new JoinRemoveListener())
    --- End diff --
    
    It seems like we only want notified of a full cache when ERROR logging is set. Is that the case? In the `JoinRemoveListener` we end up doing some work that we probably don't need to do unless ERROR logging is set.  One easy fix would be to only add the "remove listener" if `LOG.isDebugEnabled()`.


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on a diff in the pull request:

    https://github.com/apache/metron/pull/947#discussion_r172870992
  
    --- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/JoinBolt.java ---
    @@ -89,29 +91,25 @@ public void prepare(Map map, TopologyContext topologyContext, OutputCollector ou
         if (this.maxTimeRetain == null) {
           throw new IllegalStateException("maxTimeRetain must be specified");
         }
    -    loader = new CacheLoader<String, Map<String, Tuple>>() {
    -      @Override
    -      public Map<String, Tuple> load(String key) throws Exception {
    -        return new HashMap<>();
    -      }
    -    };
    -    cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
    -            .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES).removalListener(new JoinRemoveListener())
    -            .build(loader);
    +    loader = s -> new HashMap<>();
    +    cache = Caffeine.newBuilder().maximumSize(maxCacheSize)
    +                         .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
    +                         .removalListener(new JoinRemoveListener())
    --- End diff --
    
    So, I believe this was intentionally done before this PR (I migrated this to the new caching strategy) and the idea is that if a removal is happening from the join cache under specific circumstances, we want to know about it because a message could be being dropped because the cache is being overwhelmed.  @merrimanr Can you chime in here on the rationale?


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by nickwallen <gi...@git.apache.org>.
Github user nickwallen commented on a diff in the pull request:

    https://github.com/apache/metron/pull/947#discussion_r172888378
  
    --- Diff: metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/JoinBolt.java ---
    @@ -89,29 +91,25 @@ public void prepare(Map map, TopologyContext topologyContext, OutputCollector ou
         if (this.maxTimeRetain == null) {
           throw new IllegalStateException("maxTimeRetain must be specified");
         }
    -    loader = new CacheLoader<String, Map<String, Tuple>>() {
    -      @Override
    -      public Map<String, Tuple> load(String key) throws Exception {
    -        return new HashMap<>();
    -      }
    -    };
    -    cache = CacheBuilder.newBuilder().maximumSize(maxCacheSize)
    -            .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES).removalListener(new JoinRemoveListener())
    -            .build(loader);
    +    loader = s -> new HashMap<>();
    +    cache = Caffeine.newBuilder().maximumSize(maxCacheSize)
    +                         .expireAfterWrite(maxTimeRetain, TimeUnit.MINUTES)
    +                         .removalListener(new JoinRemoveListener())
    --- End diff --
    
    Yes, it is pre-existing.  We can address at a later time.
    
    I remember now, maxing out this cache causes the Split/Join to fail, which is a major problem for the Split/Join topology.  And this cache here is only for the Split/Join, not the Unified topology.
    
    We should probably look at adding similar logging (only when ERROR enabled) for the other places where we use the cache.  Or just some mechanism to periodically log cache stats.  Anywho, down the road.


---

[GitHub] metron issue #947: METRON-1467: Replace guava caches in places where the key...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/metron/pull/947
  
    I ran this up with vagrant and ensured:
    * Normal stellar works still in field transformations as well as enrichments
    * swapped in and out new enrichments live
    * swapped in and out new threat intel live


---

[GitHub] metron pull request #947: METRON-1467: Replace guava caches in places where ...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/metron/pull/947


---