Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2019/06/05 14:36:10 UTC

[GitHub] [nifi] FrederikP opened a new pull request #3518: NIFI-6322: Introduced EvaluationContext to store state while making evaluator tree reusable

URL: https://github.com/apache/nifi/pull/3518
 
 
   #### Description of PR
   
   This is a follow-up PR to #3500. It enables true reuse of the evaluator tree once it has been created for a prepared query in NiFi's expression language. In our case, at least, this saves a substantial amount of CPU time.
   
   This PR also adds many tests for functions with stateful evaluators, which were not covered before.
   
   I closed #3500 because the approach I took there was not thread-safe. Instead of cleaning up the state in the evaluators, I have now introduced a context that is passed through the tree on each evaluation, which removes the state from the evaluators themselves. All evaluators that need state (mostly for performance reasons) can store it in the context. A new context is created for each evaluation. That should also reduce garbage collection pressure, because we only throw away the state that has to be discarded, not the whole evaluator tree again and again.
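
   To illustrate the idea, here is a minimal, self-contained sketch of the pattern. It does not reproduce the classes in this PR: `SketchContext`, `SketchEvaluator`, `NextNameEvaluator`, and `ContextSketch` are hypothetical names invented for the example. The evaluator keeps only immutable configuration in its fields and stores its per-evaluation position in the context it receives, so one evaluator instance can serve many evaluations.

   ```java
   import java.util.IdentityHashMap;
   import java.util.List;
   import java.util.Map;

   // Hypothetical sketch of the pattern; these are not NiFi's actual classes.
   class ContextSketch {
       public static void main(String[] args) {
           // One evaluator instance, built once and reused for every evaluation.
           SketchEvaluator shared = new NextNameEvaluator(List.of("hello", "boat"));

           // A fresh context per evaluation carries all mutable state.
           SketchContext first = new SketchContext();
           SketchContext second = new SketchContext();

           System.out.println(shared.evaluate(first));  // hello
           System.out.println(shared.evaluate(first));  // boat
           System.out.println(shared.evaluate(second)); // hello
       }
   }

   /** Holds the mutable state for exactly one evaluation of the tree. */
   final class SketchContext {
       // Keyed by evaluator identity so each node only sees its own state.
       private final Map<Object, Object> state = new IdentityHashMap<>();

       @SuppressWarnings("unchecked")
       <T> T get(Object owner) {
           return (T) state.get(owner);
       }

       void put(Object owner, Object value) {
           state.put(owner, value);
       }
   }

   /** A node in the reusable tree; implementations hold no mutable fields. */
   interface SketchEvaluator {
       String evaluate(SketchContext context);
   }

   /** Returns the next name on each call, tracking its position in the context. */
   final class NextNameEvaluator implements SketchEvaluator {
       private final List<String> names; // immutable configuration, safe to share

       NextNameEvaluator(List<String> names) {
           this.names = List.copyOf(names);
       }

       @Override
       public String evaluate(SketchContext context) {
           Integer stored = context.get(this);
           int index = (stored == null) ? 0 : stored;
           if (index >= names.size()) {
               return null; // nothing left to iterate over in this evaluation
           }
           context.put(this, index + 1);
           return names.get(index);
       }
   }
   ```

   Because the position lives in the context rather than in the evaluator, two evaluations that share the same evaluator instance but use separate contexts cannot interfere with each other.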
   
   Another related pull request is #3277, but it takes a different approach and only helps if an expression uses no stateful evaluators. It also doesn't cover `and` and `or`, even though they have state. Even if those were excluded from the optimization, I don't think it's the best approach, because many expressions (at least in our production scenario) use functions that have stateful evaluators.
   
   The profiling tests we ran show a performance improvement over the current master (60b5c13ce95fd4d4a5edf0f08b81af19e71b67ee).
   
   This is the test code:
   
   ```
   @Test
   public void testPerformance() {
       final Map<String, String> attributes = new HashMap<String, String>() {{
           put("hello", "Hello");
           put("boat", "World!");
       }};
       final StandardPreparedQuery prepared = (StandardPreparedQuery) Query.prepare("${allAttributes('hello', 'boat'):isEmpty():not():and(${hello:contains('o')})}");
       for (int i = 0; i < 1000000; i++) {
           assertEquals("true", prepared.evaluateExpressions(attributes, null));
       }
   }
   ```
   
   CPU time for the evaluation loop (instrumented code):
   - current master: 97.48s
   - this PR: 23.78s
   
   -> about 75% less CPU time (roughly a 4x speedup)
   
   ### For all changes:
   - [x] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
   
   - [x] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [x] Has your PR been rebased against the latest commit within the target branch (typically `master`)?
   
   - [x] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [x] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [x] Have you written or updated unit tests to verify your changes?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
   
