Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2021/02/10 21:15:59 UTC

[GitHub] [nifi] markap14 opened a new pull request #4818: NIFI-7646, NIFI-8222: WIP for performance improvements to make it readily available for testing

markap14 opened a new pull request #4818:
URL: https://github.com/apache/nifi/pull/4818


   Note that this is NOT READY to be merged. More cleanup and testing are still needed.
   
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [nifi] mattyb149 commented on pull request #4818: NIFI-7646, NIFI-8222: Reduced lock contention when updating Provenance Repository by buffering up to 1 MB of serialized records in memory; keep InputStream from content repo open when possible for reading across multiple FlowFiles in a session

Posted by GitBox <gi...@apache.org>.
mattyb149 commented on pull request #4818:
URL: https://github.com/apache/nifi/pull/4818#issuecomment-784307135


   +1 LGTM. I ran the full build with tests, plus a few flows using MergeContent and other processors with lots of small files. I removed the TODO comment and the unused Jackson dependency. Thanks for the improvement! Merging to main.
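
For readers unfamiliar with the technique named in the PR title (buffering serialized records in memory to reduce lock contention), here is a minimal Python sketch of the general idea. It is not NiFi's implementation: `SharedLog`, `BufferedWriter`, and `worker` are hypothetical names invented for illustration, and only the 1 MB threshold comes from the PR title. Each thread accumulates serialized records locally and takes the shared lock only when its buffer reaches the threshold.

```python
import threading

FLUSH_THRESHOLD_BYTES = 1024 * 1024  # 1 MB, the figure named in the PR title


class SharedLog:
    """Hypothetical stand-in for a shared, lock-protected event store."""

    def __init__(self):
        self._lock = threading.Lock()
        self.chunks = []
        self.lock_acquisitions = 0

    def append(self, data: bytes) -> None:
        # Every call contends for the single shared lock.
        with self._lock:
            self.lock_acquisitions += 1
            self.chunks.append(data)


class BufferedWriter:
    """Hypothetical per-thread writer that batches records before locking."""

    def __init__(self, log: SharedLog, threshold: int = FLUSH_THRESHOLD_BYTES):
        self._log = log
        self._threshold = threshold
        self._buffer = bytearray()

    def write(self, record: bytes) -> None:
        # Appending to the thread-local buffer needs no lock.
        self._buffer += record
        if len(self._buffer) >= self._threshold:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            self._log.append(bytes(self._buffer))
            self._buffer.clear()


def worker(log: SharedLog, records: int, size: int) -> None:
    writer = BufferedWriter(log)
    for _ in range(records):
        writer.write(b"x" * size)
    writer.flush()  # push out whatever is left below the threshold


log = SharedLog()
threads = [threading.Thread(target=worker, args=(log, 10_000, 256)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 40,000 records were written, but the lock was taken far fewer times.
print(log.lock_acquisitions)
```

In this sketch, writing each record directly to `SharedLog` would take the lock once per record; batching trades that for holding up to the threshold's worth of unflushed data in memory per writer.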





[GitHub] [nifi] joewitt commented on pull request #4818: NIFI-7646, NIFI-8222: Reduced lock contention when updating Provenance Repository by buffering up to 1 MB of serialized records in memory; keep InputStream from content repo open when possible for reading across multiple FlowFiles in a session

Posted by GitBox <gi...@apache.org>.
joewitt commented on pull request #4818:
URL: https://github.com/apache/nifi/pull/4818#issuecomment-780117327


   I have a test flow running and am seeing similar numbers: around 73K events/sec on a flow (GenerateFlowFile -> UpdateAttribute) that is event/metadata intensive, so it is very telling about any gains.





[GitHub] [nifi] mattyb149 closed pull request #4818: NIFI-7646, NIFI-8222: Reduced lock contention when updating Provenance Repository by buffering up to 1 MB of serialized records in memory; keep InputStream from content repo open when possible for reading across multiple FlowFiles in a session

Posted by GitBox <gi...@apache.org>.
mattyb149 closed pull request #4818:
URL: https://github.com/apache/nifi/pull/4818


   





[GitHub] [nifi] markap14 commented on pull request #4818: NIFI-7646, NIFI-8222: Reduced lock contention when updating Provenance Repository by buffering up to 1 MB of serialized records in memory; keep InputStream from content repo open when possible for reading across multiple FlowFiles in a session

Posted by GitBox <gi...@apache.org>.
markap14 commented on pull request #4818:
URL: https://github.com/apache/nifi/pull/4818#issuecomment-780061865


   I created a few different flows to measure the performance of NiFi on the main branch vs. this branch to verify that these changes resulted in significant performance improvements. Here are the results:
   
   ```
   GenerateFlowFile (0 bytes) -> UpdateAttribute
   Branch      | FlowFiles / 5 min | CPU Utilization
   -------------------------------------------------
   main        | 14.4 million      | 600-650%
   this branch | 21 million        | 600-650%
   
   46% higher throughput. 0% more CPU used.
   
   
   GenerateFlowFile (37 bytes JSON) -> ConvertRecord (JSON In, JSON Out) -> UpdateAttribute
   Branch      | FlowFiles / 5 min | CPU Utilization
   -------------------------------------------------
   main        | 7.6 million       | 750-800%
   this branch | 12.4 million      | 900-950%
   
   63% higher throughput. Took a good bit more CPU but made that CPU available for use by the processor.
   
   
   GenerateFlowFile (1024 bytes) -> MergeContent (binary concat, 1024 FlowFiles/bin) -> UpdateAttribute
   Branch      | FlowFiles / 5 min | CPU Utilization
   -------------------------------------------------
   main        | 7.3 million       | 600-650%
   this branch | 16.2 million      | 650-700%
   
   122% higher throughput. Maybe 1/2 core more CPU used.
   ```
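
The percentages quoted above follow directly from the FlowFile counts in the tables. A small Python sketch of that arithmetic, for anyone who wants to recheck it or convert the 5-minute totals to per-second rates; the dictionary keys and the helper names `gain_percent` and `per_second` are made up for illustration, and only the numbers come from the tables:

```python
# FlowFiles per 5 minutes, in millions, as (main, this branch),
# copied from the tables above.
results = {
    "GenerateFlowFile -> UpdateAttribute": (14.4, 21.0),
    "GenerateFlowFile -> ConvertRecord -> UpdateAttribute": (7.6, 12.4),
    "GenerateFlowFile -> MergeContent -> UpdateAttribute": (7.3, 16.2),
}

WINDOW_SECONDS = 5 * 60


def gain_percent(main: float, branch: float) -> float:
    """Relative throughput gain of the branch over main, in percent."""
    return (branch - main) / main * 100.0


def per_second(millions: float) -> float:
    """Convert millions of FlowFiles per 5-minute window to FlowFiles/sec."""
    return millions * 1_000_000 / WINDOW_SECONDS


for flow, (main, branch) in results.items():
    print(
        f"{flow}: {gain_percent(main, branch):.0f}% higher throughput, "
        f"{per_second(branch):,.0f} FlowFiles/sec on this branch"
    )
```
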

