Posted to issues@nifi.apache.org by GitBox <gi...@apache.org> on 2020/01/14 07:18:59 UTC

[GitHub] [nifi] IlyaKovalev opened a new pull request #3984: add DistributeRecord processor

IlyaKovalev opened a new pull request #3984: add DistributeRecord processor
URL: https://github.com/apache/nifi/pull/3984
 
 
   Thank you for submitting a contribution to Apache NiFi.
   
   Please provide a short description of the PR here:
   
   #### Add DistributeRecord processor to distribute data over user-specified relationships by one or more distribution keys.
   
   In order to streamline the review of the contribution we ask you
   to ensure the following steps have been taken:
   
   ### For all changes:
   - [X] Is there a JIRA ticket associated with this PR? Is it referenced 
        in the commit message?
   
   - [X] Does your PR title start with **NIFI-XXXX** where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   
   - [X] Has your PR been rebased against the latest commit within the target branch (typically `master`)?
   
   - [X] Is your initial contribution a single, squashed commit? _Additional commits in response to PR reviewer feedback should be made on this branch and pushed to allow change tracking. Do not `squash` or use `--force` when pushing to allow for clean monitoring of changes._
   
   ### For code changes:
   - [X] Have you ensured that the full suite of tests is executed via `mvn -Pcontrib-check clean install` at the root `nifi` folder?
   - [X] Have you written or updated unit tests to verify your changes?
   - [ ] Have you verified that the full build is successful on both JDK 8 and JDK 11?
   - [X] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
   - [ ] If applicable, have you updated the `LICENSE` file, including the main `LICENSE` file under `nifi-assembly`?
   - [ ] If applicable, have you updated the `NOTICE` file, including the main `NOTICE` file found under `nifi-assembly`?
   - [X] If adding new Properties, have you added `.displayName` in addition to .name (programmatic access) for each of the new properties?
   
   ### For documentation related changes:
   - [X] Have you ensured that format looks appropriate for the output in which it is rendered?
   
   ### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [nifi] IlyaKovalev commented on issue #3984: NIFI-6970 add DistributeRecord processor

Posted by GitBox <gi...@apache.org>.
IlyaKovalev commented on issue #3984: NIFI-6970 add DistributeRecord processor
URL: https://github.com/apache/nifi/pull/3984#issuecomment-590423192
 
 
   Hello.
   1 Done
   2 Yes, I agree: we can implement this logic with PartitionRecord, but only if RecordPath gains hash functions and numeric operations such as `mod`. Then PartitionRecord + RouteOnAttribute would have the same effect as this processor. (Implementing weights could be awkward, but it is definitely possible.)
   3 Renamed the processor to `DistributeHashRecord`. I think the key feature here is distribution over a hashed key.
   Also, the Windows build failed ... hmm...
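
For readers following the thread, here is a minimal sketch of the "hash the key, then `mod`" idea with weights, as discussed in point 2. It is not code from the PR: the class name `WeightedHashRouter`, the use of CRC32 as the hash, and the slot-table approach to weights are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.zip.CRC32;

// Hypothetical helper (not from the PR): maps a distribution key to a
// relationship name by hashing the key and taking the hash modulo the
// total weight. A relationship with weight w owns w slots.
final class WeightedHashRouter {
    private final String[] slots;

    WeightedHashRouter(Map<String, Integer> weights) {
        int total = 0;
        for (int w : weights.values()) {
            if (w <= 0) {
                throw new IllegalArgumentException("weights must be positive");
            }
            total += w;
        }
        if (total == 0) {
            throw new IllegalArgumentException("at least one relationship is required");
        }
        slots = new String[total];
        int i = 0;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            for (int n = 0; n < e.getValue(); n++) {
                slots[i++] = e.getKey();
            }
        }
    }

    // The same key always lands on the same relationship.
    String route(String key) {
        CRC32 crc = new CRC32();
        crc.update(key.getBytes(StandardCharsets.UTF_8));
        long hash = crc.getValue(); // always in the range 0 .. 2^32 - 1
        return slots[(int) (hash % slots.length)];
    }
}
```

With weights such as `{"a": 1, "b": 3}`, roughly a quarter of distinct keys would be expected to go to `a`, assuming the hash spreads keys evenly.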
   


[GitHub] [nifi] sburges commented on issue #3984: NIFI-6970 add DistributeRecord processor

Posted by GitBox <gi...@apache.org>.
sburges commented on issue #3984: NIFI-6970 add DistributeRecord processor
URL: https://github.com/apache/nifi/pull/3984#issuecomment-575349450
 
 
   @IlyaKovalev I had a couple of quick questions looking at this:
   
   1. Are you going to add this to the manifest for the standard processor NAR?
   2. Distributing based on a field is really interesting and useful, but DistributeLoad also has the ability to distribute round-robin. Did you think about doing that here? Potentially make the Keys field optional and fall back to round-robin if it is not specified. I'm thinking of a case where I don't want to distribute based on the actual data.
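
The suggestion in point 2 could look roughly like the sketch below: hash the key when one is present, otherwise cycle through the relationships. This is an illustration only; the class name `KeyOrRoundRobinSelector` and the use of `String.hashCode()` are assumptions, not the PR's implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical selector (not from the PR): routes by key hash when a key is
// given and falls back to round-robin when the key is null or empty.
final class KeyOrRoundRobinSelector {
    private final List<String> relationships;
    private final AtomicLong counter = new AtomicLong();

    KeyOrRoundRobinSelector(List<String> relationships) {
        if (relationships.isEmpty()) {
            throw new IllegalArgumentException("at least one relationship is required");
        }
        this.relationships = new ArrayList<>(relationships);
    }

    String select(String key) {
        long n = relationships.size();
        if (key == null || key.isEmpty()) {
            // No key configured: distribute independently of the data.
            return relationships.get((int) Math.floorMod(counter.getAndIncrement(), n));
        }
        // floorMod keeps the index non-negative even for negative hash codes.
        return relationships.get((int) Math.floorMod((long) key.hashCode(), n));
    }
}
```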


[GitHub] [nifi] joewitt commented on issue #3984: NIFI-6970 add DistributeRecord processor

Posted by GitBox <gi...@apache.org>.
joewitt commented on issue #3984: NIFI-6970 add DistributeRecord processor
URL: https://github.com/apache/nifi/pull/3984#issuecomment-590111770
 
 
   Hello. This is an interesting PR and a good idea. I wonder, though, whether this should just be a partitioning strategy in PartitionRecord. If we keep it separate like this, we might want to go with a more specific name than DistributeRecord, as we might want different distribution options, and this one seems to be pretty specific to weighted distribution using one or more hashing functions. Maybe then the name should be 'WeightedRecordDistribution'.
   
   In any case, how about squashing the commits, rebasing onto master, and force-pushing the PR? This will let the new CI processes run against the PR.


[GitHub] [nifi] IlyaKovalev commented on issue #3984: NIFI-6970 add DistributeRecord processor

Posted by GitBox <gi...@apache.org>.
IlyaKovalev commented on issue #3984: NIFI-6970 add DistributeRecord processor
URL: https://github.com/apache/nifi/pull/3984#issuecomment-575571931
 
 
   1 Fixed. I also removed the ORIGINAL relationship (I think it is superfluous).
   2 Yes, I did think about it. If we merged the logic of DistributeRecord into DistributeLoad, we would have to add reader, writer, keys, hash_function, and strategy fields, and it would become harder to understand how the processor handles its input. I think that would be too ambiguous.
   So look at it like this:
   DistributeLoad - works at the flowfile level without processing content, i.e. it distributes flowfiles.
   DistributeRecord - works at the content level, i.e. it distributes content.
   (So maybe we should rename DistributeLoad to DistributeFlowFile, simply because DistributeLoad is
   a very general name.)
   I think this makes the processor logic (how it works) much easier to understand and the initial setup much simpler for the user.
