You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by MikeThomsen <gi...@git.apache.org> on 2018/06/09 17:43:57 UTC

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

GitHub user MikeThomsen opened a pull request:

    https://github.com/apache/nifi/pull/2777

    NIFI-5287 Made LookupRecord able to take in flowfile attributes and c…

    …ombine them with lookup keys.
    
    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [ ] Is there a JIRA ticket associated with this PR? Is it referenced 
         in the commit message?
    
    - [ ] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    - [ ] Is your initial contribution a single, squashed commit?
    
    ### For code changes:
    - [ ] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [ ] Have you written or updated unit tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/MikeThomsen/nifi NIFI-5287

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2777.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2777
    
----
commit d841faeb4311266fe0fde28c9c6053388082024e
Author: Mike Thomsen <mi...@...>
Date:   2018-06-09T17:43:24Z

    NIFI-5287 Made LookupRecord able to take in flowfile attributes and combine them with lookup keys.

----


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @markap14 @ijokarumawak updated based on the last comment.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @ijokarumawak 
    
    > Adding those values into lookup coordinate may not sound that wrong, if we keep the consistent overlaying order.
    
    It is, however, dangerous to do with LookupService implementations that are based around query builders like the Mongo one. The Mongo one does a straight conversion of the coordinate map into a Mongo query, so if you add any value other than a key the user specifies you will break the query in most cases.


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r194933023
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestLookupRecord.java ---
    @@ -400,6 +433,20 @@ public void addValue(final String key, final String value) {
             public Set<String> getRequiredKeys() {
                 return Collections.singleton("lookup");
             }
    +
    +        public void setRequiredCoordinates(Map<String, Object> expectedCoordinates) {
    +            this.expectedCoordinates = expectedCoordinates;
    +        }
    +
    +        private void enforceRequiredCoordinates(Map<String, String> context) {
    --- End diff --
    
    This should be renamed to `validateContext`. And every name using `coordinates` should be replaced with `context`.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by markap14 <gi...@git.apache.org>.
Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @MikeThomsen @ijokarumawak Hey guys, I'm sorry to chime in a bit late here. I don't think this is really the approach that we want to take here, though. What our interface provides currently is a mechanism to pass in the 'coordinates', i.e., the location for finding the data that we want. The approach here takes this concepts and muddies it by providing arbitrary key/value pairs that don't make sense for the lookup.
    
    I would propose an alternative approach, which would be to add a new method to the interface that has a default implementation:
    ```
    default Optional<T> lookup(Map<String, Object> coordinates, Map<String, String> context) throws LookupFailureException {
        return lookup(coordinates);
    }
    ```
    Where `context` is used for the FlowFile attributes (I'm referring to it as `context` instead of `attributes` because there may well be a case where we want to provide some other value that is not specifically a FlowFile attribute). Here is why I am suggesting this:
    - It provides a clean interface that properly separates the data's coordinates from FlowFile attributes.
    - It prevents any collisions between FlowFile attribute names and coordinates.
    - It maintains backward compatibility, and we know that it won't change the behavior of existing services or processors/components using those services - even those that may have been implemented by others outside of the Apache realm.
    - If attributes are passed in by a Processor, those attributes will be ignored anyway unless the Controller Service is specifically updated to make use of those attributes, such as via Expression Language. In such a case, the Controller Service can simply be updated at that time to make use of the new method instead of the existing method.



---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2777


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @markap14 I agree with that. We shouldn't break existing stuffs. I've been thinking about this after I posted my last comment and now I believe having separate map is a better idea. It's possible to mix those to evaluate an EL with both values if new implementation really needs so, while keeping existing ones work as they are.
    
    Sorry for making a confusion.
    
    @MikeThomsen now we can remove `Attributes Regular Expression` from LookupRecord, and just pass all FlowFile attribute as context as you did with LookupAttribute.
    
    In addition to that, there is one more things that I want to clarify on this improvement.
    Do we also want to access Variable Registry values through the `context` map? If so, how can we do that?


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r195270892
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestLookupRecord.java ---
    @@ -400,6 +433,20 @@ public void addValue(final String key, final String value) {
             public Set<String> getRequiredKeys() {
                 return Collections.singleton("lookup");
             }
    +
    +        public void setRequiredCoordinates(Map<String, Object> expectedCoordinates) {
    +            this.expectedCoordinates = expectedCoordinates;
    --- End diff --
    
    Done.


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r195271111
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-lookup-service-api/src/main/java/org/apache/nifi/lookup/LookupService.java ---
    @@ -35,6 +35,19 @@
          */
         Optional<T> lookup(Map<String, Object> coordinates) throws LookupFailureException;
     
    +    /**
    +     * Looks up a value that corresponds to the given map, coordinates. FlowFile attributes will also be passed into the
    +     * map labeled context.
    +     *
    +     * @param coordinates a Map of key/value pairs that indicate the information that should be looked up
    +     * @param context a Map of FlowFile attributes
    --- End diff --
    
    Done.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @ijokarumawak Review?


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by markap14 <gi...@git.apache.org>.
Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    In hindsight, a "Context" type of object would have been a good idea in this situation, had it been done that way in the first place. However, while it's a helpful design pattern, we have to be cognizant of the fact that we always have performed a release that contains the Service API the way that it is. We need to avoid changing that, as any custom Controller Services would break if we change the API in a non-backward-compatible way. It would also break any custom processors that try to use a Lookup Service.
    While we do have some flexibility in changing Service API's after they have been released, it is kind of a nuclear option that should be avoided if we can. Especially for something that is as widely used as the lookup stuff.
    This is largely why I want to avoid also including attributes in the same map - it would potentially break existing services if new things suddenly are fed into that map that are not part of the lookup's coordinates. By introducing the `Map<String, String> context` and providing a default implementation in the interface, all existing services will work as they previously did, and all existing processors will work as the previously did. However, new or updated implementations will still be able to take advantage of FlowFile Attributes.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    Maybe there should be a `Context` interface, and there can be support for implementations that support more than one map or type of backing.  I think the limitation here is using a literal Map instead of a logical construct.
    
    This is similar to variable resolution in a custom language.  You may  need more than a map.
    
    `LookupContext context = new AttributeAndCoordinateLookup(attrs, coords);`
    
    



---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by markap14 <gi...@git.apache.org>.
Github user markap14 commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @ijokarumawak whenever Expression Language is evaluated from a PropertyDescriptor, it always has access to the Variable Registry. The framework provides the variable registry to the compiled expression when it is evaluated.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    It may be overkill, that is true.  But if you have to keep adding new functions to account for different scenarios that isn't great either and may suggest something like that would be good to have. Having a context or resolver for this type of thing isn't that radical. Having a context sets the interface, that is true, but the implementation can be any kind of policy/strategy/composition you may require.
    
    That being said, just a thought anyways.



---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    As for Variable Registry, I think passing only FlowFile attribute is enough, same as the way AbstractRecordProcessor is doing right now. https://github.com/apache/nifi/blob/master/nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/AbstractRecordProcessor.java#L113
    
    In that sense, I found a contradiction between current documentation and behavior. Schema Name, Schema Version, Schema Branch and Schema Text are documented that those supports Variable Registry and FlowFile Attributes, but it can only access to FlowFile attributes. I assume this pattern is not considered when ExpressionLanguageScope is added. Probably for another JIRA (not filed yet).
    ![image](https://user-images.githubusercontent.com/1107620/41296857-d025934e-6e98-11e8-9521-29c135f57da6.png)
    



---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @markap14 @ijokarumawak cleaned it up by removing that attribute and it's rebased against master (as of my last pull this morning). Can one of you review?


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r195270966
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestLookupRecord.java ---
    @@ -400,6 +433,20 @@ public void addValue(final String key, final String value) {
             public Set<String> getRequiredKeys() {
                 return Collections.singleton("lookup");
             }
    +
    +        public void setRequiredCoordinates(Map<String, Object> expectedCoordinates) {
    +            this.expectedCoordinates = expectedCoordinates;
    +        }
    +
    +        private void enforceRequiredCoordinates(Map<String, String> context) {
    --- End diff --
    
    Done.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    Thanks @markap14 for pointing that. Separating the coordinate and context make sense and implementations will be cleaner by doing so. Although I agree with the idea, let me try to explain why I wanted to put everything into the same map.
    
    I wanted to muddle all variables into the same map because some lookup target, such as URL for the RestLookupService have both lookup coordinate and environment depending part that can be passed as FlowFile attributes or variable registry.
    
    For example, let's say we want to make following URL to be populated by a Expression Language to be used by RestLookupService. `http://test.example.com:8080/service/john.smith/friend/12345`
    I think the URL has both lookup and contextual part in it.
    
    - `john.smith and `12345` are lookup coordinates that vary from input, in LookupRecord context, it varies per recocrd
    - But hostname and port is more of contextual values, that varies per FlowFile or environment
    
    If the URL property is configured as `http://${apiHost}:${apiPort}/service/${userName}/friend/${friendId}`, and if it can only refer lookup coordinate, then `apiHost` and `apiPort` need to be set for each record lookup. And to do so, user need to configure dynamic properties at LookupRecord processor using record paths, which can be awkward since RecordPathValidator doesn't allow literal String value. A work-around is to use `concat('test.example.com')` to return the constant value for every record lookup.
    
    To make this scenario more flexible while meeting with the idea of separating lookup coordinates and context, I wrote MikeThomsen/nifi#1  so that target URL can be configured by 2 properties (if necessary). `BASE_URL` can use Variable Registry, and `URL` can use coordinate.
    
    If we pursue the path to separate lookup and coordinate more implementations like that will be needed in different places. That has both pros / cons I think. 
    
    On the other hand, the variables those can be referred by an EL is already muddled a lot IMHO. It contains System properties, Variable registry, and FlowFile attribute values and EL can utilize those without knowing where it's configured.
    
    Adding those values into lookup coordinate may not sound that wrong, if we keep the consistent overlaying order.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    Thanks @markap14 , that is true. My previous comment on the difference between doc and actual behavior was a false alarm.
    
    I expected that but got different result, so I got confused. But now I understand what happened in my test. A LookupService was in a child group, using a RecordReader defined in a parent group. And I set a variable in the child group. But the RecordReader in the parent group was not able to use it. Configuring variable at the parent group worked as expected.
    
    - Parent PG
      - RecordReader: `Schema Name=${schema.name}`
      - Variable registry: `schema.name=parent` (this value can be used)
      - Child PG
        - LookupService using the RecordReader in the parent PG
        - Variable registry: `schema.name=child` (the RecordReader doesn't have access to variables defined in child PG)
    
    Also, I found that how SchemaAccessStrategy classes caches `PropertyValue` to evaluate EL at runtime. @MikeThomsen RestLookupService PR #2723 needs to be updated to follow this pattern when it evaluates EL. I will comment on that PR for detail.


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r194932903
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/test/java/org/apache/nifi/processors/standard/TestLookupRecord.java ---
    @@ -400,6 +433,20 @@ public void addValue(final String key, final String value) {
             public Set<String> getRequiredKeys() {
                 return Collections.singleton("lookup");
             }
    +
    +        public void setRequiredCoordinates(Map<String, Object> expectedCoordinates) {
    +            this.expectedCoordinates = expectedCoordinates;
    --- End diff --
    
    This method should be renamed to `setExpectedContext`. Similarly expectedCoordinates should be `expectedContext`.


---

[GitHub] nifi pull request #2777: NIFI-5287 Made LookupRecord able to take in flowfil...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2777#discussion_r194933630
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-lookup-service-api/src/main/java/org/apache/nifi/lookup/LookupService.java ---
    @@ -35,6 +35,19 @@
          */
         Optional<T> lookup(Map<String, Object> coordinates) throws LookupFailureException;
     
    +    /**
    +     * Looks up a value that corresponds to the given map, coordinates. FlowFile attributes will also be passed into the
    +     * map labeled context.
    +     *
    +     * @param coordinates a Map of key/value pairs that indicate the information that should be looked up
    +     * @param context a Map of FlowFile attributes
    --- End diff --
    
    I'd suggest to make the description more lenient. LookupService is just an interface and is not limited to be called by Processors. This `context` map can be anything contextual variables. FlowFile attributes are an example.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @ottobackwards I think that's overkill because a Map is fine for the main variables and a Map also works for passing EL output. A major advantage of using two maps is that the "context" one is not set in stone as @markap14 pointed out and can change as our use changes without breaking much.


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    @markap14 that certainly makes sense, thanks for taking the time to respond


---

[GitHub] nifi issue #2777: NIFI-5287 Made LookupRecord able to take in flowfile attri...

Posted by ijokarumawak <gi...@git.apache.org>.
Github user ijokarumawak commented on the issue:

    https://github.com/apache/nifi/pull/2777
  
    Thanks @MikeThomsen LGTM +1, merging.


---