You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@nifi.apache.org by ottobackwards <gi...@git.apache.org> on 2018/05/09 15:59:28 UTC

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

GitHub user ottobackwards opened a pull request:

    https://github.com/apache/nifi/pull/2691

    NIFI-5170 Upgrad Grok to version 0.1.9

    Upgrade to the new java-grok release and update for changes in the library.
    This includes:
    
    - Changes to the namespace from io.thekraken to io.krakens
    - Refactoring to use the new GrokCompiler api
    - Refactoring to do customValidation, since Grok will throw an IllegalArgumentException is an expression
    references a Grok that is not defined, which is a change of behavior
    
    - ExtractGrok now supports the default patterns, so the patterns file property is no longer required
    
    Handles both the Record Reader and Legacy Processor
    
    Thank you for submitting a contribution to Apache NiFi.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? Is it referenced 
         in the commit message?
    
    - [x] Does your PR title start with NIFI-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    - [x] Is your initial contribution a single, squashed commit?
    
    ### For code changes:
    - [x] Have you ensured that the full suite of tests is executed via mvn -Pcontrib-check clean install at the root nifi folder?
    - [ ] Have you written or updated unit tests to verify your changes?
    - [x] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [ ] If applicable, have you updated the LICENSE file, including the main LICENSE file under nifi-assembly?
    - [ ] If applicable, have you updated the NOTICE file, including the main NOTICE file found under nifi-assembly?
    - [ ] If adding new Properties, have you added .displayName in addition to .name (programmatic access) for each of the new properties?
    
    ### For documentation related changes:
    - [ ] Have you ensured that format looks appropriate for the output in which it is rendered?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ottobackwards/nifi update-grok-019

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nifi/pull/2691.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2691
    
----
commit d05d72830cfb63458f5f87213be5a64ca12c3270
Author: Otto Fowler <ot...@...>
Date:   2018-05-08T19:53:20Z

    NIFI-5170 Upgrad Grok to version 0.1.9
    
    Upgrade to the new java-grok release and update for changes in the library.
    This includes:
    
    - Changes to the namespace from io.thekraken to io.krakens
    - Refactoring to use the new GrokCompiler api
    - Refactoring to do customValidation, since Grok will throw an IllegalArgumentException is an expression
    references a Grok that is not defined, which is a change of behavior
    
    Handles both the Record Reader and Legacy Processor

----


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187720132
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -80,18 +84,21 @@
         public static final String FLOWFILE_ATTRIBUTE = "flowfile-attribute";
         public static final String FLOWFILE_CONTENT = "flowfile-content";
         private static final String APPLICATION_JSON = "application/json";
    +    private static final String DEFAULT_PATTERN_NAME = "/default-grok-patterns.txt";
     
         public static final PropertyDescriptor GROK_EXPRESSION = new PropertyDescriptor.Builder()
             .name("Grok Expression")
    -        .description("Grok expression")
    +        .description("Grok expression. If other Grok expressions are referenced in this expression, they must be provided "
    +        + "in the Grok Pattern File if set or exist in the default Grok patterns")
             .required(true)
    -        .addValidator(validateGrokExpression())
    +        .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
    --- End diff --
    
    I think having a custom validator here was a better idea. Just checking that it's non-blank doesn't do much to help the user.


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187795865
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/default-grok-patterns.txt ---
    @@ -0,0 +1,115 @@
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    --- End diff --
    
    Ok, so it's a known quantity.


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187774740
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -179,17 +187,59 @@ public void onStopped() {
             bufferQueue.clear();
         }
     
    +    @Override
    +    protected Collection<ValidationResult> customValidate(final ValidationContext validationContext) {
    +        Collection<ValidationResult> problems = new ArrayList<>();
    +
    +        // validate the grok expression against configuration
    +        boolean namedCaptures = false;
    +        if (validationContext.getProperty(NAMED_CAPTURES_ONLY).isSet()) {
    +            namedCaptures = validationContext.getProperty(NAMED_CAPTURES_ONLY).asBoolean();
    +        }
    +        GrokCompiler grokCompiler = GrokCompiler.newInstance();
    +        String subject = GROK_EXPRESSION.getName();
    +        String input = validationContext.getProperty(GROK_EXPRESSION).getValue();
    +        if (validationContext.getProperty(GROK_PATTERN_FILE).isSet()) {
    +            try (final InputStream in = new FileInputStream(new File(validationContext.getProperty(GROK_PATTERN_FILE).getValue()));
    +                 final Reader reader = new InputStreamReader(in)) {
    +                grokCompiler.register(reader);
    +                grok = grokCompiler.compile(input, namedCaptures);
    +            } catch (IOException | GrokException | java.util.regex.PatternSyntaxException e) {
    +                problems.add(new ValidationResult.Builder()
    +                        .subject(subject)
    --- End diff --
    
    I needed to refactor this to be correct, sorry.  Please check the new commit


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187770394
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/default-grok-patterns.txt ---
    @@ -0,0 +1,115 @@
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    --- End diff --
    
    What's the origin of this file?


---

[GitHub] nifi issue #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/nifi/pull/2691
  
    Thanks for the review @MikeThomsen 


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187770374
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -179,17 +187,59 @@ public void onStopped() {
             bufferQueue.clear();
         }
     
    +    @Override
    +    protected Collection<ValidationResult> customValidate(final ValidationContext validationContext) {
    +        Collection<ValidationResult> problems = new ArrayList<>();
    +
    +        // validate the grok expression against configuration
    +        boolean namedCaptures = false;
    +        if (validationContext.getProperty(NAMED_CAPTURES_ONLY).isSet()) {
    +            namedCaptures = validationContext.getProperty(NAMED_CAPTURES_ONLY).asBoolean();
    +        }
    +        GrokCompiler grokCompiler = GrokCompiler.newInstance();
    +        String subject = GROK_EXPRESSION.getName();
    +        String input = validationContext.getProperty(GROK_EXPRESSION).getValue();
    +        if (validationContext.getProperty(GROK_PATTERN_FILE).isSet()) {
    +            try (final InputStream in = new FileInputStream(new File(validationContext.getProperty(GROK_PATTERN_FILE).getValue()));
    +                 final Reader reader = new InputStreamReader(in)) {
    +                grokCompiler.register(reader);
    +                grok = grokCompiler.compile(input, namedCaptures);
    +            } catch (IOException | GrokException | java.util.regex.PatternSyntaxException e) {
    +                problems.add(new ValidationResult.Builder()
    +                        .subject(subject)
    --- End diff --
    
    Why are you reusing the subject and input from the expression here? Is it because Grok uses the pattern to validate them?


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187774759
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/resources/default-grok-patterns.txt ---
    @@ -0,0 +1,115 @@
    +# Licensed to the Apache Software Foundation (ASF) under one or more
    --- End diff --
    
    It if from the standard serialization services, it existed for the GrokReader, I copied it over


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187774270
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -80,18 +84,21 @@
         public static final String FLOWFILE_ATTRIBUTE = "flowfile-attribute";
         public static final String FLOWFILE_CONTENT = "flowfile-content";
         private static final String APPLICATION_JSON = "application/json";
    +    private static final String DEFAULT_PATTERN_NAME = "/default-grok-patterns.txt";
     
         public static final PropertyDescriptor GROK_EXPRESSION = new PropertyDescriptor.Builder()
             .name("Grok Expression")
    -        .description("Grok expression")
    +        .description("Grok expression. If other Grok expressions are referenced in this expression, they must be provided "
    +        + "in the Grok Pattern File if set or exist in the default Grok patterns")
             .required(true)
    -        .addValidator(validateGrokExpression())
    +        .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
    --- End diff --
    
    I changed to the customValidate because the new grok no longer ignores missing named patterns when compiling.
    
    So if I had an expression %{FOO:foo}abc and tried to compile it without providing the FOO pattern to the compiler it would silently eat the error in the old version.
    
    In the current version, it throws an illegal argument exception.  So the validation needs to utilize the provided pattern file, so I didn't think it could be in Property validate.  I thought it needed to be in the custom validate, since it runs *after* all the regular validates.
    
    Does that make sense?



---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by MikeThomsen <gi...@git.apache.org>.
Github user MikeThomsen commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2691#discussion_r187795885
  
    --- Diff: nifi-nar-bundles/nifi-standard-bundle/nifi-standard-processors/src/main/java/org/apache/nifi/processors/standard/ExtractGrok.java ---
    @@ -80,18 +84,21 @@
         public static final String FLOWFILE_ATTRIBUTE = "flowfile-attribute";
         public static final String FLOWFILE_CONTENT = "flowfile-content";
         private static final String APPLICATION_JSON = "application/json";
    +    private static final String DEFAULT_PATTERN_NAME = "/default-grok-patterns.txt";
     
         public static final PropertyDescriptor GROK_EXPRESSION = new PropertyDescriptor.Builder()
             .name("Grok Expression")
    -        .description("Grok expression")
    +        .description("Grok expression. If other Grok expressions are referenced in this expression, they must be provided "
    +        + "in the Grok Pattern File if set or exist in the default Grok patterns")
             .required(true)
    -        .addValidator(validateGrokExpression())
    +        .addValidator(StandardValidators.NON_BLANK_VALIDATOR)
    --- End diff --
    
    Yeah. That seems like a behavior they really should have from the start.


---

[GitHub] nifi pull request #2691: NIFI-5170 Upgrad Grok to version 0.1.9

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/nifi/pull/2691


---