Posted to dev@metron.apache.org by cestella <gi...@git.apache.org> on 2016/05/27 13:55:50 UTC

[GitHub] incubator-metron pull request: METRON-189: Add the ability to do g...

GitHub user cestella opened a pull request:

    https://github.com/apache/incubator-metron/pull/138

    METRON-189: Add the ability to do global validations on messages passing through the parser.

    Allow the user to specify field-level or message-level validations to ensure that messages coming from the parser are valid. For instance, allow the user to ensure that a field is an IPv4 address.
    If validation fails, send the message to a separate stream from the parser bolt. Follow-on work should be done to send this stream to the index for after-the-fact inspection.
    
    I added the following validation functions:
    * `MQL` : Execute a Query Language statement.  Expects the query string in the `condition` field of the config.
    * `IP` : Validates that the input fields are IP addresses.  By default, if no configuration is set, it assumes `IPV4`, but you can specify the type by passing `type` in the config with either `IPV6` or `IPV4`.
    * `DOMAIN` : Validates that the fields are all domains.
    * `EMAIL` : Validates that the fields are all email addresses.
    * `URL` : Validates that the fields are all URLs.
    * `DATE` : Validates that the fields are all dates.  Expects `format` in the config.
    * `INTEGER` : Validates that the fields are all integers.  String representation of an integer is allowed.
    * `REGEX_MATCH` : Validates that the fields match a regex.  Expects `pattern` in the config.
    * `NOT_EMPTY` : Validates that the fields exist and are not empty (after trimming).
    
    Because of the nice overlap, I also added these functions to the query language, so query language rules can take advantage of `IS_IP(field1)` for instance.
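
To make the field-level checks listed above concrete, here is a minimal sketch of how `NOT_EMPTY`, `INTEGER` and `REGEX_MATCH` might behave. It uses only the Java standard library. The class name, method names and the use of a plain `Map` for the message are assumptions made for illustration; they are not the classes added in this pull request. Treating a missing field as invalid for `INTEGER` and `REGEX_MATCH` is also an assumption.

```java
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical stand-ins for three of the validators listed above.
// Names and signatures are illustrative only, not Metron's API.
class FieldValidationSketch {

  // NOT_EMPTY: every listed field exists and is non-empty after trimming.
  static boolean notEmpty(Map<String, Object> message, List<String> fields) {
    for (String field : fields) {
      Object value = message.get(field);
      if (value == null || value.toString().trim().isEmpty()) {
        return false;
      }
    }
    return true;
  }

  // INTEGER: every listed field is an integer or a string representation of one.
  // Assumption: a missing field counts as invalid.
  static boolean isInteger(Map<String, Object> message, List<String> fields) {
    for (String field : fields) {
      Object value = message.get(field);
      if (value == null) {
        return false;
      }
      try {
        Long.parseLong(value.toString().trim());
      } catch (NumberFormatException e) {
        return false;
      }
    }
    return true;
  }

  // REGEX_MATCH: every listed field fully matches the configured pattern.
  // Assumption: a missing field counts as invalid.
  static boolean regexMatch(Map<String, Object> message, List<String> fields, String pattern) {
    Pattern compiled = Pattern.compile(pattern);
    for (String field : fields) {
      Object value = message.get(field);
      if (value == null || !compiled.matcher(value.toString()).matches()) {
        return false;
      }
    }
    return true;
  }
}
```

Each method takes the list of fields to check, mirroring the "fields are all ..." wording of the descriptions: a single failing field makes the whole validation fail.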

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/cestella/incubator-metron validation

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/138.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #138
    
----
commit 258ae1cde4cdb5092edeb00000dc6a50f0af2c1c
Author: cstella <ce...@gmail.com>
Date:   2016-05-27T02:46:35Z

    Added validation framework.

commit d2ae7b3a8a96787523755ce0107c64909e2729bd
Author: cstella <ce...@gmail.com>
Date:   2016-05-27T03:39:52Z

    Updating validators to work with unit tests.

commit ccc42b3d7390e98ade326c5651719d9b6d4533b8
Author: cstella <ce...@gmail.com>
Date:   2016-05-27T13:50:30Z

    Updating readme.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-metron pull request: METRON-189: Add the ability to do g...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the pull request:

    https://github.com/apache/incubator-metron/pull/138#issuecomment-222363626
  
    Yeah, the documentation placement is a bit off.  Documentation for configuration started to be placed where the configuration objects live, rather than where they are used.  Since we put the configuration in commons, the documentation went in commons too.  I strongly suggest that we do a follow-on, after we flush the PR queue, to move the docs around.



[GitHub] incubator-metron pull request #138: METRON-189: Add the ability to do global...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/138#discussion_r65461745
  
    --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
    @@ -140,18 +142,25 @@ public void execute(Tuple tuple) {
         try {
           boolean ackTuple = true;
           if(sensorParserConfig != null) {
    +        List<FieldValidator> fieldValidations = getConfigurations().getFieldValidations();
             List<JSONObject> messages = parser.parse(originalMessage);
             for (JSONObject message : messages) {
               if (parser.validate(message)) {
    -            if (filter != null && filter.emitTuple(message)) {
    -              ackTuple = !isBulk;
    -              message.put(Constants.SENSOR_TYPE, getSensorType());
    -              for (FieldTransformer handler : sensorParserConfig.getFieldTransformations()) {
    -                if (handler != null) {
    -                  handler.transformAndUpdate(message, sensorParserConfig.getParserConfig());
    +            if(!isGloballyValid(message, fieldValidations)) {
    +              message.put(Constants.SENSOR_TYPE, getSensorType()+ ".invalid");
    +              collector.emit(Constants.INVALID_STREAM, new Values(message));
    +            }
    +            else if (filter != null && filter.emitTuple(message)) {
    +              if (filter != null && filter.emitTuple(message)) {
    --- End diff --
    
    Isn't this a duplicate of the previous line?



[GitHub] incubator-metron pull request #138: METRON-189: Add the ability to do global...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/138#discussion_r65462284
  
    --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
    @@ -140,18 +142,25 @@ public void execute(Tuple tuple) {
         try {
           boolean ackTuple = true;
           if(sensorParserConfig != null) {
    +        List<FieldValidator> fieldValidations = getConfigurations().getFieldValidations();
             List<JSONObject> messages = parser.parse(originalMessage);
             for (JSONObject message : messages) {
               if (parser.validate(message)) {
    -            if (filter != null && filter.emitTuple(message)) {
    -              ackTuple = !isBulk;
    -              message.put(Constants.SENSOR_TYPE, getSensorType());
    -              for (FieldTransformer handler : sensorParserConfig.getFieldTransformations()) {
    -                if (handler != null) {
    -                  handler.transformAndUpdate(message, sensorParserConfig.getParserConfig());
    +            if(!isGloballyValid(message, fieldValidations)) {
    +              message.put(Constants.SENSOR_TYPE, getSensorType()+ ".invalid");
    +              collector.emit(Constants.INVALID_STREAM, new Values(message));
    +            }
    +            else if (filter != null && filter.emitTuple(message)) {
    +              if (filter != null && filter.emitTuple(message)) {
    --- End diff --
    
    Yep, bad merge.  Fixed.



[GitHub] incubator-metron pull request: METRON-189: Add the ability to do g...

Posted by james-sirota <gi...@git.apache.org>.
Github user james-sirota commented on the pull request:

    https://github.com/apache/incubator-metron/pull/138#issuecomment-222346239
  
    I think the documentation is misplaced.  https://github.com/cestella/incubator-metron/tree/validation/metron-platform/metron-common is really good, but probably does not belong in common.  I would put this with the parsers.  What was the reasoning for putting it here?



[GitHub] incubator-metron pull request #138: METRON-189: Add the ability to do global...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/138#discussion_r65462256
  
    --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
    @@ -165,8 +174,17 @@ public void execute(Tuple tuple) {
         }
       }
     
    +  private boolean isGloballyValid(JSONObject input, List<FieldValidator> validators) {
    +    boolean ret = true;
    +    for(FieldValidator validator : validators) {
    +      ret &= validator.isValid(input, getConfigurations().getGlobalConfig());
    --- End diff --
    
    Yep, agreed.



[GitHub] incubator-metron pull request #138: METRON-189: Add the ability to do global...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-metron/pull/138



[GitHub] incubator-metron issue #138: METRON-189: Add the ability to do global valida...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on the issue:

    https://github.com/apache/incubator-metron/pull/138
  
    +1



[GitHub] incubator-metron pull request #138: METRON-189: Add the ability to do global...

Posted by merrimanr <gi...@git.apache.org>.
Github user merrimanr commented on a diff in the pull request:

    https://github.com/apache/incubator-metron/pull/138#discussion_r65462013
  
    --- Diff: metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java ---
    @@ -165,8 +174,17 @@ public void execute(Tuple tuple) {
         }
       }
     
    +  private boolean isGloballyValid(JSONObject input, List<FieldValidator> validators) {
    +    boolean ret = true;
    +    for(FieldValidator validator : validators) {
    +      ret &= validator.isValid(input, getConfigurations().getGlobalConfig());
    --- End diff --
    
    Would it be more efficient to break on a failed validation?
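
For context on the question above: in Java, `ret &= ...` on booleans uses the non-short-circuit `&`, so every validator runs even after one has failed. A minimal sketch of the early-exit alternative follows. It swaps the PR's `FieldValidator` for the standard `java.util.function.Predicate` so that it is self-contained; the class name and generic signature are assumptions for illustration, not the code that was eventually merged.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical early-exit variant of isGloballyValid.
// Predicate<T> stands in for the PR's FieldValidator.
class ShortCircuitSketch {

  // Returns false at the first failing validator, so later
  // (possibly more expensive) validators are never invoked.
  static <T> boolean isGloballyValid(T message, List<Predicate<T>> validators) {
    for (Predicate<T> validator : validators) {
      if (!validator.test(message)) {
        return false;
      }
    }
    return true;
  }
}
```

The trade-off is that an early exit reports only that the message is invalid, not which of the remaining validators would also have failed.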



[GitHub] incubator-metron pull request: METRON-189: Add the ability to do g...

Posted by james-sirota <gi...@git.apache.org>.
Github user james-sirota commented on the pull request:

    https://github.com/apache/incubator-metron/pull/138#issuecomment-222347328
  
    I spot-checked the validators and they work.  I checked IP, domain, URL and date using YAF and Bro.  I did not test the negative case since the dead letter queue does not yet exist.  One thing of note: I see that you are using commons validator, and in my previous experience this library has pretty significant performance issues validating IPs.  I think checking with a regex was 10 times faster or something around there.  Once we get to performance tuning this may be something to keep in mind.  Otherwise great job!  +1
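
As an illustration of the regex approach mentioned above, here is a minimal sketch of an IPv4 check built on a precompiled `java.util.regex.Pattern`. The class and method names are hypothetical, and the sketch makes no performance claim of its own: the "10 times faster" figure is the commenter's recollection and would need to be measured against commons validator before anything is changed.

```java
import java.util.regex.Pattern;

// Hypothetical regex-based IPv4 check; not part of this pull request.
class RegexIpv4Sketch {

  // One octet: 0-255 with no leading zeros.
  private static final String OCTET = "(25[0-5]|2[0-4]\\d|1\\d\\d|[1-9]?\\d)";

  // Four dot-separated octets; compiled once and reused across calls.
  private static final Pattern IPV4 =
      Pattern.compile(OCTET + "(\\." + OCTET + "){3}");

  static boolean isIpv4(String candidate) {
    // matches() requires the whole string to match, so no anchors are needed.
    return candidate != null && IPV4.matcher(candidate).matches();
  }
}
```

Compiling the pattern once in a static field matters here, since recompiling it for every message would likely dominate the cost of the check itself.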

