Posted to issues@metron.apache.org by GitBox <gi...@apache.org> on 2019/01/28 21:52:48 UTC

[GitHub] nickwallen opened a new pull request #1326: METRON-1974 Batch Profiler Should Handle Errant Profiles Better

nickwallen opened a new pull request #1326: METRON-1974 Batch Profiler Should Handle Errant Profiles Better
URL: https://github.com/apache/metron/pull/1326
 
 
   If the Batch Profiler runs a profile with invalid or errant logic, the following exception is thrown. It does not tell the user what the root cause of the problem is.
   
   ```
   19/01/28 19:11:28 ERROR TaskSetManager: Task 7 in stage 5.0 failed 4 times; aborting job
   Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 7 in stage 5.0 failed 4 times, most recent failure: Lost task 7.3 in stage 5.0 (TID 12, nat-r7-dwys-metron-3.openstacklocal, executor 2): java.lang.IndexOutOfBoundsException: Index: 0, Size: 0
   at java.util.ArrayList.rangeCheck(ArrayList.java:657)
   at java.util.ArrayList.get(ArrayList.java:433)
   at org.apache.metron.profiler.spark.function.ProfileBuilderFunction.call(ProfileBuilderFunction.java:97)
   at org.apache.metron.profiler.spark.function.ProfileBuilderFunction.call(ProfileBuilderFunction.java:49)
   at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$mapGroups$1.apply(KeyValueGroupedDataset.scala:219)
   at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$mapGroups$1.apply(KeyValueGroupedDataset.scala:219)
   at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$1.apply(KeyValueGroupedDataset.scala:196)
   ```
   
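   This makes it difficult for a user to see that the problem lies in their profile definition.  With this change, the exception is clearer: the root cause (here, an unknown function) is reported, followed by an error that points the user at the profile.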
   
   ```
   Caused by: java.lang.IllegalStateException: Unknown function: `INVALID_FUNCTION`
   	at org.apache.metron.stellar.dsl.functions.resolver.BaseFunctionResolver.apply(BaseFunctionResolver.java:150)
   	at org.apache.metron.stellar.dsl.functions.resolver.BaseFunctionResolver.apply(BaseFunctionResolver.java:48)
   	at org.apache.metron.stellar.common.StellarCompiler.resolveFunction(StellarCompiler.java:691)
   	... 52 more
   2019-01-28 16:15:20 DEBUG DefaultProfileBuilder:212 - Flushed profile: profile=invalid-profile, entity=total, maxTime=1530978728982, period=1701087, start=1530978300000, end=1530979200000, duration=900000
   2019-01-28 16:15:20 ERROR ProfileBuilderFunction:104 - No profile measurement can be calculated. Review the profile for bugs. profile=my-invalid-profile, entity=total, period=1701087
   2019-01-28 16:15:20 ERROR Executor:91 - Exception in task 7.0 in stage 5.0 (TID 12)
   java.lang.IllegalStateException: No profile measurement can be calculated. Review the profile for bugs. group=invalid-profile-total-1701087
   	at org.apache.metron.profiler.spark.function.ProfileBuilderFunction.call(ProfileBuilderFunction.java:105)
   	at org.apache.metron.profiler.spark.function.ProfileBuilderFunction.call(ProfileBuilderFunction.java:51)
   	at org.apache.spark.sql.KeyValueGroupedDataset$$anonfun$mapGroups$1.apply(KeyValueGroupedDataset.scala:219)
   ...
   ```
   
   The following message indicates a problem with the user's profile definition.
   ```
   No profile measurement can be calculated. Review the profile for bugs. 
   profile=my-invalid-profile, entity=192.168.1.21, period=1701087
   ```
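
   The difference between the two stack traces can be illustrated with a minimal sketch. This is not the actual `ProfileBuilderFunction` code; the class `MeasurementGuard` and the method `firstOrFail` are hypothetical names used only for illustration. It assumes the flushed measurements arrive as a `List` and that an errant profile yields an empty list, which is what the `Index: 0, Size: 0` error in the first trace suggests.

   ```java
   import java.util.Collections;
   import java.util.List;

   // Hypothetical helper for illustration; not part of Metron.
   final class MeasurementGuard {

       private MeasurementGuard() {}

       /**
        * Returns the first measurement, or fails with a message that points
        * at the profile definition instead of an opaque index error.
        */
       static <T> T firstOrFail(List<T> measurements, String profile, String entity, long period) {
           if (measurements.isEmpty()) {
               // Without this check, measurements.get(0) would throw
               // IndexOutOfBoundsException: Index: 0, Size: 0
               throw new IllegalStateException(String.format(
                   "No profile measurement can be calculated. Review the profile for bugs. "
                       + "profile=%s, entity=%s, period=%d", profile, entity, period));
           }
           return measurements.get(0);
       }

       public static void main(String[] args) {
           try {
               // An errant profile is assumed to produce no measurements.
               firstOrFail(Collections.<String>emptyList(), "my-invalid-profile", "total", 1701087L);
           } catch (IllegalStateException e) {
               System.out.println(e.getMessage());
           }
       }
   }
   ```

   Checking for the empty case up front lets the error carry the profile name, entity, and period, so the user knows which definition to review.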
   
   
   
   ## Testing
   
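   Create an invalid profile definition like the following and run it through the Batch Profiler.  The second profile, `my-invalid-profile`, is the errant one: its `result` expression calls the undefined function `INVALID_FUNCTION`.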
   ```
       {
         "timestampField": "timestamp",
         "profiles": [
            {
              "profile": "count-by-ip",
              "foreach": "ip_src_addr",
              "init": { "count": 0 },
              "update": { "count" : "count + 1" },
              "result": "count"
            },
            {
              "profile": "my-invalid-profile",
              "foreach": "'total'",
              "init": { "count": 0 },
              "update": { "count": "count + 1" },
              "result": "INVALID_FUNCTION(count)"
            }
         ]
       }
   ```
   
   ## Pull Request Checklist
   
   - [ ] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel).
   - [ ] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
   - [ ] Has your PR been rebased against the latest commit within the target branch (typically master)?
   
   
   ### For code changes:
   - [ ] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
   - [ ] Have you included steps or a guide to how the change may be verified and tested manually?
   - [ ] Have you ensured that the full suite of tests and checks has been executed in the root metron folder via:
     ```
     mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
     ```
   
   - [ ] Have you written or updated unit tests and/or integration tests to verify your changes?
   - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
   - [ ] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
   
   ### For documentation related changes:
   - [ ] Have you ensured that the format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not, run the following commands and then verify the changes via `site-book/target/site/index.html`:
   
     ```
     cd site-book
     mvn site
     ```
   
   #### Note:
   Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
   It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services