Posted to dev@metron.apache.org by ottobackwards <gi...@git.apache.org> on 2017/04/13 21:27:17 UTC

[GitHub] incubator-metron pull request #530: METRON-777 Metron Extension System and P...

GitHub user ottobackwards opened a pull request:

    https://github.com/apache/incubator-metron/pull/530

    METRON-777 Metron Extension System and Parser Extensions

    ## Contributor Comments
    
    This PR introduces an extension system for Metron and refactors the Metron parsers on top of it.  It is the base work for METRON-258 (Sideloading parsers), which is to follow, as it enables the creation and management of extensions outside of the Metron codebase.  The side-loading work itself, the ability to install and deploy 3rd party extensions/parsers, is left to that follow-on.
    
    There is a lot that can be done with this, but I could nibble at it forever, and I'd like to get feedback and improvements going.  There is still more documentation work that can be done for example.
    
    The areas of change:
    
    ### Metron Maven Bundle Plugin
    - adaptation of the nifi plugin
    - more configurable wrt file extension/dependency and metadata naming
    - new pre-build step on clean systems to install plugin
    
    ### bundle-lib
    My goal here was not to make any radical changes
     - adaptation of nifi-nar-utils to be used outside of the nifi project
     - rudimentary extensibility to allow configuration and injection of service types and other things that were hard coded to nifi 
     - refactored from File based to VFS based
     - introduced class cluster for FileUtils to allow for specialized HDFS file functionality ( HDFS with VFS is read only )
     - rebranding to Bundle from Nar ( although the lib and the plugin allow that to be configured now )
     - added capability to the properties class to write to stream, adapted to uri from paths
     - added integration tests for hdfs
    
    ### Metron Maven Parser Extension Archetype
     - locally installable archetype for creating Metron Parsers
     - <provided> dependencies instead of shading
     - builds Bundle
     - creates an assembly with bundle and configuration
     - configuration for parsers now includes all parser related configurations ( except ES and Logrotate )
     - Includes sample data for testing, global config etc. ( such that you don't need metron code to build and test ).
     - Can be used with configuration-only parsers, so that you can still unit and integration test them and deploy them the same way as other parsers
     - Creates documentation readme.
    
    ### Metron-Extensions / Metron-Parser-Extensions/**
    - Module area for extensions, first extension type is parsers
    - All parsers re-based on archetype generated projects
    - All parser test data/ configuration located with parser ( see above )
    - Each parser has a readme that should be filled out, but I have not done that yet
    
    ### Metron-Parsers
     - Removed all parsers and their tests etc except CSV, JSON, GROK
     - Still shaded, still the storm loaded jar
     - extended or fixed tests so that they work when derived outside of the code tree
     - Parser bolt no longer takes a MessageParser<> instance; it loads the parser from the extension/bundle system
    
    ### Metron-configuration
     - changes to support new parser locations
     - added functionality to load and store bundle.properties to zk
    
    ### Metron Tests
     - Extended to work with relative path / formatted paths
    
    ### RPM-Build
     - Copy all the parser extensions
     - Include in the spec
    
    /usr/metron/V/ now has new directories for extensions:
    
    /extension_etc/PARSER_NAME/  -> that parser's configuration
    /extension_alt_etc/ -> location for 3rd Party extension configuration
    /extension_lib -> location on disk for rpm to place bundles
    /extension_alt_lib -> location on disk for staging 3rd party bundles
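
A minimal Python sketch of the layout above, for illustration only. It assumes that `V` in `/usr/metron/V/` stands for the installed Metron version; the function name `extension_dirs`, the dictionary keys, and the `0.3.1` / `asa` example values are hypothetical and not part of the PR.

```python
from pathlib import PurePosixPath


def extension_dirs(version, parser_name, root="/usr/metron"):
    """Return the extension directories described above for one parser.

    Assumption: the "V" path segment is the Metron version string.
    """
    base = PurePosixPath(root) / version
    return {
        # that parser's configuration
        "parser_config": base / "extension_etc" / parser_name,
        # configuration for 3rd party extensions
        "third_party_config": base / "extension_alt_etc",
        # where the RPM places bundles on disk
        "bundles": base / "extension_lib",
        # staging area for 3rd party bundles
        "third_party_bundles": base / "extension_alt_lib",
    }


dirs = extension_dirs("0.3.1", "asa")
for name, path in dirs.items():
    print(f"{name}: {path}")
```

Keeping the four locations in one helper makes it harder for deployment scripts and tests to drift apart on path names.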
    
    ### METRON-SERVICE ambari
     - Load zk configurations for parsers from their location
     - fills out the properties template and deploys to hdfs
     - create HDFS directories
     - deploy/copy bundles to HDFS
    
    ### the metron workflow that this enables
    We need a new parser:
    * create with archetype under metron-extensions/metron-parser-extensions
    * implement including tests and test data, all configurations
    * add to the copy-resources of RPM-Docker pom
    * add to the spec file
    * add to the all_parsers variable in params
     - this will get it installed but not started, no ES no log rotate
    * add to parsers variable in the env.xml to get it to start as well ( still no ES or Log rotate )
    * other steps to get the ES template integrated with indexing scripts and log rotate with ansible
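
The first workflow step could be scripted roughly as follows. This is a sketch under assumptions: `-DarchetypeGroupId`, `-DarchetypeArtifactId`, `-DarchetypeVersion`, `-DgroupId`, `-DartifactId` and `-DinteractiveMode` are standard `archetype:generate` parameters, but the archetype coordinates and the generated artifact name below are hypothetical placeholders. Check the archetype's readme for the real values and for any extra properties it requires.

```python
import shlex

# Hypothetical coordinates; replace with the archetype's real ones.
ARCHETYPE_GROUP = "org.apache.metron"
ARCHETYPE_ARTIFACT = "metron-maven-parser-extension-archetype"


def archetype_command(parser_name, archetype_version):
    """Build the argv for generating a new parser extension project."""
    return [
        "mvn", "archetype:generate",
        "-DinteractiveMode=false",  # do not prompt for missing values
        "-DarchetypeGroupId=" + ARCHETYPE_GROUP,
        "-DarchetypeArtifactId=" + ARCHETYPE_ARTIFACT,
        "-DarchetypeVersion=" + archetype_version,
        "-DgroupId=" + ARCHETYPE_GROUP,
        # Hypothetical naming scheme for the generated module.
        "-DartifactId=metron-parser-" + parser_name,
    ]


cmd = archetype_command("example", "0.3.1")
# Print a copy-pasteable shell command instead of running Maven here.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

To actually generate the project, the list can be passed to `subprocess.run(cmd, check=True)` from within `metron-extensions/metron-parser-extensions`, after the archetype has been installed locally.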
    
    I have been working in Full Dev to get this going, and I believe it is working enough to get this started.
    At the end of vagrant up with full dev, you should have data in kibana, as if nothing had changed ;)
    
    There are issues however:
    I have not integrated this with the Metron Docker project, I'm not sure how yet.
    I have fixed Metron-Interface to get the tests to run, but I think that more work needs to be done there.
    
    The next steps here are follow ons for installing parsers from the ui.
    
    Testing:  Verify that the build and tests run and that the Vagrant environments work, find out what is broken with Docker, and try AWS if you can do it.
    Build a parser from the archetype and see that it builds and that the tests run
    
    Basic smoke test of system?
    
    I am sure I missed many things, or that there are things that could be better.  Thank you in advance for your review.
    
    
    ## Pull Request Checklist
    
    Thank you for submitting a contribution to Apache Metron (Incubating).  
    Please refer to our [Development Guidelines](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=61332235) for the complete guide to follow for contributions.  
    Please refer also to our [Build Verification Guidelines](https://cwiki.apache.org/confluence/display/METRON/Verifying+Builds?show-miniview) for complete smoke testing guides.  
    
    
    In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:
    
    ### For all changes:
    - [x] Is there a JIRA ticket associated with this PR? If not one needs to be created at [Metron Jira](https://issues.apache.org/jira/browse/METRON/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel). 
    - [x] Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
    - [x] Has your PR been rebased against the latest commit within the target branch (typically master)?
    
    
    ### For code changes:
    - [x] Have you included steps to reproduce the behavior or problem that is being changed or addressed?
    - [x] Have you included steps or a guide to how the change may be verified and tested manually?
    - [x] Have you ensured that the full suite of tests and checks have been executed in the root incubating-metron folder via:
      ```
      mvn -q clean integration-test install && build_utils/verify_licenses.sh 
      ```
    
    - [x] Have you written or updated unit tests and or integration tests to verify your changes?
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)? 
    - [x] Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?
    
    ### For documentation related changes:
    - [x] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and then verify changes via `site-book/target/site/index.html`:
    
      ```
      cd site-book
      bin/generate-md.sh
      mvn site:site
      ```
    
    #### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
    It is also recommended that [travis-ci](https://travis-ci.org) is set up for your personal repository such that your branches are built there before submitting a pull request.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ottobackwards/incubator-metron METRON-777

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/530.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #530
    
----
commit 864d320d91c522dfc2eb63fc12341f316a3f8952
Author: Otto Fowler <ot...@gmail.com>
Date:   2017-03-17T04:56:49Z

    Metron Extension system
    
    Based on Apache Nifi Nars
    
    NAR changes
    * new lib , rebrand to bundles from NAR
    * port to VFS/FileObject from File based
    * ability to set property values
    * Rework FileUtils so that you can derive and override
    * added initializers to set 'classes' that we care about instead of hard coding them, still needs defaults
    * added components nec. for integration tests ( do not want dep. on metron-* )
    * VFSClassloader for NarClassLoader
    * Hdfs based integration test version of unpacknars tests
    * HDFS ( filesystem ) based fileutilities to cover for writes to hdfs, since VFS is currently R/O HDFS
    * modified plugin to support configuration of outputs
    * use class index not service loader ( both subclass and annotated supported )
    
    Archetype
    * Parser Extension archetype
    * includes all configuration
    * creates tar.gz with bundle and configuration
    * class index support ( automatic generation )
    
    Extensions
    * new extensions modules
    * parser
    * archetype built module for each parser type
    * support for configuration only parsers with tests
    
    Parsers
    * moved all but json, csv, grok to extensions
    * Bolt now loads from bundle properties
    
    Deployment
    * rpms for parsers
    * create extension directories
    * ambari initializes zookeeper per parser
    * ambari creates hdfs directories
    * ISSUE: Writing to hdfs
    
    Rest-API
    * only test against parsers in metron-parsers
    * still needs integration

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] incubator-metron issue #530: METRON-777 Metron Extension System and Parser E...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/incubator-metron/pull/530
  
    RE: parser size
    The metron-parser jar is still shaded; it is a 97M JAR, 87M archive.
    The individual parsers, like the ASA, are 44k JAR, 44k Bundle, 49K tar.gz each.



[GitHub] incubator-metron issue #530: METRON-777 Metron Extension System and Parser E...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/incubator-metron/pull/530
  
    Even if you don't review the parsers because the code didn't change, you still need to review or think about the concept that each parser, as an extension, should have all of the things ( configuration etc ) that it needs within its package.  So the parsers have their grok statements and their configuration ( index, enrichment, and parsers, and in the future ES + log rotate ).



[GitHub] incubator-metron issue #530: METRON-777 Metron Extension System and Parser E...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/incubator-metron/pull/530
  
    So would you like a listing of files?
    All the areas of functionality that are not bundle or parser extensions would be the areas to look at.
    Although I would say that the bundle-lib should still be reviewed in general.
    
    Here is a high-level, file-by-file guide, but I may be missing something
    
    // changes for OOM issues in travis container
    .travis.yml
    README.md
    
    // packaging and deployment
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/configuration/metron-env.xml
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/metainfo.xml
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/metron_service.py
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/params/params_linux.py
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/params/status_params.py
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/scripts/parser_commands.py
    metron-deployment/packaging/ambari/metron-mpack/src/main/resources/common-services/METRON/CURRENT/package/templates/bundle.properties.j2
    metron-deployment/packaging/docker/rpm-docker/SPECS/metron.spec
    metron-deployment/packaging/docker/rpm-docker/pom.xml
    
    metron-extensions/README.md
    metron-extensions/metron-parser-extensions/README.md
    
    // asa integration test, but from bundle, not dep on asa
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/pom.xml
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/java/org/apache/metron/parsers/ASABundleHDFSIntegrationTest.java
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/metron/extension_contrib_lib/metron-parser-test-bundle-1.0-SNAPSHOT.bundle
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/metron/extension_lib/metron-parser-asa-bundle-0.3.1.bundle
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/zookeeper/bundle.properties
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/zookeeper/enrichments/test.json
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/zookeeper/global.json
    metron-extensions/metron-parser-extensions/metron-parser-bundle-tests/src/test/resources/zookeeper/indexing/test.json
    
    // fixes for tests
    metron-interface/metron-rest/src/main/java/org/apache/metron/rest/MetronRestConstants.java
    metron-interface/metron-rest/src/test/java/org/apache/metron/rest/controller/GrokControllerIntegrationTest.java
    metron-interface/metron-rest/src/test/java/org/apache/metron/rest/controller/KafkaControllerIntegrationTest.java
    metron-interface/metron-rest/src/test/java/org/apache/metron/rest/controller/SensorParserConfigControllerIntegrationTest.java
    metron-interface/metron-rest/src/test/java/org/apache/metron/rest/service/impl/GrokServiceImplTest.java
    metron-interface/metron-rest/src/test/java/org/apache/metron/rest/service/impl/SensorParserConfigServiceImplTest.java
    
    //new functions for bundle.properties and tests - dealing with paths etc
    metron-platform/metron-common/src/main/java/org/apache/metron/common/configuration/ConfigurationsUtils.java
    metron-platform/metron-common/src/test/java/org/apache/metron/common/cli/ConfigurationsUtilsTest.java
    metron-platform/metron-common/src/test/java/org/apache/metron/common/configuration/SensorEnrichmentConfigTest.java
    
    //support new directories
    metron-platform/metron-enrichment/src/test/java/org/apache/metron/enrichment/integration/components/ConfigUploadComponent.java
    
    // load parsers from bundles
    metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserBolt.java
    metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/bolt/ParserLoader.java
    metron-platform/metron-parsers/src/main/java/org/apache/metron/parsers/topology/ParserTopologyBuilder.java
    
    // the system bundle properties for tests
    metron-platform/metron-integration-test/src/main/config/zookeeper/bundle.properties
    
    // test function fixes for new paths
    metron-platform/metron-management/src/test/java/org/apache/metron/management/ConfigurationFunctionsTest.java
    metron-platform/metron-management/src/test/java/org/apache/metron/management/ParserConfigFunctionsTest.java
    
    // fix to work with new paths
    metron-platform/metron-test-utilities/src/main/java/org/apache/metron/test/utils/SampleDataUtils.java
    
    




[GitHub] incubator-metron issue #530: METRON-777 Metron Extension System and Parser E...

Posted by ottobackwards <gi...@git.apache.org>.
Github user ottobackwards commented on the issue:

    https://github.com/apache/incubator-metron/pull/530
  
    RE: Travis
    I was getting these errors, and moving to the VM was the only way I could get rid of them:
    
    
    The system is out of resources.
    Consult the following stack trace for details.
    java.lang.OutOfMemoryError: Java heap space
    	at java.util.Arrays.copyOf(Arrays.java:3181)
    	at java.util.ArrayList.grow(ArrayList.java:261)
    	at java.util.ArrayList.ensureExplicitCapacity(ArrayList.java:235)
    	at java.util.ArrayList.ensureCapacityInternal(ArrayList.java:227)
    	at java.util.ArrayList.add(ArrayList.java:458)
    	at com.sun.tools.javac.file.ZipFileIndex$ZipDirectory.readEntry(ZipFileIndex.java:674)
    	at com.sun.tools.javac.file.ZipFileIndex$ZipDirectory.buildIndex(ZipFileIndex.java:578)
    	at com.sun.tools.javac.file.ZipFileIndex$ZipDirectory.access$000(ZipFileIndex.java:485)
    	at com.sun.tools.javac.file.ZipFileIndex.checkIndex(ZipFileIndex.java:193)
    	at com.sun.tools.javac.file.ZipFileIndex.<init>(ZipFileIndex.java:137)
    	at com.sun.tools.javac.file.ZipFileIndexCache.getZipFileIndex(ZipFileIndexCache.java:100)
    	at com.sun.tools.javac.file.JavacFileManager.openArchive(JavacFileManager.java:598)
    	at com.sun.tools.javac.file.JavacFileManager.openArchive(JavacFileManager.java:545)
    	at com.sun.tools.javac.file.JavacFileManager.listContainer(JavacFileManager.java:429)
    	at com.sun.tools.javac.file.JavacFileManager.list(JavacFileManager.java:676)
    	at com.sun.tools.javac.code.ClassFinder.scanUserPaths(ClassFinder.java:564)
    	at com.sun.tools.javac.code.ClassFinder.fillIn(ClassFinder.java:504)
    	at com.sun.tools.javac.code.ClassFinder.complete(ClassFinder.java:287)
    	at com.sun.tools.javac.code.ClassFinder.access$000(ClassFinder.java:72)
    	at com.sun.tools.javac.code.ClassFinder$1.complete(ClassFinder.java:159)
    	at com.sun.tools.javac.code.Symbol.complete(Symbol.java:579)
    	at com.sun.tools.javac.comp.Enter.visitTopLevel(Enter.java:299)
    	at com.sun.tools.javac.tree.JCTree$JCCompilationUnit.accept(JCTree.java:509)
    	at com.sun.tools.javac.comp.Enter.classEnter(Enter.java:255)
    	at com.sun.tools.javac.comp.Enter.classEnter(Enter.java:270)
    	at com.sun.tools.javac.comp.Enter.complete(Enter.java:483)
    	at com.sun.tools.javac.comp.Enter.main(Enter.java:467)
    	at com.sun.tools.javac.main.JavaCompiler.enterTrees(JavaCompiler.java:952)
    	at com.sun.tools.javac.main.JavaCompiler.compile(JavaCompiler.java:833)
    	at com.sun.tools.javac.main.Main.compile(Main.java:253)
    	at com.google.errorprone.BaseErrorProneCompiler.run(BaseErrorProneCompiler.java:214)
    	at com.google.errorprone.BaseErrorProneCompiler.run(BaseErrorProneCompiler.java:106)
    [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.5.1:compile (default-compile) on project metron-parser-lancope: Compilation failure -> [Help 1]
    [ERROR] 



[GitHub] incubator-metron issue #530: METRON-777 Metron Extension System and Parser E...

Posted by cestella <gi...@git.apache.org>.
Github user cestella commented on the issue:

    https://github.com/apache/incubator-metron/pull/530
  
    Woah, big contribution here; thanks @ottobackwards !  So this one is hard to review because a lot of it is:
    * Copied from NiFi's nar
    * Moving files around.
    
    Would you mind giving us a list of files where changes are made that don't fit those two categories?  I think that'd help us isolate the bits to review more easily.
    
    In the meantime, I have a couple of questions:
    * Could you go over again why we needed the VM in travis?
    * What is the parser file size impact?  In other words, when we create a new bundle for a parser, are we shading and including all of metron-parser or is that isolated from the parser?

