Posted to issues@metron.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/08/16 20:33:20 UTC

[jira] [Commented] (METRON-372) Enhance Statistical Operations Available for Use with the Profiler

    [ https://issues.apache.org/jira/browse/METRON-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15423342#comment-15423342 ] 

ASF GitHub Bot commented on METRON-372:
---------------------------------------

GitHub user nickwallen opened a pull request:

    https://github.com/apache/incubator-metron/pull/214

    METRON-372 Enhance Statistical Operations Available for Use with the Profiler

    ### [METRON-372](https://issues.apache.org/jira/browse/METRON-372)
    
    Only basic math functions are currently available in Stellar for use with the Profiler.  This makes it difficult for users to create even basic profiles, such as a running average.
    
    This can be seen in the following example, where the average must be calculated manually in Stellar.  Two variables, `sum` and `cnt`, must be maintained and then used to calculate the average.
    
    ```
    {
      "profile": "example3",
      "foreach": "ip_src_addr",
      "onlyif": "protocol == 'HTTP'",
      "init": {
        "sum": 0.0,
        "cnt": 0.0
      },
      "update": {
        "sum": "sum + resp_body_len",
        "cnt": "cnt + 1"
      },
      "result": "sum / cnt"
    }
    ```
    
    This change introduces a series of summary functions that make creating profiles much simpler for the user.  Instead of re-implementing the calculation of an average in Stellar, this leverages Commons Math to perform all the heavy lifting.
    
    The example above for calculating an average can be re-defined as follows.
    
    ```
    {
      "profile": "example3",
      "foreach": "ip_src_addr",
      "onlyif": "protocol == 'HTTP'",
      "init":   { "s": "STATS_INIT(0)" },
      "update": { "_": "STATS_ADD(length, s)" },
      "result": "STATS_MEAN(s)"
    }
    ```
    
    The following summary functions are supported.  These are all statistics that can be calculated in a single pass.  This means that none of the values being summarized are stored in memory.
    * count
    * mean
    * geometric mean
    * max
    * min
    * sum
    * population variance
    * variance
    * second moment
    * quadratic mean
    * standard deviation
    * sum of logs
    * sum of squares
    
    Note: This change depends on METRON-309.
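
To illustrate the single-pass property described above, here is a minimal Python sketch of an accumulator that tracks several of the listed statistics without retaining any observed value. It is not the Commons Math code this pull request relies on. The class name `RunningStats` and its methods are hypothetical, and Welford's update is used for the variance as one well-known single-pass technique. Geometric mean and sum of logs are left out for brevity.

```python
import math


class RunningStats:
    """Hypothetical single-pass accumulator; no observed value is retained."""

    def __init__(self):
        self.count = 0
        self.total = 0.0           # sum
        self.sum_sq = 0.0          # sum of squares
        self.minimum = math.inf
        self.maximum = -math.inf
        self._mean = 0.0
        self._m2 = 0.0             # second moment: sum of squared deviations

    def add(self, x):
        """Fold one value into the running state (Welford's update)."""
        self.count += 1
        self.total += x
        self.sum_sq += x * x
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)
        delta = x - self._mean
        self._mean += delta / self.count
        self._m2 += delta * (x - self._mean)

    def mean(self):
        return self._mean if self.count else math.nan

    def population_variance(self):
        return self._m2 / self.count if self.count else math.nan

    def variance(self):
        # Sample variance uses the n - 1 denominator.
        return self._m2 / (self.count - 1) if self.count > 1 else math.nan

    def standard_deviation(self):
        return math.sqrt(self.variance())

    def quadratic_mean(self):
        return math.sqrt(self.sum_sq / self.count) if self.count else math.nan


stats = RunningStats()
for value in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.add(value)
```

Each call to `add` touches only a fixed number of fields, so memory use stays constant however many messages are summarized.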

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/incubator-metron METRON-372

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/214.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #214
    
----
commit 11ef29bb73c8363b1c905f597be496929b77888f
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-07-29T20:53:49Z

    METRON-309 Create a normalcy profiler

commit 28560b36273271c438c3309df610a4c4306ffdde
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-11T20:46:00Z

    METRON-309 Corrected a typo in the 'Getting Started' instructions

commit d55d32efeea38f52bd9d1095aa6f8817bc15f0a5
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T13:16:19Z

    METRON-309 Altered based on Stellar Unification changes

commit ca81339e0f326dc04110eafd74fd0879dfe1a029
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T13:23:06Z

    METRON-309 Need to set the kafka broker in the Profiler topology properties

commit 712a1083c21eb8f7c0d81511d432c054792ebe9b
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T15:17:47Z

    METRON-309 Updated examples to use latest Stellar binary functions

commit 01aa1f972324f97b231b7d9a26f122891a1b6d50
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T18:03:49Z

    METRON-309 Fixed the README examples and added each as an integration test.

commit d9e2c5e568292c4a08c6c6c314d3c2e8a9ea38be
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T20:08:26Z

    METRON-367 Enhance Profiler to Support Multiple Numeric Types

commit b9366ff97dbc8968ac44a2ba8dfe7bca43e7adc6
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T20:32:51Z

    METRON-368 Simplify Profile Configuration with Sensible Defaults

commit 486e8138efbc64342b0bcf405adae521c55d4e33
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T16:51:07Z

    METRON-309 Removed legacy classes from Stellar Unification that are no longer needed

commit c3ce806606c0903fd62819e4a29ea91adb1979be
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T16:34:26Z

    METRON-372 Refactored the Stellar functions for clarity

commit eaf5255143d7c09ee8735d3c7d4fa34a58c25831
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T19:49:13Z

    METRON-372 Added summary statistics functions to Stellar

commit 06118874b28deaf62fb4b454672c6bce21ea39ee
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T20:01:13Z

    METRON-372 Updated example to show use of new STATS_x functions

----


> Enhance Statistical Operations Available for Use with the Profiler
> ------------------------------------------------------------------
>
>                 Key: METRON-372
>                 URL: https://issues.apache.org/jira/browse/METRON-372
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>
> Only basic math functions are currently available in Stellar for use with the Profiler.  This makes it difficult for users to create even basic profiles, such as a running average.  This can be seen in this example, where the average must be calculated manually in Stellar.
> ```
> {
>   "profile": "example3",
>   "foreach": "ip_src_addr",
>   "onlyif": "protocol == 'HTTP'",
>   "init": {
>     "sum": 0.0,
>     "cnt": 0.0
>   },
>   "update": {
>     "sum": "sum + resp_body_len",
>     "cnt": "cnt + 1"
>   },
>   "result": "sum / cnt"
> }
> ```
> Make it easier for users to create basic profiles like a running average.  Also, add additional summary functions to Stellar to extend the capabilities of the Profiler.  The following summary functions are targeted: min, max, mean, geometric mean, count, sum, sum of squares, standard deviation, and variance.
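
For readers unfamiliar with the profile format, the following Python sketch mimics what the manual `example3` definition in the issue expresses: filter with `onlyif`, group by `foreach`, seed state with `init`, fold each message in with `update`, and compute `result`. It only illustrates the bookkeeping and is not how the Profiler is implemented. The function name `run_profile` and the sample messages are made up, and any windowing or flushing the real Profiler performs is ignored.

```python
def run_profile(messages):
    """Hypothetical stand-in for the manual 'example3' running average."""
    state = {}
    for msg in messages:
        if msg.get("protocol") != "HTTP":        # "onlyif"
            continue
        entity = msg["ip_src_addr"]               # "foreach"
        s = state.setdefault(entity, {"sum": 0.0, "cnt": 0.0})  # "init"
        s["sum"] += msg["resp_body_len"]          # "update": sum
        s["cnt"] += 1                             # "update": cnt
    # "result": sum / cnt, evaluated once per entity
    return {entity: s["sum"] / s["cnt"] for entity, s in state.items()}


# Made-up sample messages for illustration only.
sample = [
    {"ip_src_addr": "10.0.0.1", "protocol": "HTTP", "resp_body_len": 100},
    {"ip_src_addr": "10.0.0.1", "protocol": "HTTP", "resp_body_len": 200},
    {"ip_src_addr": "10.0.0.2", "protocol": "HTTP", "resp_body_len": 50},
    {"ip_src_addr": "10.0.0.2", "protocol": "DNS", "resp_body_len": 999},
]
averages = run_profile(sample)
```

Every additional statistic written this way needs its own state variable and update expression, which is the burden the targeted summary functions are meant to remove.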


