Posted to issues@metron.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/08/17 20:21:20 UTC

[jira] [Commented] (METRON-377) Enable Profiles that Use Non-Single Pass Summary Functions

    [ https://issues.apache.org/jira/browse/METRON-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425288#comment-15425288 ] 

ASF GitHub Bot commented on METRON-377:
---------------------------------------

GitHub user nickwallen opened a pull request:

    https://github.com/apache/incubator-metron/pull/215

    METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions

    Note: This change depends on #208, #212, #213, and #214. The diff will be easier to grok once those PRs are merged.
    
    ### [METRON-377](https://issues.apache.org/jira/browse/METRON-377)
    
    As of METRON-372 and #214, Profiles can be built using many statistical summaries that require only a single pass over the data.  Single-pass summaries are less memory intensive and scale better under high-volume loads.
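
To illustrate why single-pass summaries need so little memory, here is a minimal Python sketch using Welford's online algorithm for the mean and variance. It is not Metron's implementation; the class name `RunningStats` and its methods are hypothetical and exist only for this example.

```python
import math


class RunningStats:
    """Hypothetical single-pass summary: constant memory for any input size."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        # Welford's update: fold one value into the running mean and M2.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; undefined (NaN) for fewer than two values.
        if self.count < 2:
            return math.nan
        return self._m2 / (self.count - 1)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.add(value)

print(stats.count, stats.mean, stats.variance())
```

Only three numbers are retained no matter how many values arrive, which is what makes this style of summary attractive for high-volume streams.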
    
    Unfortunately, not all functions can be calculated in a single pass.  In particular, skewness, kurtosis, and percentiles require the data to be held in memory for the calculation to occur.  This change enhances the platform so that a user can leverage skewness, kurtosis, and percentiles.
    
    ### Changes
    
    The `STATS_INIT` function was enhanced to accept a `window_size`.  This defines the number of input data elements that are maintained in memory.  
    
    If the `window_size` is greater than 0, a rolling window of the most recent `window_size` elements is maintained in memory.  The skewness, kurtosis, and percentiles are calculated over this rolling window.  The `window_size` must be greater than 0; otherwise these values cannot be calculated.
    
    If a user supplies a `window_size` equal to 0, the more efficient implementation that does not maintain a rolling window in memory is used.  In this case, the skewness, kurtosis, and percentiles cannot be calculated.
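
The `window_size` semantics described above can be sketched roughly as follows. This is an illustrative Python model, not the code in this PR: the class `WindowedStats` is hypothetical, returning NaN when no window is kept is an assumption, and the moment and percentile formulas (population moments, linear interpolation between ranks) are just one common convention that may differ from what Metron computes.

```python
from collections import deque
import math


class WindowedStats:
    """Hypothetical model of a summary with an optional rolling window."""

    def __init__(self, window_size=0):
        if window_size < 0:
            raise ValueError("window_size must be >= 0")
        # window_size == 0: keep nothing in memory (single-pass mode only).
        self._window = deque(maxlen=window_size) if window_size > 0 else None

    def add(self, x):
        # With maxlen set, the deque silently drops the oldest element.
        if self._window is not None:
            self._window.append(float(x))

    def _central_moment(self, k):
        n = len(self._window)
        mean = sum(self._window) / n
        return sum((v - mean) ** k for v in self._window) / n

    def skewness(self):
        # Assumption: NaN signals "cannot be calculated".
        if not self._window:
            return math.nan
        m2 = self._central_moment(2)
        return math.nan if m2 == 0 else self._central_moment(3) / m2 ** 1.5

    def kurtosis(self):
        # Population excess kurtosis over the current window.
        if not self._window:
            return math.nan
        m2 = self._central_moment(2)
        return math.nan if m2 == 0 else self._central_moment(4) / m2 ** 2 - 3.0

    def percentile(self, p):
        # Linear interpolation between the two closest ranks.
        if not self._window:
            return math.nan
        ordered = sorted(self._window)
        rank = (p / 100.0) * (len(ordered) - 1)
        lo, hi = math.floor(rank), math.ceil(rank)
        return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


windowed = WindowedStats(window_size=4)
single_pass = WindowedStats(window_size=0)
for value in [1, 2, 3, 4, 5, 6]:
    windowed.add(value)
    single_pass.add(value)

# The window now holds only the four most recent values: 3, 4, 5, 6.
print(windowed.percentile(50), windowed.skewness(), windowed.kurtosis())
print(single_pass.percentile(50))
```

The trade-off is visible in the constructor: memory grows with `window_size`, and the windowed statistics describe only the most recent `window_size` values rather than the full stream.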
    
    The following functions were also added.
    * STATS_KURTOSIS
    * STATS_SKEWNESS
    * STATS_PERCENTILES
    
    An additional integration test was added to validate this end to end.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nickwallen/incubator-metron METRON-377

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/incubator-metron/pull/215.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #215
    
----
commit 11ef29bb73c8363b1c905f597be496929b77888f
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-07-29T20:53:49Z

    METRON-309 Create a normalcy profiler

commit 28560b36273271c438c3309df610a4c4306ffdde
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-11T20:46:00Z

    METRON-309 Corrected a typo in the 'Getting Started' instructions

commit d55d32efeea38f52bd9d1095aa6f8817bc15f0a5
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T13:16:19Z

    METRON-309 Altered based on Stellar Unification changes

commit ca81339e0f326dc04110eafd74fd0879dfe1a029
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T13:23:06Z

    METRON-309 Need to set the kafka broker in the Profiler topology properties

commit 712a1083c21eb8f7c0d81511d432c054792ebe9b
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-12T15:17:47Z

    METRON-309 Updated examples to use latest Stellar binary functions

commit 01aa1f972324f97b231b7d9a26f122891a1b6d50
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T18:03:49Z

    METRON-309 Fixed the README examples and added each as an integration test.

commit d9e2c5e568292c4a08c6c6c314d3c2e8a9ea38be
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T20:08:26Z

    METRON-367 Enhance Profiler to Support Multiple Numeric Types

commit b9366ff97dbc8968ac44a2ba8dfe7bca43e7adc6
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-15T20:32:51Z

    METRON-368 Simplify Profile Configuration with Sensible Defaults

commit 486e8138efbc64342b0bcf405adae521c55d4e33
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T16:51:07Z

    METRON-309 Removed legacy classes from Stellar Unification that are no longer needed

commit c3ce806606c0903fd62819e4a29ea91adb1979be
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T16:34:26Z

    METRON-372 Refactored the Stellar functions for clarity

commit eaf5255143d7c09ee8735d3c7d4fa34a58c25831
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T19:49:13Z

    METRON-372 Added summary statistics functions to Stellar

commit 2e4844cf7e4a98c3594fac542bcf0f32d938b0b2
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-16T20:01:13Z

    METRON-372 Updated example to show use of new STATS_x functions

commit 8025994f5858a2e332f4a2ee325ab9cffd82b092
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-17T19:39:53Z

    METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions

commit 62a3a0f48d5c792c85486014cfa45e2c4620362c
Author: Nick Allen <ni...@nickallen.org>
Date:   2016-08-17T20:18:24Z

    METRON-377 Added another integration test

----


> Enable Profiles that Use Non-Single Pass Summary Functions
> ----------------------------------------------------------
>
>                 Key: METRON-377
>                 URL: https://issues.apache.org/jira/browse/METRON-377
>             Project: Metron
>          Issue Type: Improvement
>            Reporter: Nick Allen
>            Assignee: Nick Allen
>
> As of METRON-372, Profiles can be built using many statistical summaries that require only a single pass over the data.  This is less memory intensive and more scalable for high-volume loads.
> Unfortunately, not all functions can be calculated in a single pass.  In particular, skewness, kurtosis, and percentiles require the data to be held in memory for the calculation to occur.
> The platform needs to be enhanced so that a user can leverage skewness, kurtosis, and percentiles.


