Posted to issues@metron.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2016/08/17 20:21:20 UTC
[jira] [Commented] (METRON-377) Enable Profiles that Use Non-Single Pass Summary Functions
[ https://issues.apache.org/jira/browse/METRON-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15425288#comment-15425288 ]
ASF GitHub Bot commented on METRON-377:
---------------------------------------
GitHub user nickwallen opened a pull request:
https://github.com/apache/incubator-metron/pull/215
METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions
Note: This change depends on #208, #212, #213, and #214. The diff will be easier to grok once those PRs are merged.
### [METRON-377](https://issues.apache.org/jira/browse/METRON-377)
As of METRON-372 and #214, Profiles can be built using many statistical summaries that require only a single pass over the data. Single-pass summaries use less memory and scale better under high-volume loads.
Unfortunately, not all functions can be calculated in a single pass. In particular, skewness, kurtosis, and percentiles require the data to be held in memory for the calculation to occur. This change enhances the platform so that a user can leverage skewness, kurtosis, and percentiles.
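To illustrate the distinction, here is a minimal Python sketch, not the Metron implementation. The names `RunningStats` and `median` are hypothetical. The first keeps a running mean and variance with Welford's update and never stores the inputs; the second computes a percentile (the median) and needs the values themselves so they can be sorted.

```python
class RunningStats:
    """Single-pass count, mean, and sample variance (Welford's update).

    Hypothetical illustration; memory use is constant regardless of input size.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    def variance(self):
        # Sample variance; undefined for fewer than two values.
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


def median(values):
    """A percentile needs the stored values: sort them, then pick the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


data = [2, 4, 4, 4, 5, 5, 7, 9]
stats = RunningStats()
for x in data:
    stats.add(x)  # each value is seen once and then discarded
```

After the loop, `stats.mean` and `stats.variance()` are available without retaining `data`, whereas `median(data)` requires the full list.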
### Changes
The `STATS_INIT` function was enhanced to accept a `window_size`. This defines the number of input data elements that are maintained in memory.
If the `window_size` is greater than 0, a rolling window of the most recent `window_size` elements is maintained in memory. The skewness, kurtosis, and percentiles are calculated over this rolling window. The `window_size` must be greater than 0; otherwise these values cannot be calculated.
If a user supplies a `window_size` equal to 0, the more efficient implementation that does not maintain a rolling window in memory is used. In that case, the skewness, kurtosis, and percentiles cannot be calculated.
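The `window_size` behavior described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in this pull request: the class name `WindowedStats`, its method names, the nearest-rank percentile, and the population (biased) moment estimators are all choices made for the example and may differ from what the Stellar functions actually compute.

```python
import math
from collections import deque


class WindowedStats:
    """Hypothetical sketch of window_size semantics (not the Metron code)."""

    def __init__(self, window_size=0):
        if window_size < 0:
            raise ValueError("window_size must be >= 0")
        self.count = 0
        self.total = 0.0
        # window_size == 0: keep only single-pass accumulators, no window.
        self._window = deque(maxlen=window_size) if window_size > 0 else None

    def add(self, value):
        self.count += 1
        self.total += value
        if self._window is not None:
            self._window.append(value)  # oldest value drops out when full

    def mean(self):
        # Single-pass statistic over ALL values seen, not just the window.
        return self.total / self.count if self.count else float("nan")

    def _values(self):
        if not self._window:
            raise ValueError("requires window_size > 0 and at least one value")
        return list(self._window)

    def percentile(self, p):
        # Nearest-rank percentile over the rolling window (assumed method).
        ordered = sorted(self._values())
        rank = max(1, math.ceil(p / 100.0 * len(ordered)))
        return ordered[rank - 1]

    def _central_moment(self, k):
        values = self._values()
        mu = sum(values) / len(values)
        return sum((v - mu) ** k for v in values) / len(values)

    def skewness(self):
        m2 = self._central_moment(2)
        if m2 == 0:
            return float("nan")
        return self._central_moment(3) / m2 ** 1.5

    def kurtosis(self):
        # Excess kurtosis: 0 for a normal distribution.
        m2 = self._central_moment(2)
        if m2 == 0:
            return float("nan")
        return self._central_moment(4) / m2 ** 2 - 3.0
```

With `WindowedStats(window_size=3)`, adding 1, 2, 3, 4, 100 leaves only 3, 4, 100 in the window, so `percentile(50)` is computed over those three values while `mean()` still reflects all five. With `WindowedStats(window_size=0)`, `mean()` works but `percentile`, `skewness`, and `kurtosis` raise an error.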
The following functions were also added.
* STATS_KURTOSIS
* STATS_SKEWNESS
* STATS_PERCENTILES
An additional integration test was added to validate this end to end.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/nickwallen/incubator-metron METRON-377
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/incubator-metron/pull/215.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #215
----
commit 11ef29bb73c8363b1c905f597be496929b77888f
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-07-29T20:53:49Z
METRON-309 Create a normalcy profiler
commit 28560b36273271c438c3309df610a4c4306ffdde
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-11T20:46:00Z
METRON-309 Corrected a typo in the 'Getting Started' instructions
commit d55d32efeea38f52bd9d1095aa6f8817bc15f0a5
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-12T13:16:19Z
METRON-309 Altered based on Stellar Unification changes
commit ca81339e0f326dc04110eafd74fd0879dfe1a029
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-12T13:23:06Z
METRON-309 Need to set the kafka broker in the Profiler topology properties
commit 712a1083c21eb8f7c0d81511d432c054792ebe9b
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-12T15:17:47Z
METRON-309 Updated examples to use latest Stellar binary functions
commit 01aa1f972324f97b231b7d9a26f122891a1b6d50
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-15T18:03:49Z
METRON-309 Fixed the README examples and added each as an integration test.
commit d9e2c5e568292c4a08c6c6c314d3c2e8a9ea38be
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-15T20:08:26Z
METRON-367 Enhance Profiler to Support Multiple Numeric Types
commit b9366ff97dbc8968ac44a2ba8dfe7bca43e7adc6
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-15T20:32:51Z
METRON-368 Simplify Profile Configuration with Sensible Defaults
commit 486e8138efbc64342b0bcf405adae521c55d4e33
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-16T16:51:07Z
METRON-309 Removed legacy classes from Stellar Unification that are no longer needed
commit c3ce806606c0903fd62819e4a29ea91adb1979be
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-16T16:34:26Z
METRON-372 Refactored the Stellar functions for clarity
commit eaf5255143d7c09ee8735d3c7d4fa34a58c25831
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-16T19:49:13Z
METRON-372 Added summary statistics functions to Stellar
commit 2e4844cf7e4a98c3594fac542bcf0f32d938b0b2
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-16T20:01:13Z
METRON-372 Updated example to show use of new STATS_x functions
commit 8025994f5858a2e332f4a2ee325ab9cffd82b092
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-17T19:39:53Z
METRON-377 Enable Profiles that Use Non-Single Pass Summary Functions
commit 62a3a0f48d5c792c85486014cfa45e2c4620362c
Author: Nick Allen <ni...@nickallen.org>
Date: 2016-08-17T20:18:24Z
METRON-377 Added another integration test
----
> Enable Profiles that Use Non-Single Pass Summary Functions
> ----------------------------------------------------------
>
> Key: METRON-377
> URL: https://issues.apache.org/jira/browse/METRON-377
> Project: Metron
> Issue Type: Improvement
> Reporter: Nick Allen
> Assignee: Nick Allen
>
> As of METRON-372, Profiles can be built using many statistical summaries that require only a single pass over the data. This is less memory intensive and more scalable for high-volume loads.
> Unfortunately, not all functions can be calculated in a single pass. In particular, skewness, kurtosis, and percentiles require the data to be held in memory for the calculation to occur.
> The platform needs to be enhanced so that a user can leverage skewness, kurtosis, and percentiles.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)