Posted to dev@mahout.apache.org by "Ted Dunning (JIRA)" <ji...@apache.org> on 2010/07/22 02:56:51 UTC

[jira] Created: (MAHOUT-444) Need on-line distribution summary statistics ... mean, median, min, max q25, q75

Need on-line distribution summary statistics ... mean, median, min, max q25, q75
--------------------------------------------------------------------------------

                 Key: MAHOUT-444
                 URL: https://issues.apache.org/jira/browse/MAHOUT-444
             Project: Mahout
          Issue Type: Bug
            Reporter: Ted Dunning


For the on-line learning algorithms it is very helpful to have robust on-line estimates of various summary statistics.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (MAHOUT-444) Need on-line distribution summary statistics ... mean, median, min, max q25, q75

Posted by "Sean Owen (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated MAHOUT-444:
-----------------------------

           Status: Resolved  (was: Patch Available)
         Assignee: Ted Dunning
    Fix Version/s: 0.4
       Resolution: Fixed

> Need on-line distribution summary statistics ... mean, median, min, max q25, q75
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-444
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-444
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>            Assignee: Ted Dunning
>             Fix For: 0.4
>
>         Attachments: MAHOUT-444.patch
>
>
> For the on-line learning algorithms it is very helpful to have robust on-line estimates of various summary statistics.



[jira] Updated: (MAHOUT-444) Need on-line distribution summary statistics ... mean, median, min, max q25, q75

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-444:
-------------------------------

    Attachment: MAHOUT-444.patch

> Need on-line distribution summary statistics ... mean, median, min, max q25, q75
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-444
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-444
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-444.patch
>
>
> For the on-line learning algorithms it is very helpful to have robust on-line estimates of various summary statistics.



[jira] Commented: (MAHOUT-444) Need on-line distribution summary statistics ... mean, median, min, max q25, q75

Posted by "Hudson (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12891265#action_12891265 ] 

Hudson commented on MAHOUT-444:
-------------------------------

Integrated in Mahout-Quality #151 (See [http://hudson.zones.apache.org/hudson/job/Mahout-Quality/151/])
    MAHOUT-444 - fixed one test and disabled the other


> Need on-line distribution summary statistics ... mean, median, min, max q25, q75
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-444
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-444
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>         Attachments: MAHOUT-444.patch
>
>
> For the on-line learning algorithms it is very helpful to have robust on-line estimates of various summary statistics.



[jira] Updated: (MAHOUT-444) Need on-line distribution summary statistics ... mean, median, min, max q25, q75

Posted by "Ted Dunning (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Dunning updated MAHOUT-444:
-------------------------------

    Status: Patch Available  (was: Open)

Here is a patch with an object that estimates the five-number summary (min, q25, median, q75, max) along with the mean and standard deviation.  The associated tests indicate that the estimates are pretty much as accurate as keeping all of the samples and computing the empirical rank statistics directly.
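
For illustration only, here is a minimal Java sketch of one way an on-line summarizer of this kind could be written. It is not the code in MAHOUT-444.patch: the class name OnlineSummarySketch, the 100-sample warm-up, and the step size are all assumptions made for this example. It combines Welford's running update for the mean and standard deviation with a stochastic-approximation update for the quartiles, seeded from an exact warm-up buffer.

```java
import java.util.Arrays;

/** Hypothetical sketch of an on-line summarizer; not the class from the patch. */
final class OnlineSummarySketch {
    // Assumption: keep the first 100 samples exactly, then switch to estimates.
    private static final int WARMUP = 100;

    private final double[] buffer = new double[WARMUP];
    private final double[] q = new double[5]; // min, q25, median, q75, max
    private long n;
    private double mean;
    private double m2; // running sum of squared deviations from the mean

    void add(double x) {
        n++;
        // Welford's update for mean and sum of squared deviations.
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);

        if (n <= WARMUP) {
            buffer[(int) (n - 1)] = x;
            if (n == WARMUP) {
                // Seed the estimates from the exact order statistics.
                Arrays.sort(buffer);
                for (int i = 0; i < 5; i++) {
                    q[i] = buffer[i * (WARMUP - 1) / 4];
                }
            }
            return;
        }

        // Min and max can be tracked exactly.
        q[0] = Math.min(q[0], x);
        q[4] = Math.max(q[4], x);

        // Stochastic approximation for the inner quartiles: move up by p * step
        // when the sample is above the estimate, down by (1 - p) * step otherwise.
        // The step shrinks like 1/n and is scaled by the current interquartile
        // range (an assumed heuristic, not a tuned constant).
        double step = 2 * Math.abs(q[3] - q[1]) / n;
        for (int i = 1; i <= 3; i++) {
            double p = i / 4.0;
            q[i] += step * (x > q[i] ? p : p - 1);
        }
    }

    /** Returns quartile i, where 0 = min, 1 = q25, 2 = median, 3 = q75, 4 = max. */
    double quartile(int i) {
        if (n == 0) {
            throw new IllegalStateException("no samples yet");
        }
        if (n >= WARMUP) {
            return q[i];
        }
        // During warm-up, answer exactly from the samples seen so far.
        double[] sorted = Arrays.copyOf(buffer, (int) n);
        Arrays.sort(sorted);
        return sorted[(int) (i * (n - 1) / 4)];
    }

    double mean() {
        return mean;
    }

    /** Sample standard deviation; 0 when fewer than two samples have been seen. */
    double sd() {
        return n > 1 ? Math.sqrt(m2 / (n - 1)) : 0;
    }
}
```

A caller would invoke add(x) once per observation and read quartile(2) for the current median estimate. Memory use stays constant after the warm-up, at the cost of the inner quartiles being approximate rather than exact.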

> Need on-line distribution summary statistics ... mean, median, min, max q25, q75
> --------------------------------------------------------------------------------
>
>                 Key: MAHOUT-444
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-444
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Ted Dunning
>
> For the on-line learning algorithms it is very helpful to have robust on-line estimates of various summary statistics.
