Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:33:32 UTC

[jira] [Resolved] (SPARK-7809) MultivariateOnlineSummarizer should allow users to configure what to compute

     [ https://issues.apache.org/jira/browse/SPARK-7809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-7809.
---------------------------------
    Resolution: Incomplete

> MultivariateOnlineSummarizer should allow users to configure what to compute
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-7809
>                 URL: https://issues.apache.org/jira/browse/SPARK-7809
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Priority: Major
>              Labels: bulk-closed
>
> Currently, MultivariateOnlineSummarizer computes every summary statistic it can provide, which is fine and convenient for a small number of features. If the feature dimension is large, this becomes expensive, so we should add setters that let users configure what to compute.
> {code}
> val summarizer = new MultivariateOnlineSummarizer()
>   .withMean(false)
>   .withMax(false)
> {code}
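
For illustration only, here is a minimal sketch of how such setters might gate the per-sample work. It does not use Spark at all: the class name ConfigurableSummarizer, its dense Array[Double] input, and the choice of only two statistics are assumptions made for this example, and withMean/withMax simply mirror the setters proposed in the description above rather than any existing MLlib API.

```scala
// Hypothetical sketch: NOT the Spark MultivariateOnlineSummarizer API.
// It only shows how per-statistic flags can skip both allocation and updates.
final class ConfigurableSummarizer {
  private var computeMean = true
  private var computeMax = true

  private var n = 0L
  private var dim = -1
  private var sum: Array[Double] = null       // allocated only if mean is enabled
  private var maxValues: Array[Double] = null // allocated only if max is enabled

  /** Enable or disable the mean; must be called before any sample is added. */
  def withMean(flag: Boolean): this.type = {
    require(n == 0L, "configure the summarizer before adding samples")
    computeMean = flag
    this
  }

  /** Enable or disable the max; must be called before any sample is added. */
  def withMax(flag: Boolean): this.type = {
    require(n == 0L, "configure the summarizer before adding samples")
    computeMax = flag
    this
  }

  /** Fold one dense sample into the enabled statistics. */
  def add(sample: Array[Double]): this.type = {
    if (n == 0L) {
      dim = sample.length
      if (computeMean) sum = new Array[Double](dim)
      if (computeMax) maxValues = Array.fill(dim)(Double.NegativeInfinity)
    }
    require(sample.length == dim, s"expected dimension $dim, got ${sample.length}")
    var i = 0
    while (i < dim) {
      if (computeMean) sum(i) += sample(i)
      if (computeMax && sample(i) > maxValues(i)) maxValues(i) = sample(i)
      i += 1
    }
    n += 1
    this
  }

  def count: Long = n

  /** None if the mean was disabled or no samples were added. */
  def mean: Option[Array[Double]] =
    if (computeMean && n > 0) Some(sum.map(_ / n)) else None

  /** None if the max was disabled or no samples were added. */
  def max: Option[Array[Double]] =
    if (computeMax && n > 0) Some(maxValues.clone()) else None
}
```

In this sketch a disabled statistic costs neither a buffer of size `dim` nor any per-element update, which is the saving the description is after for high-dimensional features. Returning Option from the accessors is one possible design choice; throwing on access to a disabled statistic would be another.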



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
