
Posted to issues@spark.apache.org by "Brian (JIRA)" <ji...@apache.org> on 2017/09/10 18:49:00 UTC

[jira] [Created] (SPARK-21968) Improved KernelDensity support

Brian created SPARK-21968:
-----------------------------

             Summary: Improved KernelDensity support
                 Key: SPARK-21968
                 URL: https://issues.apache.org/jira/browse/SPARK-21968
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 2.2.0
            Reporter: Brian


Related to SPARK-7753.  The KernelDensity API still does not provide a way to specify a kernel, as described in SPARK-7753, and it requires clients to calculate the optimal bandwidth themselves.

Specifying a kernel could be something like:
def setKernel(kernel: Function1[Double, Double]): KernelDensity.this.type

A set of predefined kernels could also be provided, so users can pass one in here without implementing each kernel themselves. Here are some example kernels:
https://en.wikipedia.org/wiki/Kernel_(statistics)#Kernel_functions_in_common_use
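
To make the idea concrete, here is a minimal sketch of what such predefined kernels might look like. The `Kernels` object, its member names, and the `estimate` helper are hypothetical and not part of Spark; the sketch works on a local Seq rather than an RDD and only illustrates the shape of a kernel as a Double => Double function.

```scala
// Hypothetical helper, NOT part of Spark's API: a few standard kernels,
// each a function of the scaled distance u = (x - xi) / h.
object Kernels {
  val gaussian: Double => Double =
    u => math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.Pi)

  val epanechnikov: Double => Double =
    u => if (math.abs(u) <= 1.0) 0.75 * (1.0 - u * u) else 0.0

  val uniform: Double => Double =
    u => if (math.abs(u) <= 1.0) 0.5 else 0.0

  val triangular: Double => Double =
    u => if (math.abs(u) <= 1.0) 1.0 - math.abs(u) else 0.0

  // Local (non-distributed) density estimate at x:
  // (1 / (n * h)) * sum over the sample of kernel((x - xi) / h).
  def estimate(sample: Seq[Double], h: Double, kernel: Double => Double)(x: Double): Double = {
    require(sample.nonEmpty, "sample must not be empty")
    require(h > 0.0, "bandwidth must be positive")
    sample.map(xi => kernel((x - xi) / h)).sum / (sample.size * h)
  }
}
```

A caller would then write something like Kernels.estimate(data, 0.5, Kernels.epanechnikov)(1.0), and a setKernel method along the lines proposed above could accept any of these values.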

Functions could also be provided to compute better bandwidth settings so that the user does not need to calculate them, e.g. the "rule of thumb" and/or "solve the equation" bandwidth described here:
https://en.wikipedia.org/wiki/Kernel_density_estimation#Bandwidth_selection
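
As a rough illustration of the "rule of thumb" option, the sketch below computes the normal reference bandwidth h = 1.06 * sigma * n^(-1/5) on a local sample. The `Bandwidth` object and `ruleOfThumb` name are hypothetical, not existing Spark functions, and the "solve the equation" method is not covered here. This rule assumes a Gaussian kernel and roughly normal data.

```scala
// Hypothetical helper, NOT part of Spark's API.
object Bandwidth {
  // Normal reference rule of thumb: h = 1.06 * sigma * n^(-1/5),
  // where sigma is the sample standard deviation (n - 1 denominator).
  def ruleOfThumb(sample: Seq[Double]): Double = {
    require(sample.size >= 2, "need at least two observations")
    val n = sample.size.toDouble
    val mean = sample.sum / n
    val variance = sample.map(x => (x - mean) * (x - mean)).sum / (n - 1.0)
    1.06 * math.sqrt(variance) * math.pow(n, -0.2)
  }
}
```

The resulting value could presumably be passed to the existing setBandwidth method on KernelDensity; a distributed version would compute the mean and variance on the RDD instead of a local Seq.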




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
