Posted to issues@spark.apache.org by "Yanbo Liang (JIRA)" <ji...@apache.org> on 2016/09/28 08:06:20 UTC

[jira] [Created] (SPARK-17704) ChiSqSelector performance improvement.

Yanbo Liang created SPARK-17704:
-----------------------------------

             Summary: ChiSqSelector performance improvement.
                 Key: SPARK-17704
                 URL: https://issues.apache.org/jira/browse/SPARK-17704
             Project: Spark
          Issue Type: Improvement
          Components: ML, MLlib
            Reporter: Yanbo Liang


Several performance improvements for {{ChiSqSelector}}:
1. Keep {{selectedFeatures}} sorted in ascending order.
{{ChiSqSelectorModel.transform}} needs {{selectedFeatures}} to be sorted in order to make predictions. We should sort it once when training the model rather than every time a prediction is made, since users usually train a model once and then use it for prediction many times.
2. When training an {{fpr}}-type {{ChiSqSelectorModel}}, it is not necessary to sort the ChiSq test results by statistic.
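
The two points above can be illustrated with a small plain-Python sketch. This is not the Spark implementation: the function names are hypothetical, and it assumes that the {{fpr}} rule keeps every feature whose p-value falls below a threshold (called alpha here) and that sparse vectors store their indices in ascending order.

```python
from typing import List, Sequence, Tuple


def select_fpr(p_values: Sequence[float], alpha: float) -> List[int]:
    """Assumed fpr rule: keep each feature whose p-value is below alpha.

    The rule is a per-feature threshold, so no sort by statistic is needed,
    and enumerate() already yields the indices in ascending order.
    """
    return [i for i, p in enumerate(p_values) if p < alpha]


def select_top_k(statistics: Sequence[float], k: int) -> List[int]:
    """Keep the k features with the largest statistic.

    Ranking needs one sort by statistic; the final sorted() call puts the
    chosen indices in ascending order once, at training time.
    """
    ranked = sorted(range(len(statistics)),
                    key=lambda i: statistics[i], reverse=True)
    return sorted(ranked[:k])


def transform_sparse(indices: Sequence[int], values: Sequence[float],
                     selected: Sequence[int]) -> Tuple[List[int], List[float]]:
    """Project a sparse vector onto the selected features.

    Two-pointer merge: correct only if both `indices` and `selected` are
    ascending, which is why `selected` should be sorted ahead of time.
    """
    out_idx: List[int] = []
    out_val: List[float] = []
    i = j = 0
    while i < len(indices) and j < len(selected):
        if indices[i] == selected[j]:
            out_idx.append(j)        # position in the reduced feature space
            out_val.append(values[i])
            i += 1
            j += 1
        elif indices[i] < selected[j]:
            i += 1
        else:
            j += 1
    return out_idx, out_val
```

With the selected indices stored in ascending order in the model, each call to the transform is a single linear pass with no per-call sort.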



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
