Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2019/08/16 17:40:00 UTC

[jira] [Updated] (SPARK-28722) Change sequential label sorting in StringIndexer fit to parallel

     [ https://issues.apache.org/jira/browse/SPARK-28722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen updated SPARK-28722:
------------------------------
    Priority: Minor  (was: Major)

> Change sequential label sorting in StringIndexer fit to parallel
> ----------------------------------------------------------------
>
>                 Key: SPARK-28722
>                 URL: https://issues.apache.org/jira/browse/SPARK-28722
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>    Affects Versions: 3.0.0
>            Reporter: Liang-Chi Hsieh
>            Priority: Minor
>
> The fit method in StringIndexer sorts labels sequentially, one column at a time, when there are multiple input columns. As the number of input columns grows, the label-sorting time increases dramatically, which makes StringIndexer hard to use in practice with hundreds of input columns.
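
To illustrate the difference the description refers to, here is a minimal sketch in plain Python rather than Spark code. It is not the StringIndexer implementation or the change proposed in this issue. The function names (sort_labels, fit_sequential, fit_parallel), the sample data, and the use of a thread pool are all assumptions made for illustration. Labels are ordered by descending frequency, with ties broken alphabetically as a further assumption of the sketch.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def sort_labels(values):
    """Order one column's distinct labels by descending frequency.

    Ties are broken alphabetically (an assumption of this sketch).
    """
    counts = Counter(values)
    return sorted(counts, key=lambda label: (-counts[label], label))


def fit_sequential(columns):
    """Sort each column's labels one after another."""
    return {name: sort_labels(values) for name, values in columns.items()}


def fit_parallel(columns, max_workers=4):
    """Sort each column's labels as an independent task in a worker pool."""
    names = list(columns)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with names.
        results = list(pool.map(sort_labels, (columns[n] for n in names)))
    return dict(zip(names, results))


# Hypothetical input: column name -> raw string values.
columns = {
    "color": ["red", "blue", "red", "green", "red", "blue"],
    "size": ["M", "L", "M", "S"],
}

sequential = fit_sequential(columns)
parallel = fit_parallel(columns)
```

Both functions return the same label ordering. Only the scheduling differs: each column's sort is independent of the others, so the work can be spread across workers. In CPython a thread pool will not speed up CPU-bound sorting, so this shows the structure of the change rather than its performance.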



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
