Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2015/07/11 06:14:04 UTC

[jira] [Created] (SPARK-8997) Improve LocalPrefixSpan performance

Xiangrui Meng created SPARK-8997:
------------------------------------

             Summary: Improve LocalPrefixSpan performance
                 Key: SPARK-8997
                 URL: https://issues.apache.org/jira/browse/SPARK-8997
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.5.0
            Reporter: Xiangrui Meng
            Assignee: Feynman Liang


We can improve the performance of LocalPrefixSpan as follows:

1. `run` should output an Iterator instead of an Array.
2. Local counting shouldn't use groupBy, which creates too many arrays. We can use PrimitiveKeyOpenHashMap instead.
3. We can use a List for the prefix to avoid materializing frequent sequences.
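
A minimal sketch of the three points above, not the actual LocalPrefixSpan code. Assumptions: sequences are plain Array[Int] with one item per element, the object and method names (LocalPrefixSpanSketch, countItems, run) are hypothetical, and scala.collection.mutable.HashMap stands in for PrimitiveKeyOpenHashMap, which is private to Spark.

```scala
import scala.collection.mutable

// Hypothetical sketch; names and signatures are illustrative only.
object LocalPrefixSpanSketch {

  // Point 2: count, per item, how many sequences contain it, without groupBy.
  // mutable.HashMap is a stand-in for Spark's PrimitiveKeyOpenHashMap.
  def countItems(database: Iterable[Array[Int]]): mutable.Map[Int, Long] = {
    val counts = mutable.HashMap.empty[Int, Long]
    database.foreach { seq =>
      val seen = mutable.HashSet.empty[Int]
      seq.foreach { item =>
        // Count each item at most once per sequence.
        if (seen.add(item)) {
          counts.update(item, counts.getOrElse(item, 0L) + 1L)
        }
      }
    }
    counts
  }

  // Point 1: return a lazy Iterator instead of building an Array.
  // Point 3: keep the prefix as a List in reverse order, so that
  // `item :: prefix` shares the tail instead of copying the prefix.
  def run(
      minCount: Long,
      prefix: List[Int],
      database: Iterable[Array[Int]]): Iterator[(List[Int], Long)] = {
    if (database.isEmpty) {
      Iterator.empty
    } else {
      val frequent = countItems(database).filter(_._2 >= minCount)
      frequent.iterator.flatMap { case (item, count) =>
        val newPrefix = item :: prefix
        // Project each sequence onto the suffix after the first occurrence.
        val projected = database
          .filter(_.contains(item))
          .map(s => s.drop(s.indexOf(item) + 1))
          .filter(_.nonEmpty)
        Iterator.single((newPrefix, count)) ++ run(minCount, newPrefix, projected)
      }
    }
  }
}
```

Because the prefix is stored in reverse, a caller that needs a pattern in order has to call `reverse` on it, for example `LocalPrefixSpanSketch.run(2L, Nil, db).map { case (p, c) => (p.reverse, c) }`.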



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
