Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2015/07/17 18:10:04 UTC

[jira] [Updated] (SPARK-6487) Add sequential pattern mining algorithm PrefixSpan to Spark MLlib

     [ https://issues.apache.org/jira/browse/SPARK-6487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng updated SPARK-6487:
---------------------------------
    Summary: Add sequential pattern mining algorithm PrefixSpan to Spark MLlib  (was: Add sequential pattern mining algorithm to Spark MLlib)

> Add sequential pattern mining algorithm PrefixSpan to Spark MLlib
> -----------------------------------------------------------------
>
>                 Key: SPARK-6487
>                 URL: https://issues.apache.org/jira/browse/SPARK-6487
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Zhang JiaJin
>            Assignee: Zhang JiaJin
>            Priority: Critical
>             Fix For: 1.5.0
>
>
> [~mengxr] [~zhangyouhua]
> Sequential pattern mining is an important branch of pattern mining. In our past work, we used sequence mining (mainly the PrefixSpan algorithm) to find telecommunication signaling sequence patterns and achieved good results. However, once the data grows too large, the running time becomes too long and can no longer meet the service requirements. We are ready to implement the PrefixSpan algorithm in Spark and apply it to our subsequent work.
> Related papers:
> PrefixSpan:
> Pei, Jian, et al. "Mining sequential patterns by pattern-growth: The PrefixSpan approach." IEEE Transactions on Knowledge and Data Engineering 16.11 (2004): 1424-1440.
> Parallel algorithm:
> Cong, Shengnan, Jiawei Han, and David Padua. "Parallel mining of closed sequential patterns." Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, 2005.
> Distributed algorithm:
> Wei, Yong-qing, Dong Liu, and Lin-shan Duan. "Distributed PrefixSpan algorithm based on MapReduce." 2012 International Symposium on Information Technology in Medicine and Education (ITME). Vol. 2. IEEE, 2012.
> Background on pattern mining and sequential pattern mining:
> Han, Jiawei, et al. "Frequent pattern mining: current status and future directions." Data Mining and Knowledge Discovery 15.1 (2007): 55-86.


