Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/02/28 23:07:00 UTC

[jira] [Updated] (FLINK-4613) Extend ALS to handle implicit feedback datasets

     [ https://issues.apache.org/jira/browse/FLINK-4613?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated FLINK-4613:
----------------------------------
    Labels: pull-request-available  (was: )

> Extend ALS to handle implicit feedback datasets
> -----------------------------------------------
>
>                 Key: FLINK-4613
>                 URL: https://issues.apache.org/jira/browse/FLINK-4613
>             Project: Flink
>          Issue Type: New Feature
>          Components: Library / Machine Learning
>            Reporter: Gábor Hermann
>            Assignee: Gábor Hermann
>            Priority: Major
>              Labels: pull-request-available
>
> The Alternating Least Squares implementation should be extended to handle _implicit feedback_ datasets. These datasets do not contain explicit ratings by users; instead, they are built by collecting user behavior (e.g. user listened to artist X for Y minutes), and they require a slightly different optimization objective. For details, see [Hu et al|http://dx.doi.org/10.1109/ICDM.2008.22].
> We do not need to modify much in the original ALS algorithm. See the [Spark ALS implementation|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/recommendation/ALS.scala], which could serve as a basis for this extension. Only the factor update step changes, and most of the changes are in the local parts of the algorithm (i.e. UDFs). In fact, the only non-local modification is precomputing the matrix product Y^T * Y and broadcasting it to all the nodes, which we can do with broadcast DataSets.
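
For illustration, the modified factor update described above can be sketched in plain Python. This is not Flink or Spark code, only a minimal single-machine sketch that assumes the formulation of Hu et al.: confidence c_ui = 1 + alpha * r_ui, preference p_ui = 1 for observed items, and plain lambda * I regularization. The names gram, solve, update_user, lam and alpha are hypothetical. The point is that Y^T * Y is computed once and shared by every user update, which is the role the broadcast DataSet would play in a distributed job.

```python
def gram(Y):
    """Return Y^T * Y for item factors Y, given as a list of rows (n_items x k)."""
    k = len(Y[0])
    return [[sum(y[a] * y[b] for y in Y) for b in range(k)] for a in range(k)]


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def update_user(observed, Y, YtY, lam, alpha):
    """Implicit-feedback update of one user's factor vector.

    observed maps item index -> raw feedback r_ui (> 0).
    Solves (YtY + sum_i (c_ui - 1) * y_i y_i^T + lam * I) x = sum_i c_ui * y_i,
    where both sums run over the observed items only.
    """
    k = len(YtY)
    # Start from the shared (precomputed) Y^T * Y plus the regularization term.
    A = [[YtY[a][b] + (lam if a == b else 0.0) for b in range(k)] for a in range(k)]
    rhs = [0.0] * k
    for i, r in observed.items():
        c = 1.0 + alpha * r  # assumed confidence weighting
        y = Y[i]
        for a in range(k):
            rhs[a] += c * y[a]
            for b in range(k):
                A[a][b] += (c - 1.0) * y[a] * y[b]
    return solve(A, rhs)


# Tiny usage example with made-up numbers.
Y = [[1.0, 0.0], [0.5, 0.5], [0.0, 2.0]]
YtY = gram(Y)  # computed once, reused for every user
x_u = update_user({0: 3.0, 2: 1.0}, Y, YtY, lam=0.1, alpha=40.0)
```

Because the per-user correction only touches observed items, each update costs work proportional to that user's number of interactions rather than to the total number of items. The item-side update is symmetric, using X^T * X in place of Y^T * Y.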



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)