Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2019/10/08 05:44:13 UTC

[jira] [Resolved] (SPARK-21389) ALS recommendForAll optimization uses Native BLAS

     [ https://issues.apache.org/jira/browse/SPARK-21389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-21389.
----------------------------------
    Resolution: Incomplete

> ALS recommendForAll optimization uses Native BLAS
> -------------------------------------------------
>
>                 Key: SPARK-21389
>                 URL: https://issues.apache.org/jira/browse/SPARK-21389
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 2.3.0
>            Reporter: Peng Meng
>            Priority: Major
>              Labels: bulk-closed
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> In Spark 2.2, we optimized ALS recommendForAll with a hand-written matrix multiplication that gets the topK items for each matrix. That method effectively reduces the GC problem. However, matrix multiplication with a native BLAS GEMM, such as Intel MKL or OpenBLAS, is about 10X faster than the hand-written method.
> I have rewritten the code of recommendForAll with GEMM and got about a 50% improvement compared with the recommendForAll method on master.
> The key points of this optimization:
> 1) Use GEMM to replace the hand-written matrix multiplication.
> 2) Use a matrix to keep the temporary result, which largely reduces GC and computing time. The master method creates many small objects, so using GEMM directly on it cannot get good performance.
> 3) Use sort and merge to get the topK items, which doesn't need to call the priority queue two times.
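
The three points above can be sketched as follows. This is a minimal illustration in plain Python, not the reporter's patch: the actual change is in Spark's Scala code and calls a native BLAS GEMM, whereas here a simple loop stands in for the GEMM call. All function names (block_scores, top_k_for_block, merge_top_k, recommend_for_all) are hypothetical and chosen only for this example.

```python
from heapq import merge
from itertools import islice


def block_scores(user_block, item_block, rank):
    """Fill one dense (users x items) score matrix, stored row-major in a
    single flat list. The loops stand in for one native GEMM call; the flat
    buffer replaces many small per-pair objects (points 1 and 2)."""
    n_items = len(item_block)
    scores = [0.0] * (len(user_block) * n_items)
    for u, user_vec in enumerate(user_block):
        for i, item_vec in enumerate(item_block):
            scores[u * n_items + i] = sum(
                user_vec[f] * item_vec[f] for f in range(rank)
            )
    return scores, n_items


def top_k_for_block(scores, n_items, n_users, item_ids, k):
    """Sort each row of the block's score matrix and keep the k best
    (score, item_id) pairs in descending order (point 3, 'sort')."""
    result = []
    for u in range(n_users):
        row = scores[u * n_items:(u + 1) * n_items]
        ranked = sorted(zip(row, item_ids), reverse=True)
        result.append(ranked[:k])
    return result


def merge_top_k(left, right, k):
    """Merge two descending top-k lists into one (point 3, 'merge')."""
    return list(islice(merge(left, right, reverse=True), k))


def recommend_for_all(user_factors, item_factors, k, block_size):
    """Top-k items per user, processing the items one block at a time."""
    rank = len(user_factors[0])
    best = [[] for _ in user_factors]
    for start in range(0, len(item_factors), block_size):
        item_block = item_factors[start:start + block_size]
        item_ids = range(start, start + len(item_block))
        scores, n_items = block_scores(user_factors, item_block, rank)
        block_best = top_k_for_block(
            scores, n_items, len(user_factors), item_ids, k
        )
        best = [merge_top_k(a, b, k) for a, b in zip(best, block_best)]
    return best
```

Calling recommend_for_all(user_factors, item_factors, 20, 1000) on two lists of factor vectors returns, for each user, a descending list of (score, item index) pairs. The block size only controls how many items are scored per matrix; it does not change the result.
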
> Test Result:
> 479818 users, 13727 products, rank = 10, topK = 20.
> 3 workers, each with 35 cores. Native BLAS is Intel MKL.
> Block Size:      1000    2000    4000    8000
> Master Method:   40s     39.4s   39.5s   39.1s
> This Method:     26.5s   25.9s   26s     27.1s
> Performance Improvement: (OldTime - NewTime)/NewTime = about 50%
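
The improvement formula above can be checked against the reported timings with a few lines of Python. This is only a worked calculation on the numbers quoted in the description; the helper names improvement and time_reduction are made up for this example. With the formula as given, (OldTime - NewTime)/NewTime, the figure ranges from roughly 44% to 52% across the four block sizes. Dividing by OldTime instead gives the reduction in wall-clock time, which is roughly a third.

```python
# Timings in seconds, copied from the test result in the description.
block_sizes = [1000, 2000, 4000, 8000]
master_seconds = [40.0, 39.4, 39.5, 39.1]
gemm_seconds = [26.5, 25.9, 26.0, 27.1]


def improvement(old, new):
    """Improvement as defined in the description: (old - new) / new."""
    return (old - new) / new


def time_reduction(old, new):
    """Fraction of the old run time that was saved: (old - new) / old."""
    return (old - new) / old


for size, old, new in zip(block_sizes, master_seconds, gemm_seconds):
    print(
        f"block size {size}: "
        f"improvement {improvement(old, new):.1%}, "
        f"time reduction {time_reduction(old, new):.1%}"
    )
```
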



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
