Posted to issues@systemml.apache.org by "Janardhan (JIRA)" <ji...@apache.org> on 2018/03/04 03:58:00 UTC

[jira] [Issue Comment Deleted] (SYSTEMML-2083) Language and runtime for parameter servers

     [ https://issues.apache.org/jira/browse/SYSTEMML-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Janardhan updated SYSTEMML-2083:
--------------------------------
    Comment: was deleted

(was: Hi [~Guobao], Matthias is correct: we are not building a parameter server. We are extending our existing infrastructure; the parameter server example I posted above only illustrates a parameter server in practical use.)

> Language and runtime for parameter servers
> ------------------------------------------
>
>                 Key: SYSTEMML-2083
>                 URL: https://issues.apache.org/jira/browse/SYSTEMML-2083
>             Project: SystemML
>          Issue Type: Epic
>            Reporter: Matthias Boehm
>            Priority: Major
>              Labels: gsoc2018
>         Attachments: image-2018-02-14-12-18-48-932.png, image-2018-02-14-12-21-00-932.png, image-2018-02-14-12-31-37-563.png
>
>
> SystemML already provides a rich set of execution strategies ranging from local operations to large-scale computation on MapReduce or Spark. In this context, we support both data-parallel computation (multi-threaded or distributed operations) and task-parallel computation (multi-threaded or distributed parfor loops). This epic aims to complement the existing execution strategies with language and runtime primitives for parameter servers, i.e., model-parallel execution. We use the terminology of model-parallel execution with distributed data and distributed model to differentiate these strategies from the existing data-parallel operations. Target applications are distributed deep learning and mini-batch algorithms in general. These new abstractions will help make SystemML a unified framework for small- and large-scale machine learning that supports all three major execution strategies.
>  
> A major challenge is the integration of stateful parameter servers and their common push/pull primitives into an otherwise functional (and thus, stateless) language. We will approach this challenge via a new builtin function {{paramserv}} which internally maintains state but at the same time fits into the runtime framework of stateless operations.
> Furthermore, we are interested in providing (1) different runtime backends (local and distributed), (2) different parameter server modes (synchronous, asynchronous, Hogwild!, stale-synchronous), (3) different update frequencies (batch, multi-batch, epoch), as well as (4) different architectures for distributed data (1 parameter server, k workers) and distributed model (k1 parameter servers, k2 workers).
>  
> *Note for GSOC students:* This is a large project that will be broken down into subprojects, so everybody will get their share of the pie.
> *Prerequisites:* Java; machine learning experience is a plus but not required.
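
The description above asks for a builtin that keeps parameter-server state internally while still behaving like a stateless operation. A minimal sketch of that idea in Python follows, assuming a single local server, k workers (one per data partition), synchronous mode, and one update per epoch. The names ParameterServer, paramserv_sketch, and mse_gradient are hypothetical and chosen for illustration only; they are not the signature of the {{paramserv}} builtin, which the issue does not specify.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Partition = List[Tuple[float, float]]  # (x, y) pairs; hypothetical data layout


class ParameterServer:
    """Holds the model and exposes the two common primitives, push and pull."""

    def __init__(self, model: Sequence[float], lr: float) -> None:
        self._model = list(model)  # private copy: the caller's model is untouched
        self._lr = lr

    def pull(self) -> Vector:
        """Return a copy of the current model for a worker."""
        return list(self._model)

    def push(self, gradient: Sequence[float]) -> None:
        """Apply one gradient-descent step to the stored model."""
        self._model = [w - self._lr * g for w, g in zip(self._model, gradient)]


def mse_gradient(model: Vector, part: Partition) -> Vector:
    """Gradient of the mean squared error of y ~ w * x on one partition."""
    w = model[0]
    return [sum(2.0 * (w * x - y) * x for x, y in part) / len(part)]


def paramserv_sketch(
    model: Sequence[float],
    partitions: Sequence[Partition],
    gradient_fn: Callable[[Vector, Partition], Vector],
    lr: float = 0.1,
    epochs: int = 1,
) -> Vector:
    """Stateless from the outside: takes a model and returns a new model.

    The server state lives only for the duration of this call.
    """
    server = ParameterServer(model, lr)
    for _ in range(epochs):
        # Synchronous mode: all workers pull the same model version ...
        current = server.pull()
        grads = [gradient_fn(current, part) for part in partitions]
        # ... and the averaged gradient is pushed once per round.
        averaged = [sum(column) / len(grads) for column in zip(*grads)]
        server.push(averaged)
    return server.pull()


if __name__ == "__main__":
    data = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0)]]  # two workers, y = 2x
    print(paramserv_sketch([0.0], data, mse_gradient, lr=0.05, epochs=50))
```

In this sketch the workers run sequentially; an asynchronous or stale-synchronous mode would instead let each worker push its gradient without waiting for the others, which is exactly the kind of variation items (2) and (3) in the description refer to.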



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)