Posted to issues@flink.apache.org by "Zhipeng Zhang (Jira)" <ji...@apache.org> on 2022/05/28 12:42:00 UTC
[jira] [Created] (FLINK-27826) Support machine learning training for high dimensional models
Zhipeng Zhang created FLINK-27826:
-------------------------------------
Summary: Support machine learning training for high dimensional models
Key: FLINK-27826
URL: https://issues.apache.org/jira/browse/FLINK-27826
Project: Flink
Issue Type: New Feature
Components: Library / Machine Learning
Reporter: Zhipeng Zhang
Assignee: Zhipeng Zhang
FlinkML has limited support for training high dimensional machine learning models, although such training is often useful, especially in industrial cases. When the model parameters cannot be held in the memory of a single machine, FlinkML currently crashes.
So it is useful to support high dimensional model training in FlinkML. To achieve this, we probably need to do the following things:
# Survey how existing machine learning systems train large machine learning models (e.g. data parallel, model parallel)
# Define and implement the infrastructure for supporting large model training in FlinkML
# Implement a logistic regression model that can train models with more than ten billion parameters
# Benchmark the implementation and further improve it
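To make the model parallel idea in the list above concrete, here is a minimal single-process sketch in Python of logistic regression whose weights are partitioned by feature index. It is not FlinkML code and does not reflect any existing FlinkML API: ShardedModel, pull, push, predict and train are hypothetical names, plain dictionaries stand in for workers on separate machines, and the data set is made up for illustration. It assumes sparse training samples, so each step only touches the few weights a sample references.

```python
import math
from typing import Dict, Iterable, List, Tuple

SparseVector = Dict[int, float]  # feature index -> feature value


class ShardedModel:
    """Hypothetical model store: weights are partitioned by feature index.

    Each dict stands in for a worker holding one slice of the model, so no
    single store ever needs to hold the full parameter vector.
    """

    def __init__(self, num_shards: int) -> None:
        self.shards: List[Dict[int, float]] = [{} for _ in range(num_shards)]

    def _shard(self, index: int) -> Dict[int, float]:
        return self.shards[index % len(self.shards)]

    def pull(self, indices: Iterable[int]) -> Dict[int, float]:
        # Fetch only the weights a sample touches; unseen weights are 0.
        return {i: self._shard(i).get(i, 0.0) for i in indices}

    def push(self, updates: Dict[int, float]) -> None:
        # Apply additive updates to the shard that owns each index.
        for i, delta in updates.items():
            shard = self._shard(i)
            shard[i] = shard.get(i, 0.0) + delta


def predict(model: ShardedModel, x: SparseVector) -> float:
    weights = model.pull(x.keys())
    z = sum(weights[i] * v for i, v in x.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(model: ShardedModel,
          data: List[Tuple[SparseVector, float]],
          learning_rate: float = 0.5,
          epochs: int = 50) -> None:
    # Plain stochastic gradient descent on the logistic loss.
    for _ in range(epochs):
        for x, y in data:
            error = predict(model, x) - y
            model.push({i: -learning_rate * error * v for i, v in x.items()})


# Made-up sparse samples in a feature space with ten billion dimensions.
data = [
    ({2: 1.0}, 1.0),
    ({9_999_999_999: 1.0}, 0.0),
    ({2: 1.0, 5: 1.0}, 1.0),
    ({9_999_999_999: 1.0, 5: 1.0}, 0.0),
]

model = ShardedModel(num_shards=4)
train(model, data)
print(predict(model, {2: 1.0}), predict(model, {9_999_999_999: 1.0}))
```

Only three weights are ever materialized even though the index space has ten billion entries, and each one lives in the shard that owns its index. A distributed version would additionally have to handle communication, synchronization and fault tolerance, which this sketch leaves out.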