Posted to dev@flink.apache.org by "Maximilian Alber (JIRA)" <ji...@apache.org> on 2015/07/02 10:10:05 UTC

[jira] [Created] (FLINK-2312) Random Splits

Maximilian Alber created FLINK-2312:
---------------------------------------

             Summary: Random Splits
                 Key: FLINK-2312
                 URL: https://issues.apache.org/jira/browse/FLINK-2312
             Project: Flink
          Issue Type: Wish
          Components: Machine Learning Library
            Reporter: Maximilian Alber
            Priority: Minor


In machine learning applications it is common to split a data set into, e.g., a training set and a test set.

To the best of my knowledge, Flink currently offers no convenient way to split a data set randomly into several partitions according to given ratios.

The desired semantics would be the same as those of Spark's RDD randomSplit.
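
To make the requested behaviour concrete, here is a minimal sketch in plain Python (not Flink or Spark code) of what such a split might do: the weights are normalized, and each element is assigned independently to one split with probability proportional to that split's weight. The function name random_split and its parameters are hypothetical and only serve the illustration.

```python
import bisect
import random
from itertools import accumulate


def random_split(items, weights, seed=None):
    """Hypothetical helper: split `items` into len(weights) lists.

    Each item lands in split i with probability weights[i] / sum(weights),
    so the resulting sizes only approximate the requested ratios.
    """
    if not weights or any(w < 0 for w in weights) or sum(weights) <= 0:
        raise ValueError("weights must be non-negative with a positive sum")

    total = sum(weights)
    # Cumulative upper bounds of each split's interval within [0, 1).
    bounds = [c / total for c in accumulate(weights)]
    bounds[-1] = 1.0  # guard against floating-point rounding

    rng = random.Random(seed)  # a fixed seed makes the split reproducible
    splits = [[] for _ in weights]
    for item in items:
        u = rng.random()  # uniform draw in [0, 1)
        # Index of the first bound strictly greater than u.
        splits[bisect.bisect_right(bounds, u)].append(item)
    return splits


if __name__ == "__main__":
    training, testing = random_split(range(1000), [0.8, 0.2], seed=42)
    print(len(training), len(testing))
```

Every element ends up in exactly one split, and the split sizes follow the ratios only in expectation. A distributed implementation would additionally have to make the random draws consistent across parallel partitions, which this single-process sketch does not address.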


