Posted to dev@flink.apache.org by "Yunfeng Zhou (Jira)" <ji...@apache.org> on 2022/05/23 09:07:00 UTC

[jira] [Created] (FLINK-27742) Fix Compatibility Issues Between Flink ML Operators.

Yunfeng Zhou created FLINK-27742:
------------------------------------

             Summary: Fix Compatibility Issues Between Flink ML Operators.
                 Key: FLINK-27742
                 URL: https://issues.apache.org/jira/browse/FLINK-27742
             Project: Flink
          Issue Type: Bug
          Components: Library / Machine Learning
    Affects Versions: ml-2.0.0
            Reporter: Yunfeng Zhou


StringIndexer and LogisticRegression in Flink ML cannot currently be connected in a pipeline. The output label column of StringIndexer is integer-typed, while LogisticRegression only accepts input data whose labels are doubles.

To make Flink ML stages compatible with each other, the following changes need to be made.
- For stages that can only accept double-typed inputs, update their implementation to accept any numerical type.
- For stages that generate labels as integers, make them return labels as doubles.
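
The first change above could be sketched roughly as follows. This is a minimal illustration in plain Java, not the Flink ML implementation: the class LabelCasting and its methods toDouble and toDoubles are hypothetical names introduced here. It assumes labels arrive as boxed values and relies only on java.lang.Number#doubleValue to widen any numeric type to a double.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; it is not part of the Flink ML API.
final class LabelCasting {
    private LabelCasting() {}

    // Accepts any boxed numeric label (Integer, Long, Float, Double, ...)
    // and widens it to a double. Non-numeric or null labels are rejected.
    static double toDouble(Object label) {
        if (label instanceof Number) {
            return ((Number) label).doubleValue();
        }
        throw new IllegalArgumentException(
                "Label must be numeric, but was: "
                        + (label == null ? "null" : label.getClass().getName()));
    }

    // Converts a whole label column, which may mix numeric types, to doubles.
    static List<Double> toDoubles(List<?> labels) {
        List<Double> result = new ArrayList<>(labels.size());
        for (Object label : labels) {
            result.add(toDouble(label));
        }
        return result;
    }
}
```

With a helper of this kind, a stage that trains on double labels could consume the integer labels produced by an upstream stage without failing on a type check. The second change (emitting doubles in the first place) would make the conversion unnecessary for stages that have not yet been updated.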



--
This message was sent by Atlassian Jira
(v8.20.7#820007)