Posted to issues@ignite.apache.org by "Aleksey Zinoviev (JIRA)" <ji...@apache.org> on 2019/08/16 09:32:00 UTC

[jira] [Created] (IGNITE-12079) [ML][Umbrella] Add advanced preprocessing techniques

Aleksey Zinoviev created IGNITE-12079:
-----------------------------------------

             Summary: [ML][Umbrella] Add advanced preprocessing techniques
                 Key: IGNITE-12079
                 URL: https://issues.apache.org/jira/browse/IGNITE-12079
             Project: Ignite
          Issue Type: New Feature
          Components: ml
    Affects Versions: 2.8
            Reporter: Aleksey Zinoviev
            Assignee: Aleksey Zinoviev
             Fix For: 2.8


*Main goal:*

Reduce the gap between Apache Spark and Apache Ignite in preprocessing operations. Closing this gap could make it easier to load Spark ML Pipelines into Ignite ML.

 

Next steps:
 # Add Frequency Encoder
 # Add new Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
 # Add RobustScaler (will be added in Spark 3.0)
 # Add CountVectorizer
 # Add FeatureHasher
 # Add QuantileDiscretizer
 # Add Locality Sensitive Hashing (LSH)
 # Add LabelEncoder
 # Add RevertStringIndexing
 # Add multi-column preprocessor
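
As an illustration of the first item, here is a minimal sketch of what frequency encoding does: each categorical value is replaced by its relative frequency in the training data. This is plain Java, not the Ignite ML API; the class name FrequencyEncoderSketch, the fit/transform method names, and the choice to map unseen categories to 0.0 are all assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the Apache Ignite ML API. */
final class FrequencyEncoderSketch {
    /** Learns the relative frequency of each category in the training column. */
    static Map<String, Double> fit(List<String> column) {
        Map<String, Double> freq = new HashMap<>();
        if (column.isEmpty())
            return freq;

        // Count occurrences of each category.
        for (String v : column)
            freq.merge(v, 1.0, Double::sum);

        // Turn counts into relative frequencies.
        final double n = column.size();
        freq.replaceAll((category, cnt) -> cnt / n);
        return freq;
    }

    /** Replaces each category with its learned frequency; unseen categories map to 0.0 (an assumption). */
    static double[] transform(List<String> column, Map<String, Double> freq) {
        double[] out = new double[column.size()];
        for (int i = 0; i < out.length; i++)
            out[i] = freq.getOrDefault(column.get(i), 0.0);
        return out;
    }

    public static void main(String[] args) {
        List<String> train = Arrays.asList("red", "blue", "red", "green");
        Map<String, Double> freq = fit(train);

        // "black" was never seen during fit, so it is encoded as 0.0.
        double[] encoded = transform(Arrays.asList("red", "blue", "black"), freq);
        System.out.println(Arrays.toString(encoded)); // prints [0.5, 0.25, 0.0]
    }
}
```

A real preprocessor would also have to compute the counts over distributed partitions and merge them; the sketch above ignores that and works on a single in-memory list.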


