Posted to issues@ignite.apache.org by "Aleksey Zinoviev (JIRA)" <ji...@apache.org> on 2019/08/16 09:32:00 UTC
[jira] [Created] (IGNITE-12079) [ML][Umbrella] Add advanced preprocessing techniques
Aleksey Zinoviev created IGNITE-12079:
-----------------------------------------
Summary: [ML][Umbrella] Add advanced preprocessing techniques
Key: IGNITE-12079
URL: https://issues.apache.org/jira/browse/IGNITE-12079
Project: Ignite
Issue Type: New Feature
Components: ml
Affects Versions: 2.8
Reporter: Aleksey Zinoviev
Assignee: Aleksey Zinoviev
Fix For: 2.8
*Main goal:*
To narrow the gap between Apache Spark and Apache Ignite in preprocessing operations. Narrowing this gap should make it easier to load Spark ML Pipelines into Ignite ML.
Next steps:
# Add Frequency Encoder
# Add more Imputing Strategies (MIN, MAX, COUNT, MOST_FREQUENT, LEAST_FREQUENT)
# Add RobustScaler (will be added in Spark 3.0)
# Add CountVectorizer
# Add FeatureHasher
# Add QuantileDiscretizer
# Add Locality Sensitive Hashing (LSH)
# Add LabelEncoder
# Add RevertStringIndexing
# Add multi-column preprocessor
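To illustrate the first item in the list above, here is a minimal sketch of frequency encoding in plain Java: each category is replaced by its relative frequency in the training data. The class name FrequencyEncoderSketch and its fit/encode methods are hypothetical and are not part of the Ignite ML API. Mapping unseen categories to 0.0 is also an assumption made for the example.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper for illustration only; not part of the Ignite ML API. */
final class FrequencyEncoderSketch {
    /** Relative frequency of each category seen during fit. */
    private final Map<String, Double> frequencies = new HashMap<>();

    /** Learns the relative frequency of each category from the training values. */
    static FrequencyEncoderSketch fit(List<String> values) {
        if (values.isEmpty())
            throw new IllegalArgumentException("Training values must not be empty");

        FrequencyEncoderSketch enc = new FrequencyEncoderSketch();

        // First pass: count occurrences of each category.
        for (String v : values)
            enc.frequencies.merge(v, 1.0, Double::sum);

        // Second pass: turn counts into relative frequencies.
        final double total = values.size();
        enc.frequencies.replaceAll((category, cnt) -> cnt / total);

        return enc;
    }

    /** Encodes one category; categories not seen during fit map to 0.0 (assumption). */
    double encode(String category) {
        return frequencies.getOrDefault(category, 0.0);
    }
}
```

For example, fitting on the values "red", "blue", "red", "green" and then calling encode("red") would return 0.5, since "red" makes up half of the training values. A distributed implementation would need to aggregate the counts across partitions before normalizing, which this sketch does not attempt.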
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)