Posted to commits@ignite.apache.org by ab...@apache.org on 2020/09/09 12:14:52 UTC

[ignite] branch IGNITE-7595 updated: copy-paste machine learning pages

This is an automated email from the ASF dual-hosted git repository.

abudnikov pushed a commit to branch IGNITE-7595
in repository https://gitbox.apache.org/repos/asf/ignite.git


The following commit(s) were added to refs/heads/IGNITE-7595 by this push:
     new 7342a03  copy-paste machine learning pages
7342a03 is described below

commit 7342a034ad1d9ebcd64445b562c1f2e78bbb2305
Author: abudnikov <ab...@gridgain.com>
AuthorDate: Wed Sep 9 15:12:46 2020 +0300

    copy-paste machine learning pages
---
 docs/_data/toc.yaml                                |  83 +++++++
 docs/_docs/images/111.gif                          | Bin 0 -> 419 bytes
 docs/_docs/images/222.gif                          | Bin 0 -> 1163 bytes
 docs/_docs/images/333.gif                          | Bin 0 -> 719 bytes
 docs/_docs/images/555.gif                          | Bin 0 -> 1197 bytes
 docs/_docs/images/666.gif                          | Bin 0 -> 1309 bytes
 docs/_docs/images/bagging.png                      | Bin 0 -> 4675 bytes
 docs/_docs/images/logistic-regression.png          | Bin 0 -> 9666 bytes
 docs/_docs/images/logistic-regression2.png         | Bin 0 -> 8764 bytes
 docs/_docs/images/machine_learning.png             | Bin 0 -> 68453 bytes
 docs/_docs/images/naive-bayes.png                  | Bin 0 -> 18067 bytes
 docs/_docs/images/naive-bayes2.png                 | Bin 0 -> 27103 bytes
 docs/_docs/images/naive-bayes3.png                 | Bin 0 -> 13713 bytes
 docs/_docs/images/naive-bayes3png                  | Bin 0 -> 13713 bytes
 docs/_docs/images/preprocessing.png                | Bin 0 -> 6588 bytes
 docs/_docs/images/preprocessing2.png               | Bin 0 -> 4548 bytes
 .../binary-classification/ann.adoc                 |  73 +++++++
 .../binary-classification/decision-trees.adoc      |  63 ++++++
 .../binary-classification/introduction.adoc        |  20 ++
 .../binary-classification/knn-classification.adoc  |  49 +++++
 .../binary-classification/linear-svm.adoc          |  38 ++++
 .../binary-classification/logistic-regression.adoc |  71 ++++++
 .../multilayer-perceptron.adoc                     |  64 ++++++
 .../binary-classification/naive-bayes.adoc         |  95 ++++++++
 .../clustering/gaussian-mixture.adoc               |  57 +++++
 .../machine-learning/clustering/introduction.adoc  |   8 +
 .../clustering/k-means-clustering.adoc             |  66 ++++++
 .../machine-learning/ensemble-methods/bagging.adoc |  42 ++++
 .../ensemble-methods/gradient-boosting.adoc        |  85 ++++++++
 .../ensemble-methods/introduction.adoc             |  11 +
 .../ensemble-methods/random-forest.adoc            |  71 ++++++
 .../ensemble-methods/stacking.adoc                 |  35 +++
 .../importing-model/introduction.adoc              |  12 ++
 .../model-import-from-apache-spark.adoc            |  70 ++++++
 .../importing-model/model-import-from-gxboost.adoc |  21 ++
 docs/_docs/machine-learning/machine-learning.adoc  | 125 +++++++++++
 .../model-selection/cross-validation.adoc          |  76 +++++++
 .../model-selection/evaluator.adoc                 |  93 ++++++++
 .../model-selection/hyper-parameter-tuning.adoc    |  51 +++++
 .../model-selection/introduction.adoc              |  18 ++
 .../model-selection/pipeline-api.adoc              | 111 ++++++++++
 ...lit-the-dataset-on-test-and-train-datasets.adoc |  52 +++++
 .../multiclass-classification.adoc                 |  41 ++++
 .../machine-learning/partition-based-dataset.adoc  |  86 ++++++++
 docs/_docs/machine-learning/preprocessing.adoc     | 239 +++++++++++++++++++++
 .../machine-learning/recommendation-systems.adoc   |  57 +++++
 .../regression/decision-trees-regression.adoc      |  61 ++++++
 .../machine-learning/regression/introduction.adoc  |   9 +
 .../regression/knn-regression.adoc                 |  49 +++++
 .../regression/linear-regression.adoc              |  85 ++++++++
 .../machine-learning/updating-trained-models.adoc  |  63 ++++++
 docs/_docs/quick-start/dotnet.adoc                 |   2 +-
 52 files changed, 2251 insertions(+), 1 deletion(-)

diff --git a/docs/_data/toc.yaml b/docs/_data/toc.yaml
index 6b83dfd..9aad776 100644
--- a/docs/_data/toc.yaml
+++ b/docs/_data/toc.yaml
@@ -199,6 +199,89 @@
       url: /data-structures/atomic-sequence
     - title:  Semaphore 
       url: /data-structures/semaphore
+- title: Machine Learning
+  items:
+    - title: Machine Learning
+      url: /machine-learning/machine-learning
+    - title: Partition Based Dataset 
+      url: /machine-learning/partition-based-dataset
+    - title: Updating Trained Models 
+      url: /machine-learning/updating-trained-models
+    - title: Binary Classification
+      items:
+        - title: Introduction
+          url: /machine-learning/binary-classification/introduction
+        - title: Linear SVM (Support Vector Machine) 
+          url: /machine-learning/binary-classification/linear-svm
+        - title: Decision Trees 
+          url: /machine-learning/binary-classification/decision-trees
+        - title: Multilayer Perceptron
+          url: /machine-learning/binary-classification/multilayer-perceptron
+        - title: Logistic Regression 
+          url: /machine-learning/binary-classification/logistic-regression
+        - title: k-NN Classification 
+          url: /machine-learning/binary-classification/knn-classification
+        - title: ANN (Approximate Nearest Neighbor) 
+          url: /machine-learning/binary-classification/ann
+        - title: Naive Bayes 
+          url: /machine-learning/binary-classification/naive-bayes
+    - title: Regression 
+      items:
+        - title: Introduction
+          url: /machine-learning/regression/introduction
+        - title: Linear Regression 
+          url: /machine-learning/regression/linear-regression
+        - title: Decision Trees Regression 
+          url: /machine-learning/regression/decision-trees-regression
+        - title: k-NN Regression 
+          url: /machine-learning/regression/knn-regression
+    - title: Clustering 
+      items:
+        - title: Introduction
+          url: /machine-learning/clustering/introduction
+        - title: K-Means Clustering 
+          url: /machine-learning/clustering/k-means-clustering
+        - title: Gaussian mixture (GMM) 
+          url: /machine-learning/clustering/gaussian-mixture
+    - title: Preprocessing 
+      url: /machine-learning/preprocessing
+    - title: Model Selection 
+      items:
+        - title: Introduction
+          url: /machine-learning/model-selection/introduction
+        - title: Evaluator 
+          url: /machine-learning/model-selection/evaluator
+        - title: Split the dataset on test and train datasets 
+          url: /machine-learning/model-selection/split-the-dataset-on-test-and-train-datasets
+        - title: Hyper-parameter tuning 
+          url: /machine-learning/model-selection/hyper-parameter-tuning
+        - title: Pipeline API 
+          url: /machine-learning/model-selection/pipeline-api
+    - title: Multiclass Classification 
+      url: /machine-learning/multiclass-classification
+    - title: Ensemble Methods 
+      items:
+        - title: Introduction
+          url: /machine-learning/ensemble-methods/introduction
+        - title: Stacking 
+          url: /machine-learning/ensemble-methods/stacking
+        - title: Bagging 
+          url: /machine-learning/ensemble-methods/bagging
+        - title: Random Forest 
+          url: /machine-learning/ensemble-methods/random-forest
+        - title: Gradient Boosting 
+          url: /machine-learning/ensemble-methods/gradient-boosting
+    - title: Recommendation Systems 
+      url: /machine-learning/recommendation-systems
+    - title: Importing Model
+      items:
+        - title: Introduction 
+          url: /machine-learning/importing-model/introduction
+        - title: Import Model from XGBoost 
+          url: /machine-learning/importing-model/model-import-from-gxboost
+        - title: Import Model from Apache Spark 
+          url: /machine-learning/importing-model/model-import-from-apache-spark
+ 
 - title: Monitoring
   items:
     - title: Introduction
diff --git a/docs/_docs/images/111.gif b/docs/_docs/images/111.gif
new file mode 100644
index 0000000..dc5f668
Binary files /dev/null and b/docs/_docs/images/111.gif differ
diff --git a/docs/_docs/images/222.gif b/docs/_docs/images/222.gif
new file mode 100644
index 0000000..05a097c
Binary files /dev/null and b/docs/_docs/images/222.gif differ
diff --git a/docs/_docs/images/333.gif b/docs/_docs/images/333.gif
new file mode 100644
index 0000000..828f448
Binary files /dev/null and b/docs/_docs/images/333.gif differ
diff --git a/docs/_docs/images/555.gif b/docs/_docs/images/555.gif
new file mode 100644
index 0000000..1d5ef9a
Binary files /dev/null and b/docs/_docs/images/555.gif differ
diff --git a/docs/_docs/images/666.gif b/docs/_docs/images/666.gif
new file mode 100644
index 0000000..983e35b
Binary files /dev/null and b/docs/_docs/images/666.gif differ
diff --git a/docs/_docs/images/bagging.png b/docs/_docs/images/bagging.png
new file mode 100644
index 0000000..5664051
Binary files /dev/null and b/docs/_docs/images/bagging.png differ
diff --git a/docs/_docs/images/logistic-regression.png b/docs/_docs/images/logistic-regression.png
new file mode 100644
index 0000000..4531071
Binary files /dev/null and b/docs/_docs/images/logistic-regression.png differ
diff --git a/docs/_docs/images/logistic-regression2.png b/docs/_docs/images/logistic-regression2.png
new file mode 100644
index 0000000..f55c151
Binary files /dev/null and b/docs/_docs/images/logistic-regression2.png differ
diff --git a/docs/_docs/images/machine_learning.png b/docs/_docs/images/machine_learning.png
new file mode 100644
index 0000000..800fc1a
Binary files /dev/null and b/docs/_docs/images/machine_learning.png differ
diff --git a/docs/_docs/images/naive-bayes.png b/docs/_docs/images/naive-bayes.png
new file mode 100644
index 0000000..660c866
Binary files /dev/null and b/docs/_docs/images/naive-bayes.png differ
diff --git a/docs/_docs/images/naive-bayes2.png b/docs/_docs/images/naive-bayes2.png
new file mode 100644
index 0000000..7e3e29a
Binary files /dev/null and b/docs/_docs/images/naive-bayes2.png differ
diff --git a/docs/_docs/images/naive-bayes3.png b/docs/_docs/images/naive-bayes3.png
new file mode 100644
index 0000000..cc02903
Binary files /dev/null and b/docs/_docs/images/naive-bayes3.png differ
diff --git a/docs/_docs/images/naive-bayes3png b/docs/_docs/images/naive-bayes3png
new file mode 100644
index 0000000..cc02903
Binary files /dev/null and b/docs/_docs/images/naive-bayes3png differ
diff --git a/docs/_docs/images/preprocessing.png b/docs/_docs/images/preprocessing.png
new file mode 100644
index 0000000..3601b59
Binary files /dev/null and b/docs/_docs/images/preprocessing.png differ
diff --git a/docs/_docs/images/preprocessing2.png b/docs/_docs/images/preprocessing2.png
new file mode 100644
index 0000000..07fda7c
Binary files /dev/null and b/docs/_docs/images/preprocessing2.png differ
diff --git a/docs/_docs/machine-learning/binary-classification/ann.adoc b/docs/_docs/machine-learning/binary-classification/ann.adoc
new file mode 100644
index 0000000..ac50631
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/ann.adoc
@@ -0,0 +1,73 @@
+= ANN (Approximate Nearest Neighbor)
+
+An approximate nearest neighbor search algorithm is allowed to return points, whose distance from the query is at most *c* times the distance from the query to its nearest points.
+
+The appeal of this approach is that, in many cases, an approximate nearest neighbor is almost as good as the exact one. In particular, if the distance measure accurately captures the notion of user quality, then small differences in the distance should not matter.
+
+The ANN algorithm is able to solve multi-class classification tasks. The Apache Ignite implementation is a heuristic algorithm based upon searching a small, limited set of *N* candidate points (internally it uses a distributed KMeans clustering algorithm to find the centroids) that can vote for class labels in the same way as the KNN algorithm.
+
+The difference between KNN and ANN is in the prediction phase: the KNN algorithm searches for the k-nearest neighbors among all training points, whereas ANN runs this search only over a small subset of candidate points.
+
+NOTE: If *N* is set to the size of the training set, ANN reduces to KNN with an enormous amount of time spent in the training phase. So, instead, choose *N* comparable with *k* (e.g. 10 x k, 100 x k, and so on).
+
+== Model
+
+The ANN classification output represents a class membership. An object is classified by a majority vote of its neighbors and is assigned to the class most common among its *k* nearest neighbors. *k* is a positive integer, typically small. In the special case where *k* is 1, the object is simply assigned to the class of its single nearest neighbor.
+At present, Ignite supports the following parameters for the ANN classification algorithm:
+
+  * k - the number of nearest neighbors.
+  * distanceMeasure - one of the distance metrics provided by the Machine Learning (ML) framework, such as Euclidean, Hamming or Manhattan.
+  * isWeighted - false by default, if true it enables a weighted KNN algorithm.
+
+
+[source, java]
+----
+NNClassificationModel knnMdl = trainer.fit(
+...
+).withK(5)
+ .withDistanceMeasure(new EuclideanDistance())
+ .withWeighted(true);
+
+
+// Make a prediction.
+double prediction = knnMdl.predict(observation);
+----
+
+== Trainer
+
+The trainer of the ANN model uses KMeans to calculate the candidate subset, which is why it exposes the same hyperparameters as the KMeans algorithm. It builds not only the set of candidates but also their class-label distributions, which are used to vote for the class label during the prediction phase.
+
+At present, Ignite supports the following parameters for the ANNClassificationTrainer:
+
+  * k - the number of possible clusters.
+  * maxIterations - one stop criteria (the other one is epsilon).
+  * epsilon - delta of convergence (delta between old and new centroid values).
+  * distance - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming or Manhattan.
+  * seed - one of initialization parameters which helps to reproduce models (trainer has a random initialization step to get the first centroids).
+
+
+[source, java]
+----
+// Set up the trainer
+ANNClassificationTrainer trainer = new ANNClassificationTrainer()
+  .withDistance(new ManhattanDistance())
+  .withK(50)
+  .withMaxIterations(1000)
+  .withSeed(1234L)
+  .withEpsilon(1e-2);
+
+// Build the model
+NNClassificationModel knnMdl = trainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+).withK(5)
+ .withDistanceMeasure(new EuclideanDistance())
+ .withWeighted(true);
+----
+
+== Example
+
+
+To see how ANNClassificationModel can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/knn/ANNClassificationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution. The training dataset is the Iris dataset that can be loaded from the https://archive.ics.uci.edu/ml/datasets/iris[UCI Machine Learning Repository].
+
diff --git a/docs/_docs/machine-learning/binary-classification/decision-trees.adoc b/docs/_docs/machine-learning/binary-classification/decision-trees.adoc
new file mode 100644
index 0000000..03e5bdf
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/decision-trees.adoc
@@ -0,0 +1,63 @@
+= Decision Trees
+
+Decision trees and their ensembles are popular methods for the machine learning tasks of classification and regression. Decision trees are widely used since they are easy to interpret, handle categorical features, extend to the multiclass classification setting, do not require feature scaling, and are able to capture non-linearities and feature interactions. Tree ensemble algorithms such as random forests and boosting are among the top performers for classification and regression tasks.
+
+== Overview
+
+Decision trees are a simple yet powerful model in supervised machine learning. The main idea is to split a feature space into regions such that the value within each region varies little. The measure of the variation of values in a region is called the impurity of the region.
+
+Apache Ignite provides an implementation of the algorithm optimized for data stored in rows (see link:machine-learning/partition-based-dataset[Partition Based Dataset]).
+
+Splits are done recursively and every region created from a split can be split further. Therefore, the whole process can be described by a binary tree, where each node is a particular region and its children are the regions derived from it by another split.
+
+Let each sample from a training set belong to some space `S` and let `p_i` be a projection on a feature with index `i`, then a split by continuous feature with index `i` has the form:
+
+image::images/555.gif[]
+
+and a split by categorical feature with values from some set `X` has the form:
+
+image::images/666.gif[]
+
+Here `X_0` is a subset of `X`.
+
+The model works as follows: the split process stops when either the algorithm has reached the configured maximum depth, or splitting any region no longer results in a significant impurity loss. The prediction for a point `s` from `S` is a traversal of the tree down to the node that corresponds to the region containing `s`, returning the value associated with that leaf.
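+
+For intuition, this prediction-by-traversal can be sketched as follows (a simplified illustration, not the Ignite implementation; the `Node` type and its fields are hypothetical):
+
+[source, java]
+----
+// Hypothetical node of a binary decision tree built by recursive splits.
+class Node {
+    boolean isLeaf;
+    double value;      // value associated with the region (used in leaves)
+    int featureIdx;    // index i of the feature used in the split p_i(s)
+    double threshold;  // split threshold for a continuous feature
+    Node left, right;
+}
+
+// Prediction for a point s is a traversal down to the leaf whose region contains s.
+static double predict(Node node, double[] s) {
+    if (node.isLeaf)
+        return node.value;
+    return s[node.featureIdx] <= node.threshold
+        ? predict(node.left, s)
+        : predict(node.right, s);
+}
+----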
+
+
+== Model
+
+The Model in a decision tree classification is represented by the class `DecisionTreeNode`. We can make a prediction for a given vector of features in the following way:
+
+
+[source, java]
+----
+DecisionTreeNode mdl = ...;
+
+double prediction = mdl.apply(observation);
+----
+
+The model is a fully independent object and after the training it can be saved, serialized and restored.
+
+== Trainer
+
+A Decision Tree algorithm can be used for classification and regression depending upon the impurity measure and node instantiation approach.
+
+=== Classification
+
+The Classification Decision Tree uses the https://en.wikipedia.org/wiki/Decision_tree_learning#Gini_impurity[Gini] impurity measure and you can use it in the following way:
+
+[source, java]
+----
+// Create decision tree classification trainer.
+DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(
+    4, // Max depth.
+    0  // Min impurity decrease.
+);
+
+// Train model.
+DecisionTreeNode mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+
+== Examples
+
+To see how the Decision Tree can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tree/DecisionTreeClassificationTrainerExample.java[classification example] that is available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/binary-classification/introduction.adoc b/docs/_docs/machine-learning/binary-classification/introduction.adoc
new file mode 100644
index 0000000..e7fe52c
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/introduction.adoc
@@ -0,0 +1,20 @@
+---
+layout: toc
+---
+= Introduction
+
+In machine learning and statistics, classification is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known.
+
+All existing training algorithms presented in this section are designed to solve binary classification tasks:
+
+
+* Linear SVM (Support Vector Machines)
+* Decision Trees
+* Multilayer Perceptron
+* Logistic Regression
+* k-NN Classification
+* ANN (Approximate Nearest Neighbor)
+* Naive Bayes
+
+
+Binary or binomial classification is the task of classifying the elements of a given set into two groups (predicting which group each one belongs to) on the basis of a classification rule.
diff --git a/docs/_docs/machine-learning/binary-classification/knn-classification.adoc b/docs/_docs/machine-learning/binary-classification/knn-classification.adoc
new file mode 100644
index 0000000..3479b5f
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/knn-classification.adoc
@@ -0,0 +1,49 @@
+= k-NN Classification
+
+The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm - one for classification tasks and the other for regression tasks.
+
+This documentation reviews k-NN as a solution for classification tasks.
+
+== Trainer and Model
+
+The k-NN algorithm is a non-parametric method whose input consists of the k-closest training examples in the feature space.
+
+The k-NN classification output represents a class membership. An object is classified by a majority vote of its neighbors and is assigned to the class most common among its k nearest neighbors. `k` is a positive integer, typically small. In the special case where `k` is `1`, the object is simply assigned to the class of its single nearest neighbor.
+
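+Conceptually, the majority vote over the labels of the k nearest neighbors can be sketched as follows (a simplified illustration, not the Ignite implementation; it assumes the `java.util` imports):
+
+[source, java]
+----
+// Simplified majority voting: count the class labels of the k nearest
+// training points and return the most frequent one.
+static double majorityVote(double[] neighborLabels) {
+    Map<Double, Integer> votes = new HashMap<>();
+    for (double lbl : neighborLabels)
+        votes.merge(lbl, 1, Integer::sum);
+    return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
+}
+----
+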
+Presently, Ignite supports a few parameters for the k-NN classification algorithm:
+
+* `k` - a number of nearest neighbors
+* `distanceMeasure` - one of the distance metrics provided by the ML framework such as Euclidean, Hamming or Manhattan.
+* `isWeighted` - false by default, if true it enables a weighted KNN algorithm.
+* `dataCache` -  holds a training set of objects for which the class is already known.
+* `indexType` - distributed spatial index, has three values: ARRAY, KD_TREE, BALL_TREE.
+
+
+[source, java]
+----
+// Create the trainer (with default settings it would be just `new KNNClassificationTrainer()`).
+KNNClassificationTrainer trainer = new KNNClassificationTrainer()
+  .withK(3)
+  .withIdxType(SpatialIndexType.BALL_TREE)
+  .withDistanceMeasure(new EuclideanDistance())
+  .withWeighted(true);
+
+// Train model.
+KNNClassificationModel knnMdl = trainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+);
+
+// Make a prediction.
+double prediction = knnMdl.predict(observation);
+----
+
+== Example
+
+To see how kNN Classification can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/knn/KNNClassificationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+The training dataset is the Iris dataset which can be loaded from the https://archive.ics.uci.edu/ml/datasets/iris[UCI Machine Learning Repository].
diff --git a/docs/_docs/machine-learning/binary-classification/linear-svm.adoc b/docs/_docs/machine-learning/binary-classification/linear-svm.adoc
new file mode 100644
index 0000000..812e2dc
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/linear-svm.adoc
@@ -0,0 +1,38 @@
+= Linear SVM (Support Vector Machine)
+
+Support Vector Machines (SVMs) are supervised learning models with associated learning algorithms that analyze data used for classification and regression analysis.
+
+Given a set of training examples, each marked as belonging to one or the other of two categories, an SVM training algorithm builds a model that assigns new examples to one category or the other, making it a non-probabilistic binary linear classifier.
+
+The Apache Ignite Machine Learning module supports only Linear SVM. For more information, see the article on SVM in link:https://en.wikipedia.org/wiki/Support_vector_machine[Wikipedia].
+
+== Model
+
+A Model in the case of SVM is represented by the class `SVMLinearClassificationModel`. It enables a prediction to be made for a given vector of features, in the following way:
+
+
+[source, java]
+----
+SVMLinearClassificationModel model = ...;
+
+double prediction = model.predict(observation);
+----
+
+Presently, Ignite supports the following parameters for `SVMLinearClassificationModel`:
+
+* `isKeepingRawLabels` - controls the output label format: -1 and +1 when false, the raw distance from the separating hyperplane when true (default value: false)
+* `threshold` - a threshold above which the raw value is assigned the +1 label (default value: 0.0)
+
+
+[source, java]
+----
+SVMLinearClassificationModel model = ...;
+
+double prediction = model
+  .withRawLabels(true)
+  .withThreshold(5)
+  .predict(observation);
+----
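+
+== Trainer
+
+A model like the one above is normally obtained from the linear SVM trainer. The snippet below is an illustrative sketch only: the trainer options are omitted, but the `fit(...)` call follows the same pattern as the other trainers in this guide.
+
+[source, java]
+----
+// Set up the linear SVM trainer with default settings (illustrative sketch).
+SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer();
+
+// Train the model on the cached data.
+SVMLinearClassificationModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Make a prediction.
+double prediction = mdl.predict(observation);
+----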
+
+
+
diff --git a/docs/_docs/machine-learning/binary-classification/logistic-regression.adoc b/docs/_docs/machine-learning/binary-classification/logistic-regression.adoc
new file mode 100644
index 0000000..5e719ae
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/logistic-regression.adoc
@@ -0,0 +1,71 @@
+= Logistic Regression
+
+Binary Logistic Regression is a special type of regression where a binary response variable is related to a set of explanatory variables, which can be discrete and/or continuous. The important point to note here is that in linear regression, the expected values of the response variable are modeled based on a combination of values taken by the predictors. In logistic regression, the probability (or odds) of the response taking a particular value is modeled based on the combination of values taken by the predictors.
+
+image::images/logistic-regression.png[]
+
+For binary classification problems, the algorithm outputs a binary logistic regression model. Given a new data point, denoted by x, the model makes predictions by applying the logistic function:
+
+
+image::images/logistic-regression2.png[]
+
+By default, if `f(w^T x) > 0.5`, the outcome is positive, otherwise negative. However, unlike linear SVMs, the raw output of the logistic regression model `f(z)` has a probabilistic interpretation (i.e., the probability that the outcome is positive).
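+
+For illustration, the logistic function applied to a linear combination of features can be written as follows (a plain-Java sketch, not part of the Ignite API):
+
+[source, java]
+----
+// f(w^T x) = 1 / (1 + exp(-w^T x)): maps the linear combination into (0, 1).
+static double logistic(double[] w, double[] x) {
+    double z = 0;
+    for (int i = 0; i < w.length; i++)
+        z += w[i] * x[i];
+    return 1.0 / (1.0 + Math.exp(-z));
+}
+
+// By default the predicted label is 1 if logistic(w, x) > 0.5, otherwise 0.
+----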
+
+== Model
+
+The model is represented by the class `LogisticRegressionModel` and keeps the weight vector. It enables a prediction to be made for a given vector of features, in the following way:
+
+
+[source, java]
+----
+LogisticRegressionModel mdl = …;
+
+double prediction = mdl.predict(observation);
+----
+
+Ignite supports several parameters for `LogisticRegressionModel`:
+
+* `isKeepingRawLabels` - controls the output label format: 0 and 1 when false, the raw distance from the separating hyperplane when true (default value: false)
+* `threshold` - a threshold above which the raw value is assigned the label `1` (default value: 0.5)
+
+
+
+[source, java]
+----
+LogisticRegressionModel mdl = …;
+
+double prediction = mdl.withRawLabels(true).withThreshold(0.5).predict(observation);
+----
+
+== Trainer
+
+The trainer of the binary logistic regression model builds a one-level MLP trainer under the hood.
+
+Ignite supports the following parameters for LogisticRegressionSGDTrainer:
+
+  * updatesStgy - update strategy
+  * maxIterations - the maximum number of iterations before convergence
+  * batchSize - the size of the learning batch
+  * locIterations - the number of local iterations of the SGD algorithm
+  * seed - seed value for internal random purposes to reproduce training results
+
+
+Set up the trainer:
+
+[source, java]
+----
+LogisticRegressionSGDTrainer trainer = new LogisticRegressionSGDTrainer()
+  .withUpdatesStgy(UPDATES_STRATEGY)
+  .withAmountOfIterations(MAX_ITERATIONS)
+  .withAmountOfLocIterations(LOC_ITERATIONS)
+  .withBatchSize(BATCH_SIZE)
+  .withSeed(SEED);
+
+// Build the model
+LogisticRegressionModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+
+== Example
+
+To see how `LogRegressionMultiClassModel` can be used in practice, try this link:https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/regression/logistic/multiclass/LogRegressionMultiClassClassificationExample.java[example, window=_blank], available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/binary-classification/multilayer-perceptron.adoc b/docs/_docs/machine-learning/binary-classification/multilayer-perceptron.adoc
new file mode 100644
index 0000000..07fb3c5
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/multilayer-perceptron.adoc
@@ -0,0 +1,64 @@
+= Multilayer Perceptron
+
+Multilayer Perceptron (MLP) is the basic form of a neural network. It consists of one input layer and zero or more transformation layers. Each transformation layer depends on the previous layer in the following way:
+
+image::images/333.gif[]
+
+In the above equation, the dot operator is the dot product of two vectors, functions denoted by `sigma` are called activators, vectors denoted by `w` are called weights, and vectors denoted by `b` are called biases. Each transformation layer has associated weights, activator, and optionally biases. The set of all weights and biases of MLP is the set of MLP parameters.
+
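+For intuition, a single transformation layer with a sigmoid activator can be sketched in plain Java as follows (an illustration only, not the Ignite implementation):
+
+[source, java]
+----
+// One MLP layer: out[j] = sigma(sum_i w[j][i] * in[i] + b[j]), with a sigmoid activator.
+static double[] layer(double[][] w, double[] b, double[] in) {
+    double[] out = new double[w.length];
+    for (int j = 0; j < w.length; j++) {
+        double z = b[j];
+        for (int i = 0; i < in.length; i++)
+            z += w[j][i] * in[i];
+        out[j] = 1.0 / (1.0 + Math.exp(-z)); // sigmoid
+    }
+    return out;
+}
+----
+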
+
+== Model
+
+
+The model in the case of a neural network is represented by the class `MultilayerPerceptron`. It allows you to make a prediction for a given vector of features in the following way:
+
+
+[source, java]
+----
+MultilayerPerceptron mlp = ...
+
+Matrix prediction = mlp.apply(observation);
+----
+
+The model is a fully independent object and after the training it can be saved, serialized and restored.
+
+== Trainer
+
+One of the popular ways for supervised model training is batch training. In this approach, training is done in iterations; during each iteration we extract a subpart (batch) of the labeled data (data consisting of inputs of the approximated function and the corresponding values of this function, often called 'ground truth') and update the model parameters using only this subpart. Updates are made to minimize the loss function on the batches.
+
+Apache Ignite `MLPTrainer` is used for distributed batch training, which works in a map-reduce way. Each iteration (let's call it a global iteration) consists of several parallel iterations, which in turn consist of several local steps. Each local iteration is executed by its own worker and performs the specified number of local steps (called the synchronization period) to compute its update of the model parameters. Then all updates are accumulated on the node that started the training and are combined into a global update of the model parameters.
+
+`MLPTrainer` can be parameterized by neural network architecture, loss function, update strategy (`SGD`, `RProp` or `Nesterov`), max number of iterations, batch size, number of local iterations and seed.
+
+
+[source, java]
+----
+// Define a layered architecture.
+MLPArchitecture arch = new MLPArchitecture(2).
+    withAddedLayer(10, true, Activators.RELU).
+    withAddedLayer(1, false, Activators.SIGMOID);
+
+// Define a neural network trainer.
+MLPTrainer<SimpleGDParameterUpdate> trainer = new MLPTrainer<>(
+    arch,
+    LossFunctions.MSE,
+    new UpdatesStrategy<>(
+        new SimpleGDUpdateCalculator(0.1),
+        SimpleGDParameterUpdate::sumLocal,
+        SimpleGDParameterUpdate::avg
+    ),
+    3000,   // Max iterations.
+    4,      // Batch size.
+    50,     // Local iterations.
+    123L    // Random seed.
+);
+
+// Train model.
+MultilayerPerceptron mlp = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+
+== Example
+
+To see how Deep Learning can be used in practice, try link:https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/nn/MLPTrainerExample.java[this example, window=_blank], available on GitHub and delivered with every Apache Ignite distribution.
+
diff --git a/docs/_docs/machine-learning/binary-classification/naive-bayes.adoc b/docs/_docs/machine-learning/binary-classification/naive-bayes.adoc
new file mode 100644
index 0000000..cbcb27f
--- /dev/null
+++ b/docs/_docs/machine-learning/binary-classification/naive-bayes.adoc
@@ -0,0 +1,95 @@
+= Naive Bayes
+
+== Overview
+
+Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence assumptions between the features.
+In all trainers, prior probabilities can be preset or calculated. Also, there is an option to use equal probabilities.
+
+
+
+== Gaussian Naive Bayes
+
+The Gaussian Naive Bayes algorithm is based on https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Gaussian_naive_Bayes[this information^].
+
+When dealing with continuous data, a typical assumption is that the continuous values associated with each class are distributed according to a normal (or Gaussian) distribution.
+
+The model predicts that the result value y belongs to a class C_k, k in [0..K], as
+
+image::images/naive-bayes.png[]
+
+Where
+
+image::images/naive-bayes2.png[]
+
+
+The model returns the number (index) of the most probable class.
+The trainer counts the means and variances for each class.
+
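+Conceptually, the class selection can be sketched as follows (a simplified illustration of the Gaussian Naive Bayes rule, not the Ignite implementation):
+
+[source, java]
+----
+// Pick the class k with the highest log-posterior:
+// log p(C_k) + sum_i log N(x_i; mean_ki, var_ki).
+static int predictClass(double[] x, double[][] means, double[][] vars, double[] priors) {
+    int best = 0;
+    double bestLogProb = Double.NEGATIVE_INFINITY;
+    for (int k = 0; k < priors.length; k++) {
+        double logProb = Math.log(priors[k]);
+        for (int i = 0; i < x.length; i++) {
+            double d = x[i] - means[k][i];
+            logProb += -0.5 * Math.log(2 * Math.PI * vars[k][i]) - d * d / (2 * vars[k][i]);
+        }
+        if (logProb > bestLogProb) {
+            bestLogProb = logProb;
+            best = k;
+        }
+    }
+    return best;
+}
+----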
+
+[source, java]
+----
+GaussianNaiveBayesTrainer trainer = new GaussianNaiveBayesTrainer();
+
+GaussianNaiveBayesModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+The full example can be found https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/naivebayes/GaussianNaiveBayesTrainerExample.java[here].
+
+== Discrete (Bernoulli) Naive Bayes
+
+The Naive Bayes algorithm over a Bernoulli or multinomial distribution is based on the following https://en.wikipedia.org/wiki/Naive_Bayes_classifier#Multinomial_naive_Bayes[information].
+
+It can be used for non-continuous features. The thresholds used to convert a feature to a discrete value should be set on the trainer. If the features are binary, the discrete Bayes becomes Bernoulli.
+
+The model predicts that the result value y belongs to a class C_k, k in [0..K], as
+
+image::images/naive-bayes3.png[]
+
+Where x_i is a discrete feature, p_ki is the probability of feature i given class C_k, and p(C_k) is the prior probability of class C_k.
+
+The model returns the number (index) of the most probable class.
+
+
+[source, java]
+----
+double[][] thresholds = new double[][] {{.5}, {.5}, {.5}, {.5}, {.5}};
+
+DiscreteNaiveBayesTrainer trainer = new DiscreteNaiveBayesTrainer()
+  .setBucketThresholds(thresholds);
+
+ DiscreteNaiveBayesModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+
+The full example can be found https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/naivebayes/DiscreteNaiveBayesTrainerExample.java[here].
+
+
+== Compound Naive Bayes
+
+Compound Naive Bayes is a composition of several Naive Bayes classifiers, where each classifier represents a subset of features of one type.
+
+The model contains both Gaussian and Discrete Bayes. A user can select which subset of features each model is trained on.
+
+The model returns the number (index) of the most probable class.
+
+
+
+[source, java]
+----
+double[] priorProbabilities = new double[] {.5, .5};
+
+double[][] thresholds = new double[][] {{.5}, {.5}, {.5}, {.5}, {.5}};
+
+CompoundNaiveBayesTrainer trainer = new CompoundNaiveBayesTrainer()
+  .withPriorProbabilities(priorProbabilities)
+  .withGaussianNaiveBayesTrainer(new GaussianNaiveBayesTrainer())
+  .withGaussianFeatureIdsToSkip(asList(3, 4, 5, 6, 7))
+  .withDiscreteNaiveBayesTrainer(new DiscreteNaiveBayesTrainer()
+                                 .setBucketThresholds(thresholds))
+  .withDiscreteFeatureIdsToSkip(asList(0, 1, 2));
+
+  CompoundNaiveBayesModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+The full example can be found https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/naivebayes/CompoundNaiveBayesExample.java[here].
+
diff --git a/docs/_docs/machine-learning/clustering/gaussian-mixture.adoc b/docs/_docs/machine-learning/clustering/gaussian-mixture.adoc
new file mode 100644
index 0000000..46b393e
--- /dev/null
+++ b/docs/_docs/machine-learning/clustering/gaussian-mixture.adoc
@@ -0,0 +1,57 @@
+= Gaussian mixture (GMM)
+
+A Gaussian mixture model is a probabilistic model that assumes all the data points are generated from a mixture of a finite number of Gaussian distributions with unknown parameters.
+
+NOTE: You could think of mixture models as generalizing k-means clustering to incorporate information about the covariance structure of the data as well as the centers of the latent Gaussians.
+
+== Model
+
+This algorithm represents a soft clustering model where each cluster is a Gaussian distribution with its own mean value and covariance matrix. Such a model can predict a cluster using the maximum likelihood principle.
+
+It defines the labels in the following way:
+
+
+[source, java]
+----
+GmmModel mdl = trainer.fit(
+    ignite,
+    dataCache,
+    vectorizer
+);
+
+double clusterLabel = mdl.predict(inputVector);
+----
+
+
+== Trainer
+
+
+GMM is an unsupervised learning algorithm. The `GmmTrainer` implements the expectation-maximization (EM) algorithm for fitting mixture-of-Gaussian models. It can compute the Bayesian Information Criterion to assess the number of clusters in the data.
+
+Presently, Ignite ML supports the following parameters for the GMM algorithm:
+
+* `maxCountOfClusters` - the maximum number of possible clusters
+* `maxCountOfIterations` - one stop criterion (the other one is epsilon)
+* `epsilon` - delta of convergence (delta between the old and new centroid values)
+* `countOfComponents` - the number of components
+* `maxLikelihoodDivergence` - the maximum divergence between the maximum likelihood of a vector in the dataset and the others, used for anomaly identification
+* `minElementsForNewCluster` - the minimum number of anomalies (in terms of maxLikelihoodDivergence) required to create a new cluster
+* `minClusterProbability` - the minimum cluster probability
+
+
+[source, java]
+----
+// Set up the trainer
+GmmTrainer trainer = new GmmTrainer(COUNT_OF_COMPONENTS);
+
+// Build the model
+GmmModel mdl = trainer
+    .withMaxCountIterations(MAX_COUNT_ITERATIONS)
+    .withMaxCountOfClusters(MAX_AMOUNT_OF_CLUSTERS)
+    .fit(ignite, dataCache, vectorizer);
+----
+
+== Example
+
+To see how GMM clustering can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/clustering/GmmClusterizationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
+
diff --git a/docs/_docs/machine-learning/clustering/introduction.adoc b/docs/_docs/machine-learning/clustering/introduction.adoc
new file mode 100644
index 0000000..701e81e
--- /dev/null
+++ b/docs/_docs/machine-learning/clustering/introduction.adoc
@@ -0,0 +1,8 @@
+= Introduction
+
+The Apache Ignite Machine Learning module provides K-Means and GMM algorithms to group the unlabeled data into clusters.
+
+All existing training algorithms presented in this section are designed to solve unsupervised (clustering) tasks:
+
+* K-Means Clustering
+* Gaussian mixture (GMM)
diff --git a/docs/_docs/machine-learning/clustering/k-means-clustering.adoc b/docs/_docs/machine-learning/clustering/k-means-clustering.adoc
new file mode 100644
index 0000000..5c75637
--- /dev/null
+++ b/docs/_docs/machine-learning/clustering/k-means-clustering.adoc
@@ -0,0 +1,66 @@
+= K-Means Clustering
+
+K-means is one of the most commonly used clustering algorithms that clusters the data points into a predefined number of clusters.
+
+== Model
+
+K-Means clustering aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster.
+
+The model holds a vector of k centers and one of the distance metrics provided by the ML framework, such as Euclidean, Hamming, or Manhattan.
+
+It creates the label as follows:
+
+
+
+[source, java]
+----
+KMeansModel mdl = trainer.fit(
+    ignite,
+    dataCache,
+    vectorizer
+);
+
+
+double clusterLabel = mdl.predict(inputVector);
+----
+
+== Trainer
+
+
+KMeans is an unsupervised learning algorithm. It solves a clustering task which is the task of grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).
+
+KMeans is a parametrized iterative algorithm which calculates the new means to be the centroids of the observations in the clusters on each iteration.
+
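+For intuition, a single K-Means iteration can be sketched in plain Java as follows (a simplified illustration, not the Ignite implementation):
+
+[source, java]
+----
+// One iteration: assign each point to its nearest centroid (squared Euclidean distance),
+// then recompute each centroid as the mean of the points assigned to it.
+static double[][] iterate(double[][] points, double[][] centroids) {
+    double[][] sums = new double[centroids.length][points[0].length];
+    int[] counts = new int[centroids.length];
+    for (double[] p : points) {
+        int nearest = 0;
+        double bestDist = Double.MAX_VALUE;
+        for (int c = 0; c < centroids.length; c++) {
+            double dist = 0;
+            for (int i = 0; i < p.length; i++)
+                dist += (p[i] - centroids[c][i]) * (p[i] - centroids[c][i]);
+            if (dist < bestDist) {
+                bestDist = dist;
+                nearest = c;
+            }
+        }
+        counts[nearest]++;
+        for (int i = 0; i < p.length; i++)
+            sums[nearest][i] += p[i];
+    }
+    for (int c = 0; c < centroids.length; c++) {
+        if (counts[c] == 0) {
+            sums[c] = centroids[c].clone(); // keep an empty cluster's centroid unchanged
+            continue;
+        }
+        for (int i = 0; i < sums[c].length; i++)
+            sums[c][i] /= counts[c];
+    }
+    return sums;
+}
+----
+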
+Presently, Ignite supports the following parameters for the KMeans clustering algorithm:
+
+* `k` - the number of possible clusters
+* `maxIterations` - one stop criterion (the other one is epsilon)
+* `epsilon` - delta of convergence (delta between the old and new centroid values)
+* `distance` - one of the distance metrics provided by the ML framework, such as Euclidean, Hamming or Manhattan
+* `seed` - one of the initialization parameters which helps to reproduce models (the trainer has a random initialization step to get the first centroids)
+
+
+[source, java]
+----
+// Set up the trainer
+KMeansTrainer trainer = new KMeansTrainer()
+   .withDistance(new EuclideanDistance())
+   .withK(AMOUNT_OF_CLUSTERS)
+   .withMaxIterations(MAX_ITERATIONS)
+   .withEpsilon(PRECISION);
+
+// Build the model
+KMeansModel mdl = trainer.fit(
+    ignite,
+    dataCache,
+    vectorizer
+);
+----
+
+
+== Example
+
+
+To see how K-Means clustering can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/clustering/KMeansClusterizationExample.java[example^] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+The training dataset is a subset of the Iris dataset (classes with labels 1 and 2, which form a linearly separable two-class dataset) that can be loaded from the https://archive.ics.uci.edu/ml/datasets/iris[UCI Machine Learning Repository].
diff --git a/docs/_docs/machine-learning/ensemble-methods/bagging.adoc b/docs/_docs/machine-learning/ensemble-methods/bagging.adoc
new file mode 100644
index 0000000..579439e
--- /dev/null
+++ b/docs/_docs/machine-learning/ensemble-methods/bagging.adoc
@@ -0,0 +1,42 @@
+= Bagging
+
+Bagging stands for bootstrap aggregation. One way to reduce the variance of an estimate is to average together multiple estimates. For example, we can train M different trees on different subsets of the data (chosen randomly with replacement) and compute the ensemble:
+
+image::images/bagging.png[]
+
+Bagging uses bootstrap sampling to obtain the data subsets for training the base learners. For aggregating the outputs of base learners, bagging uses voting for classification and averaging for regression.
+
+
+[source, java]
+----
+// Define the weak classifier.
+DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(5, 0);
+
+// Set up the bagging process.
+BaggedTrainer<Double> baggedTrainer = TrainerTransformers.makeBagged(
+  trainer, // Trainer for making bagged
+  10,      // Size of ensemble
+  0.6,     // Subsample ratio to whole dataset
+  4,       // Feature vector dimensionality
+  3,       // Feature subspace dimensionality
+  new OnMajorityPredictionsAggregator())
+  .withEnvironmentBuilder(LearningEnvironmentBuilder
+                          .defaultBuilder()
+                          .withRNGSeed(1)
+                         );
+
+// Train the Bagged Model.
+BaggedModel mdl = baggedTrainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+);
+----
+
+
+TIP: A commonly used class of ensemble algorithms is forests of randomized trees.
+
+== Example
+
+The full example can be found as a part of the Titanic tutorial https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/Step_10_Bagging.java[here].
+
diff --git a/docs/_docs/machine-learning/ensemble-methods/gradient-boosting.adoc b/docs/_docs/machine-learning/ensemble-methods/gradient-boosting.adoc
new file mode 100644
index 0000000..7f16166
--- /dev/null
+++ b/docs/_docs/machine-learning/ensemble-methods/gradient-boosting.adoc
@@ -0,0 +1,85 @@
+= Gradient Boosting
+
+In machine learning, boosting is an ensemble meta-algorithm for primarily reducing bias, and also variance in supervised learning, and a family of machine learning algorithms that convert weak learners to strong ones.
+
+[NOTE]
+====
+[discrete]
+=== Question posed by Kearns and Valiant (1988, 1989)
+"Can a set of weak learners create a single strong learner?"
+
+A weak learner is defined to be a classifier that is only slightly correlated with the true classification (it can label examples better than random guessing). In contrast, a strong learner is a classifier that is arbitrarily well-correlated with the true classification.
+====
+
+The affirmative answer, demonstrated by Robert Schapire in 1990, led to the development of the boosting technique.
+
+Boosting is represented in the Ignite ML library as Gradient Boosting (the most popular boosting implementation).
+
+== Overview
+
+
+Gradient boosting is a machine learning technique that produces a prediction model in the form of an https://en.wikipedia.org/wiki/Ensemble_learning[ensemble] of weak prediction models. A gradient boosting algorithm tries to solve the error minimization problem on learning samples in a functional space where each function is a model. Each model in this composition tries to predict the gradient of the error for points in the feature space, and these predictions are summed with some weight to produce the final prediction of the composition.
+
+In Ignite ML there are implementations of a general GDB algorithm and a GDB-on-trees algorithm. General GDB (GDBRegressionTrainer and GDBBinaryClassifierTrainer) allows any trainer for training each model in the composition. GDB on trees uses some optimizations specific to trees, such as indexes, to avoid sorting during the decision tree build phase.
+
+
+== Model
+
+In Apache Ignite ML, all implementations of the GDB algorithm use `GDBModel`, which wraps `ModelsComposition` to represent the composition of several models. `ModelsComposition` implements the common Model interface and can be used as follows:
+
+
+[source, java]
+----
+GDBModel model = ...;
+
+double prediction = model.predict(observation);
+----
+
+`GDBModel` uses `WeightedPredictionsAggregator` as the model answer reducer. This aggregator computes the answer of the meta-model as `result = bias + p1*w1 + p2*w2 + ...`, where:
+
+ * `pi` - the answer of the i-th model.
+ * `wi` - the weight of the i-th model in the composition.
+
+GDB uses the mean value of labels for the bias-parameter in the aggregator.
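+
+For illustration, the weighted aggregation performed by the reducer can be sketched as follows (plain Java, not the Ignite implementation):
+
+[source, java]
+----
+// result = bias + p1*w1 + p2*w2 + ..., where pi are model answers and wi their weights.
+static double aggregate(double bias, double[] predictions, double[] weights) {
+    double result = bias;
+    for (int i = 0; i < predictions.length; i++)
+        result += predictions[i] * weights[i];
+    return result;
+}
+----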
+
+== Trainer
+
+Training of GDB is represented by `GDBRegressionTrainer`, `GDBBinaryClassificationTrainer` and `GDBRegressionOnTreesTrainer`, `GDBBinaryClassificationOnTreesTrainer` for general GDB and GDB on trees respectively. All trainers have the following parameters:
+
+  * `gradStepSize` - sets the constant weight of each model in the composition; in future versions of Ignite ML this parameter may be computed dynamically.
+  * `cntOfIterations` - sets the maximum number of models in the composition after training.
+  * `checkConvergenceFactory` - sets the factory for constructing the convergence checker, which is used to prevent overfitting and avoid learning many useless models during training.
+
+For classifier trainers there is an additional parameter:
+
+  * `loss` - sets loss computer on some learning example from a training dataset.
+
+There are several factories for convergence checkers:
+
+  * `ConvergenceCheckerStubFactory` creates a checker that always returns false for the convergence check, so in this case the model composition will contain cntOfIterations models.
+  * `MeanAbsValueConvergenceCheckerFactory` creates a checker that computes the mean of the absolute gradient values over the examples in the dataset and returns true if it is less than the user-defined threshold.
+  * `MedianOfMedianConvergenceCheckerFactory` creates a checker that computes the median of the median absolute gradient values on each data partition. This method is less sensitive to anomalies in the learning dataset, but GDB may take longer to converge.
+
+Example of training:
+
+
+
+[source, java]
+----
+// Set up trainer
+GDBTrainer trainer = new GDBBinaryClassifierOnTreesTrainer(
+  learningRate, countOfIterations, new LogLoss()
+).withCheckConvergenceStgyFactory(new MedianOfMedianConvergenceCheckFactory(precision));
+
+// Build the model
+GDBModel mdl = trainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+);
+----
+
+
+== Example
+
+To see how GDB Classifier can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tree/boosting/GDBOnTreesClassificationTrainerExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/ensemble-methods/introduction.adoc b/docs/_docs/machine-learning/ensemble-methods/introduction.adoc
new file mode 100644
index 0000000..1bd662f
--- /dev/null
+++ b/docs/_docs/machine-learning/ensemble-methods/introduction.adoc
@@ -0,0 +1,11 @@
+= Introduction
+
+In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Typically, an ML ensemble consists of a concrete, finite set of alternative models.
+
+Ensemble methods are meta-algorithms that combine several machine learning techniques into one predictive model in order to decrease variance (bagging), bias (boosting), or improve predictions (stacking).
+
+The most popular ensemble models are supported in Apache Ignite ML:
+
+* Stacking
+* Boosting via GradientBoosting
+* Bagging (Bootstrap aggregating) and RandomForest as a special case
diff --git a/docs/_docs/machine-learning/ensemble-methods/random-forest.adoc b/docs/_docs/machine-learning/ensemble-methods/random-forest.adoc
new file mode 100644
index 0000000..9150f18
--- /dev/null
+++ b/docs/_docs/machine-learning/ensemble-methods/random-forest.adoc
@@ -0,0 +1,71 @@
+= Random Forest
+
+== Random Forest in Apache Ignite
+
+Random forest is an ensemble learning method for solving classification and regression problems. Random forest training builds a model composition (ensemble) of one type and uses an aggregation algorithm over the answers of the individual models. Each model is trained on a part of the training dataset; the part is defined according to the bagging and feature subspace methods. More information about these concepts may be found here: https://en.wikipedia.org/wiki/Random_forest, https://en.wikipedia.org [...]
+
+There are several implementations of aggregation algorithms in Apache Ignite ML:
+
+* `MeanValuePredictionsAggregator` - computes the answer of the random forest as the mean value of the predictions from all models in the given composition. This is often used for regression tasks.
+* `OnMajorityPredictionsAggregator` - takes the mode of the predictions from all models in the given composition. This can be useful for a classification task. NOTE: This aggregator supports multi-classification tasks.
+
+
+== Model
+
+The random forest algorithm is implemented in Ignite ML as a special case of a model composition with specific aggregators for different problems (`MeanValuePredictionsAggregator` for regression, `OnMajorityPredictionsAggregator` for classification).
+
+Here is an example of model usage:
+
+
+[source, java]
+----
+ModelsComposition randomForest = ...;
+
+double prediction = randomForest.apply(featuresVector);
+
+----
+
+
+== Trainer
+
+The random forest training algorithm is implemented with RandomForestRegressionTrainer and RandomForestClassifierTrainer trainers with the following parameters:
+
+`meta` - the features meta-information: a list of feature type descriptions such as:
+
+  * `featureId` - index in features vector.
+  * `isCategoricalFeature` - flag having true value if a feature is categorical.
+  * `featureName`.
+
+This meta-information is important for the random forest training algorithm because it builds feature histograms, and categorical features should be represented in the histograms for all feature values:
+
+  * `featuresCountSelectionStrgy` - sets strategy defining count of random features for learning one tree. There are several strategies: SQRT, LOG2, ALL and ONE_THIRD strategies implemented in the FeaturesCountSelectionStrategies class.
+  * `maxDepth` - sets the maximum tree depth.
+  * `minImpurityDelta` - a node in a decision tree is split into two nodes if the impurity values on these two nodes are less than the unsplit node's minImpurityDecrease value.
+  * `subSampleSize` - value lying in the [0; MAX_DOUBLE]-interval. This parameter defines the count of sample repetitions in uniformly sampling with replacement.
+  * `seed` - seed value used in random generators.
+
+Random forest training may be used as follows:
+
+
+[source, java]
+----
+RandomForestClassifierTrainer trainer = new RandomForestClassifierTrainer(featuresMeta)
+  .withCountOfTrees(101)
+  .withFeaturesCountSelectionStrgy(FeaturesCountSelectionStrategies.ONE_THIRD)
+  .withMaxDepth(4)
+  .withMinImpurityDelta(0.)
+  .withSubSampleSize(0.3)
+  .withSeed(0);
+
+ModelsComposition rfModel = trainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+);
+----
+
+
+
+== Example
+
+To see how the Random Forest Classifier can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tree/randomforest/RandomForestClassificationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution. In this example, the Wine recognition dataset was used. The description of this dataset and the data are available from the https://archive.ics.uci.edu/ml/datasets/wine[UCI Machine Learning Repository].
diff --git a/docs/_docs/machine-learning/ensemble-methods/stacking.adoc b/docs/_docs/machine-learning/ensemble-methods/stacking.adoc
new file mode 100644
index 0000000..d4d23d7
--- /dev/null
+++ b/docs/_docs/machine-learning/ensemble-methods/stacking.adoc
@@ -0,0 +1,35 @@
+= Stacking
+
+Stacking (sometimes called stacked generalization) involves training a learning algorithm to combine the predictions of several other learning algorithms.
+
+First, all of the other algorithms are trained using the available data, then a combiner algorithm is trained to make a final prediction using all the predictions of the other algorithms as additional inputs. If an arbitrary combiner algorithm is used, then stacking can theoretically represent any of the widely known ensemble techniques, although, in practice, a logistic regression model is often used as the combiner like in the example below.
+
+
+[source, java]
+----
+DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(5, 0);
+DecisionTreeClassificationTrainer trainer1 = new DecisionTreeClassificationTrainer(3, 0);
+DecisionTreeClassificationTrainer trainer2 = new DecisionTreeClassificationTrainer(4, 0);
+
+LogisticRegressionSGDTrainer aggregator = new LogisticRegressionSGDTrainer()
+  .withUpdatesStgy(new UpdatesStrategy<>(new SimpleGDUpdateCalculator(0.2),
+                                         SimpleGDParameterUpdate.SUM_LOCAL,
+                                         SimpleGDParameterUpdate.AVG));
+
+StackedModel<Vector, Vector, Double, LogisticRegressionModel> mdl = new StackedVectorDatasetTrainer<>(aggregator)
+  .addTrainerWithDoubleOutput(trainer)
+  .addTrainerWithDoubleOutput(trainer1)
+  .addTrainerWithDoubleOutput(trainer2)
+  .fit(ignite,
+       dataCache,
+       vectorizer
+      );
+
+----
+
+NOTE: The Evaluator works well with the StackedModel
+
+
+== Example
+
+The full example can be found as a part of the Titanic tutorial https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/Step_9_Scaling_With_Stacking.java[here].
diff --git a/docs/_docs/machine-learning/importing-model/introduction.adoc b/docs/_docs/machine-learning/importing-model/introduction.adoc
new file mode 100644
index 0000000..3e277b4
--- /dev/null
+++ b/docs/_docs/machine-learning/importing-model/introduction.adoc
@@ -0,0 +1,12 @@
+= Introduction
+
+Since version 2.8, Apache Ignite supports importing Machine Learning models from external platforms, including Apache Spark ML and XGBoost. By working with imported models, you can:
+
+- store imported models in Ignite for further inference,
+- use imported models as part of pipelines,
+- apply ensembling methods such as boosting, bagging, or stacking to those models.
+
+Also, imported pre-trained models can be updated inside Apache Ignite.
+
+Apache Ignite provides an API for distributed inference for models trained in Apache Spark ML, XGBoost, and H2O.
+
diff --git a/docs/_docs/machine-learning/importing-model/model-import-from-apache-spark.adoc b/docs/_docs/machine-learning/importing-model/model-import-from-apache-spark.adoc
new file mode 100644
index 0000000..2a9bdbb
--- /dev/null
+++ b/docs/_docs/machine-learning/importing-model/model-import-from-apache-spark.adoc
@@ -0,0 +1,70 @@
+= Import Model from Apache Spark
+
+Starting with Ignite 2.8,  it's possible to import the following models of Apache Spark ML:
+
+- Logistic regression (`org.apache.spark.ml.classification.LogisticRegressionModel`)
+- Linear regression (`org.apache.spark.ml.regression.LinearRegressionModel`)
+- Decision tree (`org.apache.spark.ml.classification.DecisionTreeClassificationModel`)
+- Support Vector Machine (`org.apache.spark.ml.classification.LinearSVCModel`)
+- Random forest (`org.apache.spark.ml.classification.RandomForestClassificationModel`)
+- K-Means (`org.apache.spark.ml.clustering.KMeansModel`)
+- Decision tree regression (`org.apache.spark.ml.regression.DecisionTreeRegressionModel`)
+- Random forest regression (`org.apache.spark.ml.regression.RandomForestRegressionModel`)
+- Gradient boosted trees regression (`org.apache.spark.ml.regression.GBTRegressionModel`)
+- Gradient boosted trees (`org.apache.spark.ml.classification.GBTClassificationModel`)
+
+This feature works with models saved in _snappy.parquet_ files.
+
+Supported and tested Spark version: 2.3.0.
+It may also work with Spark versions 2.1, 2.2, 2.3 and 2.4.
+
+To get a model from Spark ML, save the model built as a result of training in Spark ML to a parquet file, as in the example below:
+
+
+[source, scala]
+----
+val spark: SparkSession = TitanicUtils.getSparkSession
+
+val passengers = TitanicUtils.readPassengersWithCasting(spark)
+    .select("survived", "pclass", "sibsp", "parch", "sex", "embarked", "age")
+
+// Step - 1: Make Vectors from dataframe's columns using special VectorAssembler
+val assembler = new VectorAssembler()
+    .setInputCols(Array("pclass", "sibsp", "parch", "survived"))
+    .setOutputCol("features")
+
+// Step - 2: Transform dataframe to vectorized dataframe with dropping rows
+val output = assembler.transform(
+    passengers.na.drop(Array("pclass", "sibsp", "parch", "survived", "age"))
+).select("features", "age")
+
+
+val lr = new LinearRegression()
+    .setMaxIter(100)
+    .setRegParam(0.1)
+    .setElasticNetParam(0.1)
+    .setLabelCol("age")
+    .setFeaturesCol("features")
+
+// Fit the model
+val model = lr.fit(output)
+model.write.overwrite().save("/home/models/titanic/linreg")
+----
+
+
+To load the model into Ignite ML, use the `SparkModelParser` class via its `parse()` method:
+
+
+[source, java]
+----
+DecisionTreeNode mdl = (DecisionTreeNode)SparkModelParser.parse(
+   SPARK_MDL_PATH,
+   SupportedSparkModels.DECISION_TREE
+);
+----
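+
+Once parsed, the model can be used for prediction like any other Ignite ML model. Below is a minimal sketch; the observation vector and its values are hypothetical and must match the feature layout the Spark model was trained on:
+
+[source, java]
+----
+// A hypothetical observation with the same features (in the same order) the Spark model expects.
+Vector observation = VectorUtils.of(3.0, 1.0, 0.0);
+
+double prediction = mdl.predict(observation);
+----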
+
+You can see more examples of using this API in the examples module in the package: `org.apache.ignite.examples.ml.inference.spark.modelparser`
+
+NOTE: Loading from a Spark PipelineModel is not supported.
+Intermediate feature transformers from Spark are also not supported, due to the different nature of preprocessing on the Ignite and Spark sides.
+
diff --git a/docs/_docs/machine-learning/importing-model/model-import-from-gxboost.adoc b/docs/_docs/machine-learning/importing-model/model-import-from-gxboost.adoc
new file mode 100644
index 0000000..99def27
--- /dev/null
+++ b/docs/_docs/machine-learning/importing-model/model-import-from-gxboost.adoc
@@ -0,0 +1,21 @@
+= Import Model from XGBoost
+
+Using Apache Ignite you can import pre-trained models from XGBoost. The models are translated into Apache Ignite ML models. Apache Ignite ML also provides the ability to import pre-trained XGBoost models for local or distributed inference.
+
+The difference between translating the model into an Apache Ignite ML model and performing distributed inference is in the parser implementation. This example shows how you can import a model from XGBoost and translate it to an Apache Ignite ML model for distributed inference:
+
+
+[source, java]
+----
+File mdlRsrc = IgniteUtils.resolveIgnitePath(TEST_MODEL_RES);
+
+ModelReader reader = new FileSystemModelReader(mdlRsrc.getPath());
+
+XGModelParser parser = new XGModelParser();
+
+AsyncModelBuilder mdlBuilder = new IgniteDistributedModelBuilder(ignite, 4, 4);
+
+Model<NamedVector, Future<Double>> mdl = mdlBuilder.build(reader, parser);
+
+----
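+
+Once built, the distributed model performs inference asynchronously. Below is a minimal usage sketch; the feature names and values are hypothetical and must match those used when training the XGBoost model, and the `VectorUtils.of(Map)` named-vector factory is assumed to be available in your Ignite version:
+
+[source, java]
+----
+// Hypothetical named features; the keys must match the feature names of the trained XGBoost model.
+Map<String, Double> observation = new HashMap<>();
+observation.put("f_0", 1.0);
+observation.put("f_1", 2.0);
+
+// Inference is asynchronous: the returned Future resolves to the predicted value.
+double prediction = mdl.predict(VectorUtils.of(observation)).get();
+----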
+
diff --git a/docs/_docs/machine-learning/machine-learning.adoc b/docs/_docs/machine-learning/machine-learning.adoc
new file mode 100644
index 0000000..c7790a4
--- /dev/null
+++ b/docs/_docs/machine-learning/machine-learning.adoc
@@ -0,0 +1,125 @@
+= Machine Learning
+
+== Overview
+
+Apache Ignite Machine Learning (ML) is a set of simple, scalable and efficient tools that allow the building of predictive Machine Learning models without costly data transfers.
+
+The rationale for adding machine and deep learning (DL) to Apache Ignite is quite simple. Today's data scientists have to deal with two major factors that keep ML from mainstream adoption:
+
+* First, the models are trained and deployed (after the training is over) in different systems. The data scientists have to wait for ETL or some other data transfer process to move the data into a system like Apache Mahout or Apache Spark for a training purpose. Then they have to wait while this process completes and redeploy the models in a production environment. The whole process can take hours moving terabytes of data from one system to another. Moreover, the training part usually ha [...]
+
+* The second factor is related to scalability. The data sets that ML and DL algorithms have to process no longer fit within a single server unit and are constantly growing. This urges data scientists to come up with sophisticated solutions or turn to distributed computing platforms such as Apache Spark and TensorFlow. However, those platforms mostly solve only a part of the puzzle, which is the model training, leaving it a burden for developers to decide how to deploy the models in pr [...]
+
+
+image::images/machine_learning.png[]
+
+
+=== Zero ETL and Massive Scalability
+
+Ignite Machine Learning relies on Ignite's memory-centric storage that brings massive scalability for ML and DL tasks and eliminates the wait imposed by ETL between the different systems. For instance, it allows users to run ML/DL training and inference directly on data stored across memory and disk in an Ignite cluster. Next, Ignite provides a host of ML and DL algorithms that are optimized for Ignite's collocated distributed processing. These implementations deliver in-memory speed and [...]
+
+
+=== Fault Tolerance and Continuous Learning
+
+Apache Ignite Machine Learning is tolerant to node failures. This means that in the case of node failures during the learning process, all recovery procedures will be transparent to the user, learning processes won't be interrupted, and results will be obtained in a time similar to the case when all nodes work correctly. For more information please see link:machine-learning/partition-based-dataset[Partition Based Dataset].
+
+
+== Algorithms and Applicability
+
+=== Classification
+
+Identifying to which category a new observation belongs, on the basis of a training set.
+
+*Applicability:* spam detection, image recognition, credit scoring, disease identification.
+
+*Algorithms:* link:machine-learning/binary-classification/logistic-regression[Logistic Regression], link:machine-learning/binary-classification/linear-svm[Linear SVM (Support Vector Machine)], link:machine-learning/binary-classification/knn-classification[k-NN Classification], link:machine-learning/binary-classification/naive-bayes[Naive Bayes], link:machine-learning/binary-classification/decision-trees[Decision Trees], link:machine-learning/binary-classification/random-forest[Random For [...]
+
+
+=== Regression
+
+Modeling the relationship between a scalar dependent variable (y) and one or more explanatory variables or independent variables (x).
+
+
+*Applicability:* drug response, stock prices, supermarket revenue.
+
+*Algorithms:* Linear Regression, Decision Trees Regression, k-NN Regression.
+
+=== Clustering
+
+Grouping a set of objects in such a way that objects in the same group (called a cluster) are more similar (in some sense) to each other than to those in other groups (clusters).
+
+*Applicability:* customer segmentation, grouping experiment outcomes, grouping of shopping items.
+
+*Algorithms:* K-Means Clustering, Gaussian mixture (GMM).
+
+=== Recommendation
+
+Building a recommendation system, which is a subclass of information filtering systems that seeks to predict the "rating" or "preference" a user would give to an item.
+
+*Applicability:*  playlist generators for video and music services, product recommenders for services
+
+*Algorithms:* link:machine-learning/recommendation-systems[Matrix Factorization].
+
+=== Preprocessing
+
+Feature extraction and normalization.
+
+*Applicability:* transform input data such as text for use with machine learning algorithms, to extract features we need to fit on, to normalize input data.
+
+*Algorithms:* Apache Ignite ML supports custom preprocessing using partition based dataset capabilities and has default link:machine-learning/preprocessing[preprocessors] such as normalization preprocessor, one-hot-encoder, min-max scaler and so on.
+
+
+== Getting Started
+
+The fastest way to get started with the Machine Learning is to build and run existing examples, study their output and keep coding. The ML examples are located in the https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml[examples] folder of every Apache Ignite distribution.
+
+Follow the steps below to try out the examples:
+
+. Download Apache Ignite version 2.8 or later.
+. Open the `examples` project in an IDE, such as IntelliJ IDEA or Eclipse.
+. Go to the `src/main/java/org/apache/ignite/examples/ml` folder in the IDE and run an ML example.
+
+The examples do not require any special configuration. All ML  examples will launch, run and stop successfully without any user intervention and provide meaningful output on the console. Additionally, the Tracer API example will launch a web browser and generate HTML output.
+
+=== Get it With Maven
+
+Add the Maven dependency below to your project in order to include the ML functionality provided by Ignite:
+
+[source, xml]
+----
+<dependency>
+    <groupId>org.apache.ignite</groupId>
+    <artifactId>ignite-ml</artifactId>
+    <version>${ignite.version}</version>
+</dependency>
+
+----
+
+
+Replace `${ignite.version}` with an actual Ignite version.
+
+=== Build From Sources
+
+The latest Apache Ignite Machine Learning jar is always uploaded to the Maven repository. If you need to take the jar and deploy it in a custom environment, then it can be either downloaded from Maven or built from scratch. To build the Machine Learning component from sources:
+
+1. Download the latest Apache Ignite source release.
+2. Clean the local Maven repository (this is to ensure that older Maven builds don’t impact the build).
+3. Build and install Apache Ignite from the project's root directory:
++
+[source, shell]
+----
+mvn clean install -DskipTests -Dmaven.javadoc.skip=true
+----
+
+4. Locate the Machine Learning jar in your local Maven repository under the path `{user_dir}/.m2/repository/org/apache/ignite/ignite-ml/{ignite-version}/ignite-ml-{ignite-version}.jar`.
+
+5. If you want to build ML or DL examples from sources, execute the following commands:
++
+[source, shell]
+----
+cd examples
+mvn clean package -DskipTests
+----
+
+
+If needed, refer to `DEVNOTES.txt` in the project's root folder and the `README` files in the `ignite-ml` component for more details.
diff --git a/docs/_docs/machine-learning/model-selection/cross-validation.adoc b/docs/_docs/machine-learning/model-selection/cross-validation.adoc
new file mode 100644
index 0000000..4d64895
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/cross-validation.adoc
@@ -0,0 +1,76 @@
+= Cross-Validation
+
+Cross-validation functionality in Apache Ignite is represented by the `CrossValidation` class. This is a calculator parameterized by the model type, the label type and the key-value types of the data. After instantiation (the constructor doesn't accept any additional parameters), we can use a score method to perform cross-validation.
+
+Let's imagine that we have a trainer and a training set, and we want to perform cross-validation using accuracy as the metric and 4 folds. Apache Ignite allows us to do this as shown in the following example:
+
+
+== Cross-Validation (without Pipeline API usage)
+
+[source, java]
+----
+// Create classification trainer
+DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(4, 0);
+
+// Create cross-validation instance
+CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator
+  = new CrossValidation<>();
+
+// Set up the cross-validation process
+scoreCalculator
+    .withIgnite(ignite)
+    .withUpstreamCache(trainingSet)
+    .withTrainer(trainer)
+    .withMetric(MetricName.ACCURACY)
+    .withPreprocessor(vectorizer)
+    .withAmountOfFolds(4)
+    .isRunningOnPipeline(false);
+
+// Calculate accuracy for each fold
+double[] accuracyScores = scoreCalculator.scoreByFolds();
+----
+
+In this example, we specify the trainer and metric as parameters, then pass common training arguments such as a reference to the Ignite instance, the cache and the vectorizer, and finally specify the number of folds. The `scoreByFolds()` method returns an array containing the chosen metric calculated for each fold of the training set.
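+
+The returned scores can then be aggregated on the client side, for example to get the mean accuracy across folds (a minimal sketch in plain Java, reusing `accuracyScores` from the snippet above):
+
+[source, java]
+----
+// Average the per-fold accuracy values returned by scoreByFolds().
+double meanAccuracy = Arrays.stream(accuracyScores)
+    .average()
+    .orElse(Double.NaN);
+
+System.out.println("Mean accuracy over 4 folds: " + meanAccuracy);
+----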
+
+== Cross-Validation (with Pipeline API usage)
+
+Define the pipeline and pass it as a parameter to the cross-validation instance to run cross-validation on the pipeline.
+
+CAUTION: The Pipeline API is experimental and could be changed in the next releases.
+
+
+[source, java]
+----
+// Create classification trainer
+DecisionTreeClassificationTrainer trainer = new DecisionTreeClassificationTrainer(4, 0);
+
+Pipeline<Integer, Vector, Integer, Double> pipeline
+  = new Pipeline<Integer, Vector, Integer, Double>()
+    .addVectorizer(vectorizer)
+    .addPreprocessingTrainer(new ImputerTrainer<Integer, Vector>())
+    .addPreprocessingTrainer(new MinMaxScalerTrainer<Integer, Vector>())
+    .addTrainer(trainer);
+
+
+// Create cross-validation instance
+CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator
+  = new CrossValidation<>();
+
+// Set up the cross-validation process
+scoreCalculator
+    .withIgnite(ignite)
+    .withUpstreamCache(trainingSet)
+    .withPipeline(pipeline)
+    .withMetric(MetricName.ACCURACY)
+    .withPreprocessor(vectorizer)
+    .withAmountOfFolds(4)
+    .isRunningOnPipeline(true);
+
+// Calculate accuracy for each fold
+double[] accuracyScores = scoreCalculator.scoreByFolds();
+----
+
+
+== Example
+
+To see how the Cross Validation can be used in practice, try https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/selection/cv/CrossValidationExample.java[this example] and see step https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/Step_8_CV_with_Param_Grid_and_pipeline.java[8 of ML Tutorial] that are available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/model-selection/evaluator.adoc b/docs/_docs/machine-learning/model-selection/evaluator.adoc
new file mode 100644
index 0000000..7a2bb2f
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/evaluator.adoc
@@ -0,0 +1,93 @@
+= Evaluator
+
+Apache Ignite ML comes with a number of machine learning algorithms that can be used to learn from and make predictions on data. When these algorithms are applied to build machine learning models, there is a need to evaluate the performance of the model on some criteria, which depends on the application and its requirements. Apache Ignite ML also provides a suite of classification and regression metrics for the purpose of evaluating the performance of machine learning models.
+
+== Classification model evaluation
+
+While there are many different types of classification algorithms, the evaluation of classification models all share similar principles. In a supervised classification problem, there exists a true output and a model-generated predicted output for each data point. For this reason, the results for each data point can be assigned to one of four categories:
+
+* True Positive (TP) - label is positive and prediction is also positive
+* True Negative (TN) - label is negative and prediction is also negative
+* False Positive (FP) - label is negative but prediction is positive
+* False Negative (FN) - label is positive but prediction is negative
+
+These metrics are especially important for binary classification.
+
+CAUTION: Multiclass classification evaluation is not supported yet in Apache Ignite ML.
+
+The full list of binary classification metrics supported in Apache Ignite ML is as follows:
+
+* Accuracy
+* Balanced accuracy
+* F-Measure
+* FallOut
+* FN
+* FP
+* FDR
+* MissRate
+* NPV
+* Precision
+* Recall
+* Specificity
+* TN
+* TP
+
+The explanation and formulas for these metrics can be found https://en.wikipedia.org/wiki/Evaluation_of_binary_classifiers[here].
+
+
+[source, java]
+----
+// Define the vectorizer.
+Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
+   .labeled(Vectorizer.LabelCoordinate.FIRST);
+
+// Define the trainer.
+SVMLinearClassificationTrainer trainer = new SVMLinearClassificationTrainer();
+
+// Train the model.
+SVMLinearClassificationModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Calculate all classification metrics.
+EvaluationResult res = Evaluator
+  .evaluateBinaryClassification(dataCache, mdl, vectorizer);
+
+double accuracy = res.get(MetricName.ACCURACY);
+----
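+
+The same `EvaluationResult` can be queried for the other metrics from the list above. The exact constants used below (for example `PRECISION`, `RECALL`, `F_MEASURE`) are an assumption and should be checked against the `MetricName` enum in your Ignite version:
+
+[source, java]
+----
+// Retrieve additional metrics from the same evaluation result.
+double precision = res.get(MetricName.PRECISION);
+double recall = res.get(MetricName.RECALL);
+double fMeasure = res.get(MetricName.F_MEASURE);
+----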
+
+
+== Regression model evaluation
+
+Regression analysis is used when predicting a continuous output variable from a number of independent variables.
+
+The full list of regression metrics supported in Apache Ignite ML is as follows:
+
+* MAE
+* R2
+* RMSE
+* RSS
+* MSE
+
+
+[source, java]
+----
+// Define the vectorizer.
+Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
+   .labeled(Vectorizer.LabelCoordinate.FIRST);
+
+// Define the trainer.
+KNNRegressionTrainer trainer = new KNNRegressionTrainer()
+    .withK(5)
+    .withDistanceMeasure(new ManhattanDistance())
+    .withIdxType(SpatialIndexType.BALL_TREE)
+    .withWeighted(true);
+
+// Train the model.
+KNNRegressionModel knnMdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Calculate all regression metrics.
+EvaluationResult res = Evaluator
+  .evaluateRegression(dataCache, knnMdl, vectorizer);
+
+double mse = res.get(MetricName.MSE);
+----
+
diff --git a/docs/_docs/machine-learning/model-selection/hyper-parameter-tuning.adoc b/docs/_docs/machine-learning/model-selection/hyper-parameter-tuning.adoc
new file mode 100644
index 0000000..089d79b
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/hyper-parameter-tuning.adoc
@@ -0,0 +1,51 @@
+= Hyper-parameter tuning
+
+In machine learning, hyperparameter optimization or tuning is the problem of choosing a set of optimal hyperparameters for a learning algorithm. A hyperparameter is a parameter whose value is used to control the learning process. By contrast, the values of other parameters (typically node weights) are learned.
+
+In Apache Ignite ML, you can tune a model by changing its hyper-parameters (both preprocessor and trainer hyper-parameters).
+
+The main object that holds all the possible values of the hyper-parameters is the `ParamGrid` object.
+
+
+[source, java]
+----
+DecisionTreeClassificationTrainer trainerCV = new DecisionTreeClassificationTrainer();
+
+ParamGrid paramGrid = new ParamGrid()
+    .addHyperParam("maxDeep", trainerCV::withMaxDeep,
+                   new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 10.0})
+    .addHyperParam("minImpurityDecrease", trainerCV::withMinImpurityDecrease,
+                   new Double[] {0.0, 0.25, 0.5});
+----
+
+There are a few approaches to find the optimal set of hyper-parameters:
+
+* *BruteForce (GridSearch)* - The traditional way of performing hyperparameter optimization has been grid search, or a parameter sweep, which is simply an exhaustive search through a manually specified subset of the hyperparameter space of a learning algorithm.
+* *Random search* - It replaces the exhaustive enumeration of all combinations by selecting them randomly.
+* *Evolutionary optimization* - Evolutionary optimization is a methodology for the global optimization of noisy black-box functions. In hyperparameter optimization, evolutionary optimization uses evolutionary algorithms to search the space of hyperparameters for a given algorithm.
+
+The Random Search `ParamGrid` can be set up as follows:
+
+
+[source, java]
+----
+ParamGrid paramGrid = new ParamGrid()
+    .withParameterSearchStrategy(
+         new RandomStrategy()
+             .withMaxTries(10)
+             .withSeed(12L))
+    .addHyperParam("p", normalizationTrainer::withP,
+                   new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0})
+    .addHyperParam("maxDeep", trainerCV::withMaxDeep,
+                   new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0})
+    .addHyperParam("minImpurityDecrease", trainerCV::withMinImpurityDecrease,
+                   new Double[] {0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 1.0});
+----
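+
+Once configured, the `ParamGrid` is passed to a `CrossValidation` instance, which tries the parameter combinations according to the chosen search strategy. Below is a minimal sketch reusing `trainerCV` and the first `ParamGrid` shown above; `ignite`, `dataCache` and `vectorizer` are assumed to be defined as in the Cross-Validation and Pipelines API sections:
+
+[source, java]
+----
+CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator = new CrossValidation<>();
+
+scoreCalculator
+  .withIgnite(ignite)
+  .withUpstreamCache(dataCache)
+  .withTrainer(trainerCV)
+  .withMetric(MetricName.ACCURACY)
+  .withPreprocessor(vectorizer)
+  .withAmountOfFolds(3)
+  .isRunningOnPipeline(false)
+  .withParamGrid(paramGrid);
+
+// Runs training for every hyper-parameter combination and collects the scores.
+CrossValidationResult crossValidationRes = scoreCalculator.tuneHyperParameters();
+----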
+
+
+[TIP]
+====
+Performance Tip:
+
+The GridSearch (BruteForce) and Evolutionary optimization methods can be easily parallelized because all training runs are independent of each other.
+====
diff --git a/docs/_docs/machine-learning/model-selection/introduction.adoc b/docs/_docs/machine-learning/model-selection/introduction.adoc
new file mode 100644
index 0000000..0b28f59
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/introduction.adoc
@@ -0,0 +1,18 @@
+= Introduction
+
+This section describes how to use Ignite ML for tuning ML algorithms and link:machine-learning/model-selection/pipeline-api[Pipelines]. Built-in Cross-Validation and other tooling allow users to optimize link:machine-learning/model-selection/hyper-parameter-tuning[hyper-parameters] in algorithms and Pipelines.
+
+Model selection is a set of tools that provides the ability to prepare and link:machine-learning/model-selection/evaluator[evaluate] models efficiently. Use it to link:machine-learning/model-selection/split-the-dataset-on-test-and-train-datasets[split] data into training and test sets as well as to perform cross-validation.
+
+
+== Overview
+
+It is not good practice to learn the parameters of a prediction function and validate it on the same data. This leads to overfitting. To avoid this problem, one of the most efficient solutions is to save part of the training data as a validation set. However, by partitioning the available data and excluding one or more parts from the training set, we significantly reduce the number of samples which can be used for learning the model and the results can depend on a particular random choic [...]
+
+A solution to this problem is a procedure called link:machine-learning/model-selection/cross-validation[Cross-Validation]. In the basic approach, called k-fold CV, the training set is split into k smaller sets (folds) and then the following procedure is applied: a model is trained using k-1 of the folds as training data, and the resulting model is validated on the remaining fold (it is used as a test set to compute metrics such as accuracy).
+
+Apache Ignite provides cross-validation functionality that allows you to parameterize the trainer to be validated, the metrics to be calculated for the model trained on each step, and the number of folds the training data should be split into.
+
+
+
+
diff --git a/docs/_docs/machine-learning/model-selection/pipeline-api.adoc b/docs/_docs/machine-learning/model-selection/pipeline-api.adoc
new file mode 100644
index 0000000..55728cb
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/pipeline-api.adoc
@@ -0,0 +1,111 @@
+= Pipelines API
+
+Apache Ignite ML standardizes APIs for machine learning algorithms to make it easier to combine multiple algorithms into a single pipeline, or workflow. This section covers the key concepts introduced by the Pipelines API, where the pipeline concept is mostly inspired by the scikit-learn and Apache Spark projects.
+
+* **Preprocessor Model** - This is an algorithm which can transform one DataSet into another DataSet.
+
+* **Preprocessor Trainer** - This is an algorithm which can be fit on a DataSet to produce a PreprocessorModel.
+
+* **Pipeline** - A Pipeline chains multiple Trainers and Preprocessors together to specify an ML workflow.
+
+* **Parameter** - All ML Trainers and Preprocessor Trainers now share a common API for specifying parameters.
+
+CAUTION: The Pipeline API is experimental and could be changed in the next releases.
+
+
+A Pipeline can replace chains of `fit()` method calls, as in the examples below:
+
+
+[tabs]
+--
+tab:Without Pipeline API[]
+
+[source, java]
+----
+final Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>(0, 3, 4, 5, 6, 8, 10).labeled(1);
+
+TrainTestSplit<Integer, Vector> split = new TrainTestDatasetSplitter<Integer, Vector>()
+  .split(0.75);
+
+Preprocessor<Integer, Vector> imputingPreprocessor = new ImputerTrainer<Integer, Vector>()
+  .fit(ignite,
+       dataCache,
+       vectorizer
+      );
+
+Preprocessor<Integer, Vector> minMaxScalerPreprocessor = new MinMaxScalerTrainer<Integer, Vector>()
+  .fit(ignite,
+       dataCache,
+       imputingPreprocessor
+      );
+
+Preprocessor<Integer, Vector> normalizationPreprocessor = new NormalizationTrainer<Integer, Vector>()
+  .withP(1)
+  .fit(ignite,
+       dataCache,
+       minMaxScalerPreprocessor
+      );
+
+// Tune hyper-parameters with K-fold Cross-Validation on the split training set.
+
+DecisionTreeClassificationTrainer trainerCV = new DecisionTreeClassificationTrainer();
+
+CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator = new CrossValidation<>();
+
+ParamGrid paramGrid = new ParamGrid()
+  .addHyperParam("maxDeep", trainerCV::withMaxDeep, new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 10.0})
+  .addHyperParam("minImpurityDecrease", trainerCV::withMinImpurityDecrease, new Double[] {0.0, 0.25, 0.5});
+
+scoreCalculator
+  .withIgnite(ignite)
+  .withUpstreamCache(dataCache)
+  .withTrainer(trainerCV)
+  .withMetric(MetricName.ACCURACY)
+  .withFilter(split.getTrainFilter())
+  .isRunningOnPipeline(false)
+  .withPreprocessor(normalizationPreprocessor)
+  .withAmountOfFolds(3)
+  .withParamGrid(paramGrid);
+
+CrossValidationResult crossValidationRes = scoreCalculator.tuneHyperParameters();
+----
+
+tab:With Pipeline API[]
+
+[source, java]
+----
+final Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>(0, 4, 5, 6, 8).labeled(1);
+
+TrainTestSplit<Integer, Vector> split = new TrainTestDatasetSplitter<Integer, Vector>()
+  .split(0.75);
+
+DecisionTreeClassificationTrainer trainerCV = new DecisionTreeClassificationTrainer();
+
+Pipeline<Integer, Vector, Integer, Double> pipeline = new Pipeline<Integer, Vector, Integer, Double>()
+  .addVectorizer(vectorizer)
+  .addPreprocessingTrainer(new ImputerTrainer<Integer, Vector>())
+  .addPreprocessingTrainer(new MinMaxScalerTrainer<Integer, Vector>())
+  .addTrainer(trainerCV);
+
+CrossValidation<DecisionTreeNode, Integer, Vector> scoreCalculator = new CrossValidation<>();
+
+ParamGrid paramGrid = new ParamGrid()
+  .addHyperParam("maxDeep", trainerCV::withMaxDeep, new Double[] {1.0, 2.0, 3.0, 4.0, 5.0, 10.0})
+  .addHyperParam("minImpurityDecrease", trainerCV::withMinImpurityDecrease, new Double[] {0.0, 0.25, 0.5});
+
+scoreCalculator
+  .withIgnite(ignite)
+  .withUpstreamCache(dataCache)
+  .withPipeline(pipeline)
+  .withMetric(MetricName.ACCURACY)
+  .withFilter(split.getTrainFilter())
+  .withAmountOfFolds(3)
+  .withParamGrid(paramGrid);
+
+
+CrossValidationResult crossValidationRes = scoreCalculator.tuneHyperParameters();
+----
+--
+
+The full code can be found in the https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tutorial/Step_8_CV_with_Param_Grid_and_pipeline.java[Titanic tutorial].
+
diff --git a/docs/_docs/machine-learning/model-selection/split-the-dataset-on-test-and-train-datasets.adoc b/docs/_docs/machine-learning/model-selection/split-the-dataset-on-test-and-train-datasets.adoc
new file mode 100644
index 0000000..aae2b84
--- /dev/null
+++ b/docs/_docs/machine-learning/model-selection/split-the-dataset-on-test-and-train-datasets.adoc
@@ -0,0 +1,52 @@
+= Split the dataset on test and train datasets
+
+Data splitting is meant to split the data stored in a cache into two parts: the training part that is used to train the model, and the test part that is used to estimate the model quality.
+
+All `fit()` methods have a special parameter that allows passing a filter condition to the cache.
+
+[NOTE]
+====
+Due to the distributed and lazy nature of dataset operations, the dataset split is a lazy operation too and can be defined as a filter condition applied to the initial cache to form both the train and the test datasets.
+====
+
+In the example below, the model is trained on only 75% of the initial dataset. The filter parameter value is the result of `split.getTrainFilter()`, which accepts or rejects a row from the initial dataset for use during training.
+
+
+[source, java]
+----
+// Define the cache.
+IgniteCache<Integer, Vector> dataCache = ...;
+
+// Define the percentage of the train sub-set of the initial dataset.
+TrainTestSplit<Integer, Vector> split = new TrainTestDatasetSplitter<>().split(0.75);
+
+IgniteModel<Vector, Double> mdl = trainer
+  .fit(ignite, dataCache, split.getTrainFilter(), vectorizer);
+----
+
+
+The `split.getTestFilter()` can be used to validate the model on the test data.
+Below is an example of working with the cache directly: it prints the predicted and actual regression values for the test subset of the initial dataset.
+
+
+[source, java]
+----
+// Define the cache query and set the filter.
+ScanQuery<Integer, Vector> qry = new ScanQuery<>();
+qry.setFilter(split.getTestFilter());
+
+
+try (QueryCursor<Cache.Entry<Integer, Vector>> observations = dataCache.query(qry)) {
+    for (Cache.Entry<Integer, Vector> observation : observations) {
+         Vector val = observation.getValue();
+         Vector inputs = val.copyOfRange(1, val.size());
+         double groundTruth = val.get(0);
+
+         double prediction = mdl.predict(inputs);
+
+         System.out.printf(">>> | %.4f\t\t| %.4f\t\t|\n", prediction, groundTruth);
+    }
+}
+----
+
+
diff --git a/docs/_docs/machine-learning/multiclass-classification.adoc b/docs/_docs/machine-learning/multiclass-classification.adoc
new file mode 100644
index 0000000..a82b167
--- /dev/null
+++ b/docs/_docs/machine-learning/multiclass-classification.adoc
@@ -0,0 +1,41 @@
+= Multiclass Classification
+
+In machine learning, multiclass or multinomial classification is the problem of classifying instances into one of three or more classes.
+
+Currently, Apache Ignite ML supports the most popular method of multiclass classification, known as One-vs-Rest.
+
+The One-vs-Rest strategy involves training a single classifier per class, with the samples of that class as positive samples and all other samples as negatives.
+
+Internally it uses one dataset, but with the labels changed for each trained classifier. If you have N classes, N classifiers will be trained and combined into a MultiClassModel.
+
+MultiClassModel uses a soft-margin technique to predict the real label: it returns the label of the class that is best suited for the predicted vector.
+
+
+== Example
+
+To see how One-vs-Rest trainer parametrized by binary SVM classifier can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/multiclass/OneVsRestClassificationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+The preprocessed Glass dataset is from the https://archive.ics.uci.edu/ml/datasets/Glass+Identification[UCI Machine Learning Repository].
+
+There are 3 classes with labels: 1 (building_windows_float_processed), 3 (vehicle_windows_float_processed), 7 (headlamps) and feature names: 'Na-Sodium', 'Mg-Magnesium', 'Al-Aluminum', 'Ba-Barium', 'Fe-Iron'.
+
+
+[source, java]
+----
+OneVsRestTrainer<SVMLinearClassificationModel> trainer
+                    = new OneVsRestTrainer<>(new SVMLinearClassificationTrainer()
+                    .withAmountOfIterations(20)
+                    .withAmountOfLocIterations(50)
+                    .withLambda(0.2)
+                    .withSeed(1234L)
+                );
+
+MultiClassModel<SVMLinearClassificationModel> mdl = trainer.fit(
+                    ignite,
+                    dataCache,
+                    new DummyVectorizer<Integer>().labeled(0)
+                );
+
+double prediction = mdl.predict(inputVector);
+----
+
diff --git a/docs/_docs/machine-learning/partition-based-dataset.adoc b/docs/_docs/machine-learning/partition-based-dataset.adoc
new file mode 100644
index 0000000..ed1f14e
--- /dev/null
+++ b/docs/_docs/machine-learning/partition-based-dataset.adoc
@@ -0,0 +1,86 @@
+= Partition Based Dataset
+
+== Overview
+
+Partition-Based Dataset is an abstraction layer on top of the Apache Ignite storage and computational capabilities that allows us to build algorithms in accordance with link:machine-learning/machine-learning#section-zero-etl-and-massive-scalability[zero ETL] and link:machine-learning/machine-learning#section-fault-tolerance-and-continuous-learning[fault tolerance] principles.
+
+The main idea behind partition-based datasets is the classic MapReduce approach implemented using the Compute Grid in Ignite.
+
+The most important advantage of MapReduce is the ability to perform computations on data distributed across the cluster without involving significant data transfers over the network. This idea is adopted in the partition-based datasets in the following way:
+
+  * Every dataset is spread across partitions;
+  * Partitions hold a persistent *training context* and recoverable *training data* stored locally on every node;
+  * Computations that need to be performed on a dataset are split into *Map* operations, which are executed on every partition, and *Reduce* operations, which reduce the results of the *Map* operations to one final result.
+
+**Training Context (Partition Context)** is a persistent part of the partition which is kept in Apache Ignite, so that all changes made in this part are consistently maintained until the partition-based dataset is closed. The training context survives node failures, but requires additional time to read and write, so it should be used only when it's not possible to use partition data.
+
+**Training Data (Partition Data)** is a part of the partition that can be recovered from the upstream data and context at any time. Because of this, it is not necessary to maintain partition data in some persistent storage, so that partition data is kept on every node in local storage (On-Heap, Off-Heap or even in GPU memory) and in case of node failure is recovered from upstream data and context on another node.
+
+Why have partitions been selected as dataset and learning building blocks instead of cluster nodes?
+
+One of the fundamental ideas of Apache Ignite is that partitions are atomic, which means that they cannot be split between multiple nodes. As a result, in the case of rebalancing or a node failure, a partition will be recovered on another node with the same data it contained on the previous node.
+
+For a machine learning algorithm this is vital, because most ML algorithms are iterative and require some context to be maintained between iterations. This context cannot be split or merged and should be maintained in a consistent state during the whole learning process.
+
+== Usage
+
+To build a partition-based dataset you need to specify:
+
+* Upstream Data Source which can be an Ignite Cache or just a Map with data;
+* Partition Context Builder that defines how to build a partition context from upstream data rows corresponding to this partition;
+* Partition Data Builder that defines how to build partition data from upstream data rows corresponding to this partition.
+
+
+.Cache-based Dataset
+[source, java]
+----
+Dataset<MyPartitionContext, MyPartitionData> dataset =
+    new CacheBasedDatasetBuilder<>(
+        ignite,                            // Upstream Data Source
+        upstreamCache
+    ).build(
+        new MyPartitionContextBuilder<>(), // Training Context Builder
+        new MyPartitionDataBuilder<>()     // Training Data Builder
+    );
+----
+
+
+.Local Dataset
+[source, java]
+----
+Dataset<MyPartitionContext, MyPartitionData> dataset =
+    new LocalDatasetBuilder<>(
+        upstreamMap,                       // Upstream Data Source
+        10
+    ).build(
+        new MyPartitionContextBuilder<>(), // Partition Context Builder
+        new MyPartitionDataBuilder<>()     // Partition Data Builder
+    );
+----
+
+After this you are able to perform different computations on this dataset in a MapReduce manner.
+
+
+[source, java]
+----
+int numberOfRows = dataset.compute(
+    (partitionData, partitionIdx) -> partitionData.getRows(),
+    (a, b) -> a == null ? b : a + b
+);
+----
+
+Finally, when all computations are completed, it's important to close the dataset and free the resources:
+
+
+[source, java]
+----
+dataset.close();
+----
+
+== Example
+
+To see how the Partition Based Dataset can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/dataset/AlgorithmSpecificDatasetExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+
+
+
diff --git a/docs/_docs/machine-learning/preprocessing.adoc b/docs/_docs/machine-learning/preprocessing.adoc
new file mode 100644
index 0000000..467dbfb
--- /dev/null
+++ b/docs/_docs/machine-learning/preprocessing.adoc
@@ -0,0 +1,239 @@
+= Preprocessing
+
+Preprocessing is required to transform raw data stored in an Ignite cache to the dataset of feature vectors suitable for further use in a machine learning pipeline.
+
+This section covers algorithms for working with features, roughly divided into the following groups:
+
+  * Extracting features from “raw” data
+  * Scaling features
+  * Converting features
+  * Modifying features
+
+NOTE: Preprocessing usually starts with label and feature extraction via a vectorizer and can be extended with other preprocessing stages.
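+
+For instance, the `vectorizer` referenced in the snippets throughout this page could be defined as follows (a minimal sketch for a cache of `Vector` values where the first coordinate is the label):
+
+[source, java]
+----
+// Use all vector coordinates as features and take the label from the first coordinate.
+Vectorizer<Integer, Vector, Integer, Double> vectorizer = new DummyVectorizer<Integer>()
+    .labeled(Vectorizer.LabelCoordinate.FIRST);
+----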
+
+== Normalization preprocessor
+
+The normal flow is to extract features and labels from Ignite data via a vectorizer, transform the features, and then normalize them.
+
+In addition to the ability to build any custom preprocessor, Apache Ignite provides a built-in normalization preprocessor. This preprocessor normalizes each vector using the p-norm.
+
+For normalization, you need to create a NormalizationTrainer and fit a normalization preprocessor as follows:
+
+
+[source, java]
+----
+// Train the preprocessor on the given data
+Preprocessor<Integer, Vector> preprocessor = new NormalizationTrainer<Integer, Vector>()
+  .withP(1)
+  .fit(ignite, data, vectorizer);
+
+// Create linear regression trainer.
+LinearRegressionLSQRTrainer trainer = new LinearRegressionLSQRTrainer();
+
+// Train model.
+LinearRegressionModel mdl = trainer.fit(
+    ignite,
+    upstreamCache,
+    preprocessor
+);
+----
+
+
+== Examples
+
+To see how the Normalization Preprocessor can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/NormalizationExample.java[example] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+== Binarization preprocessor
+
+Binarization is the process of thresholding numerical features to binary (0/1) features.
+Feature values greater than the threshold are binarized to 1.0; values equal to or less than the threshold are binarized to 0.0.
+
+It contains only one significant parameter, which is the threshold.
+
+
+[source, java]
+----
+// Create binarization trainer.
+BinarizationTrainer<Integer, Vector> binarizationTrainer
+    = new BinarizationTrainer<>().withThreshold(40);
+
+// Build the preprocessor.
+Preprocessor<Integer, Vector> preprocessor = binarizationTrainer
+    .fit(ignite, data, vectorizer);
+----
+
+To see how the Binarization Preprocessor can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/BinarizationExample.java[example].
+
+
+== Imputer preprocessor
+
+
+The Imputer preprocessor completes missing values in a dataset, using either the mean or another statistic of the column in which the missing values are located. Missing values should be presented as Double.NaN, and the input dataset columns should be of type Double. Currently, the Imputer preprocessor does not support categorical features and may create incorrect values for columns containing them.
+
+During the training phase, the Imputer Trainer collects statistics about the preprocessing dataset and in the preprocessing phase it changes the data according to the collected statistics.
+
+The Imputer Trainer contains only one parameter: `imputingStgy` that is presented as enum  *ImputingStrategy* with two available values (NOTE: future releases may support more values):
+
+  * MEAN: The default strategy. If this strategy is chosen, then replace missing values using the mean for the numeric features along the axis.
+  * MOST_FREQUENT: If this strategy is chosen, then replace missing values using the most frequent value along the axis.
+
+
+[source, java]
+----
+// Create imputer trainer.
+ImputerTrainer<Integer, Vector> imputerTrainer =
+    new ImputerTrainer<Integer, Vector>().withImputingStrategy(ImputingStrategy.MOST_FREQUENT);
+
+// Train imputer preprocessor.
+Preprocessor<Integer, Vector> preprocessor = imputerTrainer
+    .fit(ignite, data, vectorizer);
+----
+
+To see how the Imputer Preprocessor can be used in practice, try https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/ImputingExample.java[this].
+
+== One-Hot Encoder preprocessor
+
+One-hot encoding maps a categorical feature, represented as a label index (Double or String value), to a binary vector with at most a single one-value indicating the presence of a specific feature value from among the set of all feature values.
+
+This preprocessor can transform multiple columns in which indices are handled during the training process. These indexes could be defined via a `withEncodedFeature(featureIndex)` call.
+
+[NOTE]
+====
+Each one-hot encoded binary vector adds its cells to the end of the current feature vector.
+
+  * This preprocessor always creates a separate column for NULL values.
+  * The index value associated with NULL will be located in a binary vector according to the frequency of NULL values.
+====
+
+`StringEncoderPreprocessor` and `OneHotEncoderPreprocessor` use the same `EncoderTrainer` to collect data about categorical features during the training phase. To preprocess the dataset with the One-Hot Encoder preprocessor, set the `encoderType` with the value `EncoderType.ONE_HOT_ENCODER` as shown in the code snippet below:
+
+
+[source, java]
+----
+Preprocessor<Integer, Object[]> encoderPreprocessor = new EncoderTrainer<Integer, Object[]>()
+   .withEncoderType(EncoderType.ONE_HOT_ENCODER)
+   .withEncodedFeature(0)
+   .withEncodedFeature(1)
+   .withEncodedFeature(4)
+   .fit(ignite,
+       dataCache,
+       vectorizer
+);
+----
+
+== String Encoder preprocessor
+
+The String Encoder encodes string values (categories) to double values in the range [0.0, amountOfCategories), where the most popular value is encoded as 0.0 and the least popular one as amountOfCategories-1.
+
+This preprocessor can transform multiple columns in which indices are handled during the training process. These indexes could be defined via a `withEncodedFeature(featureIndex)` call.
+
+NOTE: It doesn’t add a new column but changes data in-place.
+
+*Example*
+
+Assume that we have the following Dataset with features id and category:
+
+
+[cols="1,1",opts="header"]
+|===
+|Id| Category
+|0|   a
+|1|   b
+|2|   c
+|3|   a
+|4|   a
+|5|   c
+|===
+
+After encoding, the dataset looks as follows:
+
+[cols="1,1",opts="header"]
+|===
+|Id|  Category
+|0|   0.0
+|1|   2.0
+|2|   1.0
+|3|   0.0
+|4|   0.0
+|5|   1.0
+|===
+
+“a” gets index 0 because it is the most frequent, followed by “c” with index 1 and “b” with index 2.
+
+[NOTE]
+====
+There is only one strategy regarding how StringEncoder will handle unseen labels when you have to fit a StringEncoder on one dataset and then use it to transform another: put unseen labels in a special additional bucket, at the index equal to `amountOfCategories`.
+====
+
+`StringEncoderPreprocessor` and `OneHotEncoderPreprocessor` use the same `EncoderTrainer` to collect data about categorical features during the training phase. To preprocess the dataset with the `StringEncoderPreprocessor`, set the `encoderType` with the value `EncoderType.STRING_ENCODER` as shown below in the code snippet:
+
+
+[source, java]
+----
+Preprocessor<Integer, Object[]> encoderPreprocessor
+  = new EncoderTrainer<Integer, Object[]>()
+   .withEncoderType(EncoderType.STRING_ENCODER)
+   .withEncodedFeature(1)
+   .withEncodedFeature(4)
+   .fit(ignite,
+       dataCache,
+       vectorizer
+);
+
+----
+
+
+To see how the String Encoder or OHE can be used in practice, try https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/encoding[this] example.
+
+
+== MinMax Scaler preprocessor
+
+The MinMax Scaler transforms the given dataset, rescaling each feature to a specific range.
+
+From a mathematical point of view, it is the following function which is applied to every element in the dataset:
+
+image::images/preprocessing.png[]
+
+for all i, where i is the column index, max_i is the maximum value in this column, and min_i is the minimum value in this column.
+
+
+[source, java]
+----
+// Create min-max scaler trainer.
+MinMaxScalerTrainer<Integer, Vector> trainer = new MinMaxScalerTrainer<>();
+
+// Build the preprocessor.
+Preprocessor<Integer, Vector> preprocessor = trainer
+    .fit(ignite, data, vectorizer);
+----
+
+`MinMaxScalerTrainer` computes summary statistics on a data set and produces a `MinMaxScalerPreprocessor`.
+The preprocessor can then transform each feature individually such that it is in the given range.
+
+To see how the `MinMaxScalerPreprocessor` can be used in practice, try https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/MinMaxScalerExample.java[this] tutorial example.
+
+
+== MaxAbsScaler Preprocessor
+
+The MaxAbsScaler transforms the given dataset, rescaling each feature to the range [-1, 1] by dividing through the maximum absolute value in each feature.
+
+NOTE: It does not shift or center the data, and thus does not destroy any sparsity.
+
+
+[source, java]
+----
+// Create max-abs trainer.
+MaxAbsScalerTrainer<Integer, Vector> trainer = new MaxAbsScalerTrainer<>();
+
+// Build the preprocessor.
+Preprocessor<Integer, Vector> preprocessor = trainer
+    .fit(ignite, data, vectorizer);
+----
+
+From a mathematical point of view it is the following function which is applied to every element in a dataset:
+
+image::images/preprocessing2.png[]
+
+for all i, where i is the column index and maxabs_i is the maximum absolute value in this column.
+
+`MaxAbsScalerTrainer` computes summary statistics on a data set and produces a `MaxAbsScalerPreprocessor`.
+
+To see how the `MaxAbsScalerPreprocessor` can be used in practice, try https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/preprocessing/MaxAbsScalerExample.java[this] tutorial example.
diff --git a/docs/_docs/machine-learning/recommendation-systems.adoc b/docs/_docs/machine-learning/recommendation-systems.adoc
new file mode 100644
index 0000000..a207dd6
--- /dev/null
+++ b/docs/_docs/machine-learning/recommendation-systems.adoc
@@ -0,0 +1,57 @@
+= Recommendation Systems
+
+CAUTION: This is an experimental API that could be changed in the next releases.
+
+Collaborative filtering is commonly used for recommender systems. These techniques aim to fill in the missing entries of a user-item association matrix. Apache Ignite ML currently supports model-based collaborative filtering, in which users and products are described by a small set of latent factors that can be used to predict missing entries.
+
+The standard approach to matrix factorization-based collaborative filtering treats the entries in the user-item matrix as explicit preferences given by the user to the item, for example, users giving ratings to movies.
+
+Below is an example of a recommendation system based on the https://grouplens.org/datasets/movielens[MovieLens dataset]:
+
+
+
+[source, java]
+----
+IgniteCache<Integer, RatingPoint> movielensCache = loadMovieLensDataset(ignite, 10_000);
+
+RecommendationTrainer trainer = new RecommendationTrainer()
+  .withMaxIterations(-1)
+  .withMinMdlImprovement(10)
+  .withBatchSize(10)
+  .withLearningRate(10)
+  .withLearningEnvironmentBuilder(envBuilder)
+  .withTrainerEnvironment(envBuilder.buildForTrainer());
+
+RecommendationModel<Integer, Integer> mdl = trainer.fit(new CacheBasedDatasetBuilder<>(ignite, movielensCache));
+----
+
+CAUTION: The Evaluator does not support recommendation systems yet.
+
+The next example demonstrates how to calculate metrics over the given cache manually and locally on the client node:
+
+
+[source, java]
+----
+double mean = 0;
+
+try (QueryCursor<Cache.Entry<Integer, RatingPoint>> cursor = movielensCache.query(new ScanQuery<>())) {
+  for (Cache.Entry<Integer, RatingPoint> e : cursor) {
+    ObjectSubjectRatingTriplet<Integer, Integer> triplet = e.getValue();
+    mean += triplet.getRating();
+  }
+  mean /= movielensCache.size();
+}
+
+double tss = 0, rss = 0;
+
+try (QueryCursor<Cache.Entry<Integer, RatingPoint>> cursor = movielensCache.query(new ScanQuery<>())) {
+  for (Cache.Entry<Integer, RatingPoint> e : cursor) {
+    ObjectSubjectRatingTriplet<Integer, Integer> triplet = e.getValue();
+    tss += Math.pow(triplet.getRating() - mean, 2);
+    rss += Math.pow(triplet.getRating() - mdl.predict(triplet), 2);
+  }
+}
+
+double r2 = 1.0 - rss / tss;
+----
+
diff --git a/docs/_docs/machine-learning/regression/decision-trees-regression.adoc b/docs/_docs/machine-learning/regression/decision-trees-regression.adoc
new file mode 100644
index 0000000..7eb6516
--- /dev/null
+++ b/docs/_docs/machine-learning/regression/decision-trees-regression.adoc
@@ -0,0 +1,61 @@
+= Decision Trees Regression
+
+Decision trees and their ensembles are popular methods for the machine learning tasks of classification and regression. Decision trees are widely used since they are easy to interpret, handle categorical features, extend to the multiclass classification setting, do not require feature scaling, and are able to capture non-linearities and feature interactions. Tree ensemble algorithms such as random forests and boosting are among the top performers for classification and regression tasks.
+
+== Overview
+
+Decision trees are a simple yet powerful model in supervised machine learning. The main idea is to split a feature space into regions such that the value in each region varies only a little. The measure of the values' variation in a region is called the `impurity` of the region.
+
+Apache Ignite provides an implementation of the algorithm optimized for data stored in rows (see link:machine-learning/partition-based-dataset[partition-based dataset]).
+
+Splits are done recursively and every region created from a split can be split further. Therefore, the whole process can be described by a binary tree, where each node is a particular region and its children are the regions derived from it by another split.
+
+Let each sample from a training set belong to some space `S` and let `p_i` be a projection on a feature with index `i`, then a split by continuous feature with index `i` has the form:
+
+
+image::images/555.gif[]
+
+and a split by categorical feature with values from some set `X` has the form:
+
+image::images/666.gif[]
+
+Here `X_0` is a subset of `X`.
+
+The model works as follows: the split process stops when either the algorithm has reached the configured maximum depth, or splitting of any region has not resulted in a significant impurity loss. Prediction of a value for a point `s` from `S` is a traversal of the tree down to the node that corresponds to the region containing `s`, returning the value associated with this leaf.
+
+== Model
+
+The model in decision tree regression is represented by the class `DecisionTreeNode`. We can make a prediction for a given vector of features in the following way:
+
+
+[source, java]
+----
+DecisionTreeNode mdl = ...;
+
+double prediction = mdl.apply(observation);
+----
+
+The model is a fully independent object and, after training, it can be saved, serialized and restored.
+
+== Trainer
+
+A Decision Tree algorithm can be used for classification and regression depending upon the impurity measure and node instantiation approach.
+
+The Regression Decision Tree uses the https://en.wikipedia.org/wiki/Mean_squared_error[MSE^] impurity measure and you can use it in the following way:
+
+
+[source, java]
+----
+// Create decision tree regression trainer.
+DecisionTreeRegressionTrainer trainer = new DecisionTreeRegressionTrainer(
+    4, // Max depth.
+    0  // Min impurity decrease.
+);
+
+// Train model.
+DecisionTreeNode mdl = trainer.fit(ignite, dataCache, vectorizer);
+----
+
+== Examples
+
+To see how the Decision Tree can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/tree/DecisionTreeRegressionTrainerExample.java[regression example^] that is available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/regression/introduction.adoc b/docs/_docs/machine-learning/regression/introduction.adoc
new file mode 100644
index 0000000..4e38fb2
--- /dev/null
+++ b/docs/_docs/machine-learning/regression/introduction.adoc
@@ -0,0 +1,9 @@
+= Introduction
+
+Regression is an ML task in which a model is trained to predict real-numbered outputs, like temperature, stock price, etc. Regression is based on a hypothesis that can be linear, quadratic, polynomial, non-linear, etc. The hypothesis is a function of some hidden parameters and the input values.
+
+All existing training algorithms presented in this section are designed to solve regression tasks:
+
+* Linear Regression
+* Decision Trees Regression
+* k-NN Regression
diff --git a/docs/_docs/machine-learning/regression/knn-regression.adoc b/docs/_docs/machine-learning/regression/knn-regression.adoc
new file mode 100644
index 0000000..90ee218
--- /dev/null
+++ b/docs/_docs/machine-learning/regression/knn-regression.adoc
@@ -0,0 +1,49 @@
+= k-NN Regression
+
+The Apache Ignite Machine Learning component provides two versions of the widely used k-NN (k-nearest neighbors) algorithm - one for classification tasks and the other for regression tasks.
+
+This documentation reviews k-NN as a solution for regression tasks.
+
+== Trainer and Model
+
+The k-NN regression algorithm is a non-parametric method whose input consists of the k-closest training examples in the feature space. Each training example has a property value in a numerical form associated with the given training example.
+
+The k-NN regression algorithm uses the whole training set to predict a property value for the given test sample.
+This predicted property value is an average of the values of its k nearest neighbors. If `k` is `1`, then the test sample is simply assigned the property value of its single nearest neighbor.
+
+Presently, Ignite supports a few parameters for the k-NN regression algorithm:
+
+* `k` - a number of nearest neighbors
+* `distanceMeasure` - one of the distance metrics provided by the ML framework such as Euclidean, Hamming or Manhattan
+* `isWeighted` - false by default, if true it enables a weighted KNN algorithm.
+* `dataCache` -  holds a training set of objects for which the class is already known.
+* `indexType` - distributed spatial index, has three values: ARRAY, KD_TREE, BALL_TREE
+
+
+[source, java]
+----
+// Create trainer
+KNNRegressionTrainer trainer = new KNNRegressionTrainer()
+  .withK(5)
+  .withIdxType(SpatialIndexType.BALL_TREE)
+  .withDistanceMeasure(new ManhattanDistance())
+  .withWeighted(true);
+
+// Train model.
+KNNRegressionModel knnMdl = trainer.fit(
+  ignite,
+  dataCache,
+  vectorizer
+);
+
+// Make a prediction.
+double prediction = knnMdl.predict(observation);
+----
+
+
+== Example
+
+
+To see how kNN Regression can be used in practice, try this https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/ml/knn/KNNRegressionExample.java[example^] that is available on GitHub and delivered with every Apache Ignite distribution.
+
+The training dataset is the Iris dataset, which can be loaded from the https://archive.ics.uci.edu/ml/datasets/iris[UCI Machine Learning Repository^].
diff --git a/docs/_docs/machine-learning/regression/linear-regression.adoc b/docs/_docs/machine-learning/regression/linear-regression.adoc
new file mode 100644
index 0000000..09ef40d
--- /dev/null
+++ b/docs/_docs/machine-learning/regression/linear-regression.adoc
@@ -0,0 +1,85 @@
+= Linear Regression
+
+== Overview
+
+Apache Ignite supports the ordinary least squares Linear Regression algorithm - one of the most basic and powerful machine learning algorithms. This documentation describes how the algorithm works and how it is implemented in Apache Ignite.
+
+The basic idea behind the Linear Regression algorithm is an assumption that a dependent variable `y` and an explanatory variable `x` are in the following relationship:
+
+image::images/111.gif[]
+
+
+WARNING: Be aware that the documentation below uses a dot product of vectors `x` and `b` and explicitly avoids a constant (intercept) term. This is mathematically correct when vector `x` is supplemented with one extra component equal to 1.
+
+The above assumption allows us to make a prediction based on a feature vector `x` if a vector `b` is known. This fact is reflected in Apache Ignite in the `LinearRegressionModel` class responsible for making predictions.
+
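+The trick mentioned in the warning above can be sketched as follows (plain Java for illustration, not the Ignite API): the intercept is folded into the dot product by treating the last component of `b` as the coefficient of an implicit feature equal to 1.
+
+[source, java]
+----
+// Prediction as a dot product of the augmented feature vector [x, 1] and
+// the parameter vector b, whose last component plays the role of the intercept.
+double predict(double[] x, double[] b) {
+    double res = b[b.length - 1]; // Constant term multiplied by the implicit 1.
+
+    for (int i = 0; i < x.length; i++)
+        res += x[i] * b[i];
+
+    return res;
+}
+----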
+
+== Model
+
+A Model in the case of linear regression is represented by the class `LinearRegressionModel`. It enables a prediction to be made for a given vector of features, in the following way:
+
+
+[source, java]
+----
+LinearRegressionModel model = ...;
+
+double prediction = model.predict(observation);
+----
+
+The model is a fully independent object and, after training, it can be saved, serialized, and restored.
+
+== Trainers
+
+Linear Regression is a supervised learning algorithm. This means that to find parameters (vector `b`), we need to train on a training dataset and minimize the loss function:
+
+image::images/222.gif[]
+
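+The loss being minimized is assumed here to be the standard least-squares objective; a LaTeX rendering of that canonical ordinary least squares formulation (not a literal transcription of the image above) is:
+
+[source, latex]
+----
+L(b) = \frac{1}{2n} \sum_{i=1}^{n} \left( y_i - \langle x_i, b \rangle \right)^2
+----
+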
+Apache Ignite provides two linear regression trainers: one based on the LSQR algorithm and another based on the stochastic gradient descent method.
+
+=== LSQR Trainer
+
+The LSQR algorithm finds the least-squares solution to a large, sparse, linear system of equations. The Apache Ignite implementation is a distributed version of this algorithm.
+
+
+[source, java]
+----
+// Create linear regression trainer.
+LinearRegressionLSQRTrainer trainer = new LinearRegressionLSQRTrainer();
+
+// Train model.
+LinearRegressionModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Make a prediction.
+double prediction = mdl.predict(coordinates);
+----
+
+
+=== SGD Trainer
+
+Another linear regression trainer uses the stochastic gradient descent method to find a minimum of the loss function. The configuration of this trainer is similar to the link:machine-learning/binary-classification/multilayer-perceptron[multilayer perceptron trainer] configuration: we can specify the type of updater (`SGD`, `RProp`, or `Nesterov`), the maximum number of iterations, the batch size, the number of local iterations, and the seed.
+
+[source, java]
+----
+// Create linear regression trainer.
+LinearRegressionSGDTrainer<?> trainer = new LinearRegressionSGDTrainer<>(
+    new UpdatesStrategy<>(
+        new RPropUpdateCalculator(),
+        RPropParameterUpdate::sumLocal,
+        RPropParameterUpdate::avg
+    ),
+    100000,  // Max iterations.
+    10,      // Batch size.
+    100,     // Local iterations.
+    123L     // Random seed.
+);
+
+// Train model.
+LinearRegressionModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Make a prediction.
+double prediction = mdl.predict(coordinates);
+----
+
+== Examples
+
+To see how the Linear Regression can be used in practice, try these https://github.com/apache/ignite/tree/master/examples/src/main/java/org/apache/ignite/examples/ml/regression/linear[examples] that are available on GitHub and delivered with every Apache Ignite distribution.
diff --git a/docs/_docs/machine-learning/updating-trained-models.adoc b/docs/_docs/machine-learning/updating-trained-models.adoc
new file mode 100644
index 0000000..b98c042
--- /dev/null
+++ b/docs/_docs/machine-learning/updating-trained-models.adoc
@@ -0,0 +1,63 @@
+= Updating Trained Models
+
+Updating Already Trained Models in Apache Ignite
+
+The model updating interface in Ignite ML provides relearning of an already trained model on a new portion of data, using the state of the previously trained model. This interface is exposed by the `DatasetTrainer` class and mirrors the training interface, with the already learned model as the first parameter:
+
+* M update(M mdl, DatasetBuilder<K, V> datasetBuilder, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
+* M update(M mdl, Ignite ignite, IgniteCache<K, V> cache, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
+* M update(M mdl, Ignite ignite, IgniteCache<K, V> cache, IgniteBiPredicate<K, V> filter, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
+* M update(M mdl, Map<K, V> data, int parts, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
+* M update(M mdl, Map<K, V> data, IgniteBiPredicate<K, V> filter, int parts, IgniteBiFunction<K, V, Vector> featureExtractor, IgniteBiFunction<K, V, L> lbExtractor)
+
+The interface supports both online learning and online batch learning. Online learning means that you can train a model and, when a new training example arrives (such as a click on a website), update the model as if it had been trained on this example too. Batch online learning requires a batch of examples instead of a single training example for a model update. Some models allow both update strategies, some allow only batch updating - it depends on the learning algorithm. Further details are provided for each algorithm below.
+
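+As a rough usage sketch (the trainer, cache names, and the `vectorizer` below are placeholders; the call shape follows the `fit` examples elsewhere in this documentation):
+
+[source, java]
+----
+// Train an initial model on the first portion of data.
+KMeansTrainer trainer = new KMeansTrainer().withAmountOfClusters(2);
+
+KMeansModel mdl = trainer.fit(ignite, dataCache, vectorizer);
+
+// Later, when a new portion of data arrives, relearn the model
+// starting from its current state instead of training from scratch.
+KMeansModel updatedMdl = trainer.update(mdl, ignite, newDataCache, vectorizer);
+----
+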
+[NOTE]
+====
+The new portion of data should be compatible with the first trainer's parameters and with the dataset used for previous training, in terms of feature vector size and feature value distributions. For example, if you train an ANN model, then you should provide the trainer with the same distance measure and candidate count as at the first learning stage. If you update k-means, then the new dataset should contain at least k rows.
+====
+
+Each model has a special implementation of this interface. Read the sections below to get more information about the updating process for each algorithm.
+
+
+== KMeans
+
+Model updating takes the already learned centroids and updates them with new rows. We recommend using batch online learning for this model. First, the dataset should contain at least k rows. Second, a dataset with a small number of rows can move centroids to invalid positions.
+
+== KNN
+
+Model updating just adds a new dataset to the old dataset. In this case, model updating isn’t restricted.
+
+== ANN
+
+As in the case of KNN, a new trainer should provide the same distance measure and k-value. These parameters are important because internally ANN uses KMeans and statistics over the centroids provided by KMeans. During an update, the trainer takes the centroid statistics from the previous learning and updates them with new observations. From this point of view, ANN allows “mini-batch” online learning where the batch size is equal to the k-parameter.
+
+== Neural Network (NN)
+
+NN updating just takes the current neural network state and updates it according to the error gradient on a new dataset. In this case, the NN requires only feature vector compatibility between datasets.
+
+== Logistic Regression
+
+Logistic regression inherits all restrictions from the neural network trainer because it uses perceptron internally.
+
+== Linear Regression
+
+The LinearRegressionSGD trainer inherits all restrictions from the neural network trainer. LinearRegressionLSQRTrainer restores the state from the last learning and uses it as a first approximation when learning on a new dataset. In this way, LinearRegressionLSQRTrainer also requires only feature vector compatibility.
+
+== SVM
+
+The SVM trainer uses the state of a learned model as a first approximation during the training process. From this point of view, the algorithm requires only feature vector compatibility.
+
+== Decision Tree
+
+There is no incremental implementation of decision tree updating. The update simply learns a new model on the given dataset.
+
+== GDB
+
+GDB trainer updating takes the already learned models from the composition and tries to minimize the error gradient on a given dataset by learning new models that predict the gradient. It also uses a convergence checker, and if the error on the new dataset is not large, GDB skips the update stage. From this point of view, GDB requires only feature vector compatibility.
+
+NOTE: Every update can increase the model composition size. All models depend upon each other. So, frequent updating based upon small datasets can produce an enormous model that requires a lot of memory.
+
+== Random Forest (RF)
+
+The RF trainer just learns new decision trees on a given dataset and adds them to the already learned composition. In this way, RF requires feature vector compatibility, and the dataset should contain more than one element because a decision tree cannot be trained on a single-element dataset. In contrast to GDB, the models in a trained RF composition do not depend on each other, so if the composition becomes too big, a user can manually remove some models.
diff --git a/docs/_docs/quick-start/dotnet.adoc b/docs/_docs/quick-start/dotnet.adoc
index f671030..59b9880 100644
--- a/docs/_docs/quick-start/dotnet.adoc
+++ b/docs/_docs/quick-start/dotnet.adoc
@@ -45,7 +45,7 @@ tab:C#/.NET[]
 using System;
 using Apache.Ignite.Core;
 
-namespace ggqsg
+namespace IgniteTest
 {
     class Program
     {