Posted to commits@spot.apache.org by ra...@apache.org on 2017/06/23 16:40:06 UTC

[36/50] [abbrv] incubator-spot git commit: Updated ML_OPS.md with information about LDA_OPTIMIZER, LDA_ALPHA and LDA_BETA

Updated ML_OPS.md with information about LDA_OPTIMIZER, LDA_ALPHA and LDA_BETA


Project: http://git-wip-us.apache.org/repos/asf/incubator-spot/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spot/commit/1de5a65a
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spot/tree/1de5a65a
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spot/diff/1de5a65a

Branch: refs/heads/SPOT-35_graphql_api
Commit: 1de5a65a1e9eae4342d05a0366de85289e28f501
Parents: 0fbea08
Author: Ricardo Barona <ri...@intel.com>
Authored: Wed Jun 14 11:53:55 2017 -0500
Committer: Ricardo Barona <ri...@intel.com>
Committed: Fri Jun 16 17:55:10 2017 -0500

----------------------------------------------------------------------
 spot-ml/ML_OPS.md | 4 ++++
 1 file changed, 4 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spot/blob/1de5a65a/spot-ml/ML_OPS.md
----------------------------------------------------------------------
diff --git a/spot-ml/ML_OPS.md b/spot-ml/ML_OPS.md
index 06a6548..3251077 100644
--- a/spot-ml/ML_OPS.md
+++ b/spot-ml/ML_OPS.md
@@ -66,4 +66,8 @@ The ml_ops.sh script takes its values for the following parameters from the /etc
 * **TOPIC_COUNT** Number of topics used for the topic modelling at the heart of the Suspicious Connects anomaly detection. Roughly, the analysis attempts to generate TOPIC_COUNT many profiles of common traffic in the cluster.
 * **DUPFACTOR** Used to downgrade the threat level of records similar to those marked as non-threatening by the feedback function of Spot UI. DUPFACTOR inflates the frequency of such records to make them appear less anomalous. A DUPFACTOR of 1 has no effect, and a DUPFACTOR of 1000 increases the frequency of the connection's pattern by a factor of 1000, increasing its estimated probability accordingly.
 * **TOL** The default value for the _suspicion threshold_ described above. In particular: if no third argument is provided to ml_ops.sh, the suspicion threshold is filled in with the TOL value from /etc/spot.conf. If a third argument is provided to ml_ops.sh, that value is used as the suspicion threshold.
+* **LDA_OPTIMIZER** The LDA implementation to use. Set equal to "em" to execute LDA using EMLDAOptimizer or "online" to
+ use OnlineLDAOptimizer. See [LDA Spark documentation for more information.](https://spark.apache.org/docs/2.1.0/mllib-clustering.html#latent-dirichlet-allocation-lda)
+* **LDA_ALPHA** Document concentration. See [LDA Spark documentation for more information.](https://spark.apache.org/docs/2.1.0/mllib-clustering.html#latent-dirichlet-allocation-lda)
+* **LDA_BETA** Topic concentration. See [LDA Spark documentation for more information.](https://spark.apache.org/docs/2.1.0/mllib-clustering.html#latent-dirichlet-allocation-lda)
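
Note on the three new parameters: the Spark MLlib documentation linked in the diff describes different valid ranges for the concentration values depending on the optimizer. The sketch below is one way to sanity-check a candidate combination before editing /etc/spot.conf. It is not part of this commit or of ml_ops.sh. The function name `check_lda_settings` is hypothetical, and the ranges encoded here (values greater than 1.0 for "em", non-negative values for "online", and -1 to let Spark pick a default) are my reading of the Spark 2.1 documentation, so they should be confirmed against the Spark version actually deployed.

```python
from typing import List


def check_lda_settings(optimizer: str, alpha: float, beta: float) -> List[str]:
    """Return a list of problems found; an empty list means none were found.

    Hypothetical helper, not part of Spot. The ranges are assumptions taken
    from the Spark 2.1 MLlib LDA documentation.
    """
    problems = []
    opt = optimizer.strip().lower()
    if opt not in ("em", "online"):
        problems.append("LDA_OPTIMIZER must be 'em' or 'online', got %r" % optimizer)
        return problems

    for name, value in (("LDA_ALPHA", alpha), ("LDA_BETA", beta)):
        if value == -1:
            # Assumption: -1 asks Spark to choose a default for this value.
            continue
        if opt == "em" and not value > 1.0:
            problems.append("%s should be > 1.0 with the 'em' optimizer, got %s" % (name, value))
        elif opt == "online" and not value >= 0.0:
            problems.append("%s should be >= 0 with the 'online' optimizer, got %s" % (name, value))
    return problems


if __name__ == "__main__":
    # Illustrative values only; these are not claimed to be Spot's defaults.
    for settings in (("em", 1.02, 1.001), ("em", 0.5, 1.5), ("online", 0.5, 0.5)):
        print(settings, check_lda_settings(*settings))
```

An empty list means the combination passed these assumed checks. It does not say whether the values suit a particular data set.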