Posted to commits@mrql.apache.org by Apache Wiki <wi...@apache.org> on 2013/10/01 20:06:10 UTC
[Mrql Wiki] Trivial Update of "Kmeans" by LeonidasFegaras
The "Kmeans" page has been changed by LeonidasFegaras:
https://wiki.apache.org/mrql/Kmeans?action=diff&rev1=2&rev2=3
To run the same query using Hama, you need to know the number of simultaneous BSP tasks that can run
in parallel on your Hama cluster without a problem. For example, if you have 16 nodes with 4 cores each,
- you need to set {{{-bsp_tasks}}} less than 64, eg 50.
+ you need to set {{{-nodes}}} less than 64, eg 50.
First, you need to generate random points and store them in a HDFS file
(if you haven't done so for the !MapReduce example):
@@ -47, +47 @@
=== Run K-means Clustering on a Spark Standalone Cluster ===
- To run the same query using Spark, change the SPARK_MASTER and SPARK_DEFAULT_URI in {{{conf/mrql-conf.sh}}} to point to your Spark cluster URLs.
+ To run the same query using Spark, change the SPARK_MASTER and FS_DEFAULT_NAME in {{{conf/mrql-conf.sh}}} to point to your Spark cluster URLs.
Then, you need to generate random points and store them in a HDFS file
(if you haven't done so for the other examples):