Posted to commits@mrql.apache.org by Apache Wiki <wi...@apache.org> on 2013/10/01 20:06:10 UTC

[Mrql Wiki] Trivial Update of "Kmeans" by LeonidasFegaras

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Mrql Wiki" for change notification.

The "Kmeans" page has been changed by LeonidasFegaras:
https://wiki.apache.org/mrql/Kmeans?action=diff&rev1=2&rev2=3

  
  To run the same query using Hama, you need to know the number of simultaneous BSP tasks that can run
  in parallel on your Hama cluster without a problem. For example, if you have 16 nodes with 4 cores each,
- you need to set {{{-bsp_tasks}}} less than 64, eg 50.
+ you need to set {{{-nodes}}} less than 64, eg 50.
  First, you need to generate random points and store them in a HDFS file
  (if you haven't done so for the !MapReduce example):
  
@@ -47, +47 @@

  
  === Run K-means Clustering on a Spark Standalone Cluster ===
  
- To run the same query using Spark, change the SPARK_MASTER and SPARK_DEFAULT_URI in {{{conf/mrql-conf.sh}}} to point to your Spark cluster URLs.
+ To run the same query using Spark, change the SPARK_MASTER and FS_DEFAULT_NAME in {{{conf/mrql-conf.sh}}} to point to your Spark cluster URLs.
  Then, you need to generate random points and store them in a HDFS file
  (if you haven't done so for the other examples):