Posted to commits@mahout.apache.org by pa...@apache.org on 2014/09/04 16:56:38 UTC

svn commit: r1622492 - /mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext

Author: pat
Date: Thu Sep  4 14:56:38 2014
New Revision: 1622492

URL: http://svn.apache.org/r1622492
Log:
updating cli help message

Modified:
    mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext

Modified: mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext?rev=1622492&r1=1622491&r2=1622492&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/recommender/intro-cooccurrence-spark.mdtext Thu Sep  4 14:56:38 2014
@@ -21,16 +21,15 @@ cross-cooccurrence is a more principled 
 to recommend.   
 
 
-    spark-itemsimilarity Mahout 1.0-SNAPSHOT
+    spark-itemsimilarity Mahout 1.0
     Usage: spark-itemsimilarity [options]
     
+    Disconnected from the target VM, address: '127.0.0.1:64676', transport: 'socket'
     Input, output options
       -i <value> | --input <value>
-            Input path, may be a filename, directory name, or comma delimited list of 
-            HDFS supported URIs (required)
+            Input path, may be a filename, directory name, or comma delimited list of HDFS supported URIs (required)
       -i2 <value> | --input2 <value>
-            Secondary input path for cross-similarity calculation, same restrictions 
-            as "--input" (optional). Default: empty.
+            Secondary input path for cross-similarity calculation, same restrictions as "--input" (optional). Default: empty.
       -o <value> | --output <value>
             Path for output, any local or HDFS supported URI (required)
     
@@ -38,8 +37,7 @@ to recommend.   
       -mppu <value> | --maxPrefs <value>
             Max number of preferences to consider per user (optional). Default: 500
       -m <value> | --maxSimilaritiesPerItem <value>
-            Limit the number of similarities per item to this number (optional). 
-            Default: 100
+            Limit the number of similarities per item to this number (optional). Default: 100
     
     Note: Only the Log Likelihood Ratio (LLR) is supported as a similarity measure.
     
@@ -47,56 +45,42 @@ to recommend.   
       -id <value> | --inDelim <value>
             Input delimiter character (optional). Default: "[,\t]"
       -f1 <value> | --filter1 <value>
-            String (or regex) whose presence indicates a datum for the primary item 
-            set (optional). Default: no filter, all data is used
+            String (or regex) whose presence indicates a datum for the primary item set (optional). Default: no filter, all data is used
       -f2 <value> | --filter2 <value>
-            String (or regex) whose presence indicates a datum for the secondary item 
-            set (optional). If not present no secondary dataset is collected
-      -rc <value> | --rowIDPosition <value>
-            Column number (0 based Int) containing the row ID string (optional). 
-            Default: 0
-      -ic <value> | --itemIDPosition <value>
-            Column number (0 based Int) containing the item ID string (optional). 
-            Default: 1
-      -fc <value> | --filterPosition <value>
-            Column number (0 based Int) containing the filter string (optional). 
-            Default: -1 for no filter
+            String (or regex) whose presence indicates a datum for the secondary item set (optional). If not present no secondary dataset is collected
+      -rc <value> | --rowIDColumn <value>
+            Column number (0 based Int) containing the row ID string (optional). Default: 0
+      -ic <value> | --itemIDColumn <value>
+            Column number (0 based Int) containing the item ID string (optional). Default: 1
+      -fc <value> | --filterColumn <value>
+            Column number (0 based Int) containing the filter string (optional). Default: -1 for no filter
     
     Using all defaults the input is expected of the form: "userID<tab>itemId" or "userID<tab>itemID<tab>any-text..." and all rows will be used
     
     File discovery options:
       -r | --recursive
-            Searched the -i path recursively for files that match --filenamePattern 
-            (optional), default: false
+            Searched the -i path recursively for files that match --filenamePattern (optional), Default: false
       -fp <value> | --filenamePattern <value>
-            Regex to match in determining input files (optional). Default: filename 
-            in the --input option or "^part-.*" if --input is a directory
+            Regex to match in determining input files (optional). Default: filename in the --input option or "^part-.*" if --input is a directory
     
     Output text file schema options:
       -rd <value> | --rowKeyDelim <value>
-            Separates the rowID key from the vector values list (optional). Default: 
-    \t"
+            Separates the rowID key from the vector values list (optional). Default: "\t"
       -cd <value> | --columnIdStrengthDelim <value>
-            Separates column IDs from their values in the vector values list (optional). 
-            Default: ":"
+            Separates column IDs from their values in the vector values list (optional). Default: ":"
       -td <value> | --elementDelim <value>
             Separates vector element values in the values list (optional). Default: " "
       -os | --omitStrength
             Do not write the strength to the output files (optional), Default: false.
-            This option is used to output indexable data for creating a search engine 
-            recommender.
+    This option is used to output indexable data for creating a search engine recommender.
     
     Default delimiters will produce output of the form: "itemID1<tab>itemID2:value2<space>itemID10:value10..."
     
     Spark config options:
       -ma <value> | --master <value>
-            Spark Master URL (optional). Default: "local". Note that you can specify 
-            the number of cores to get a performance improvement, for example "local[4]"
+            Spark Master URL (optional). Default: "local". Note that you can specify the number of cores to get a performance improvement, for example "local[4]"
       -sem <value> | --sparkExecutorMem <value>
-            Max Java heap available as "executor memory" on each node (optional). 
-            Default: 4g
-    
-    General config options:
+            Max Java heap available as "executor memory" on each node (optional). Default: 4g
       -rs <value> | --randomSeed <value>
             
       -h | --help
@@ -236,61 +220,48 @@ One significant output option is --omitS
 
 The command line interface is:
 
-    spark-rowsimilarity Mahout 1.0-SNAPSHOT
+    spark-rowsimilarity Mahout 1.0
     Usage: spark-rowsimilarity [options]
     
     Input, output options
       -i <value> | --input <value>
-            Input path, may be a filename, directory name, or comma delimited list 
-            of HDFS supported URIs (required)
-     -o <value> | --output <value>
+            Input path, may be a filename, directory name, or comma delimited list of HDFS supported URIs (required)
+      -o <value> | --output <value>
             Path for output, any local or HDFS supported URI (required)
     
     Algorithm control options:
       -mo <value> | --maxObservations <value>
             Max number of observations to consider per row (optional). Default: 500
       -m <value> | --maxSimilaritiesPerRow <value>
-            Limit the number of similarities per item to this number (optional). 
-            Default: 100
+            Limit the number of similarities per item to this number (optional). Default: 100
     
     Note: Only the Log Likelihood Ratio (LLR) is supported as a similarity measure.
+    Disconnected from the target VM, address: '127.0.0.1:49162', transport: 'socket'
     
     Output text file schema options:
       -rd <value> | --rowKeyDelim <value>
-            Separates the rowID key from the vector values list (optional). 
-            Default: "\t"
+            Separates the rowID key from the vector values list (optional). Default: "\t"
       -cd <value> | --columnIdStrengthDelim <value>
-            Separates column IDs from their values in the vector values list 
-            (optional). Default: ":"
+            Separates column IDs from their values in the vector values list (optional). Default: ":"
       -td <value> | --elementDelim <value>
-            Separates vector element values in the values list (optional). 
-            Default: " "
+            Separates vector element values in the values list (optional). Default: " "
       -os | --omitStrength
-            Do not write the strength to the output files (optional), Default: 
-            false.
-    This option is used to output indexable data for creating a search engine 
-    recommender.
+            Do not write the strength to the output files (optional), Default: false.
+    This option is used to output indexable data for creating a search engine recommender.
     
     Default delimiters will produce output of the form: "itemID1<tab>itemID2:value2<space>itemID10:value10..."
     
     File discovery options:
       -r | --recursive
-            Searched the -i path recursively for files that match 
-            --filenamePattern (optional), Default: false
+            Searched the -i path recursively for files that match --filenamePattern (optional), Default: false
       -fp <value> | --filenamePattern <value>
-            Regex to match in determining input files (optional). Default: 
-            filename in the --input option or "^part-.*" if --input is a directory
+            Regex to match in determining input files (optional). Default: filename in the --input option or "^part-.*" if --input is a directory
     
     Spark config options:
       -ma <value> | --master <value>
-            Spark Master URL (optional). Default: "local". Note that you can 
-            specify the number of cores to get a performance improvement, for 
-            example "local[4]"
+            Spark Master URL (optional). Default: "local". Note that you can specify the number of cores to get a performance improvement, for example "local[4]"
       -sem <value> | --sparkExecutorMem <value>
-            Max Java heap available as "executor memory" on each node (optional). 
-            Default: 4g
-    
-    General config options:
+            Max Java heap available as "executor memory" on each node (optional). Default: 4g
       -rs <value> | --randomSeed <value>
             
       -h | --help