Posted to dev@mahout.apache.org by "Yexi Jiang (JIRA)" <ji...@apache.org> on 2013/12/24 04:52:50 UTC

[jira] [Created] (MAHOUT-1388) Add command line support and logging for MLP

Yexi Jiang created MAHOUT-1388:
----------------------------------

             Summary: Add command line support and logging for MLP
                 Key: MAHOUT-1388
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1388
             Project: Mahout
          Issue Type: Improvement
          Components: Classification
    Affects Versions: 1.0
            Reporter: Yexi Jiang
             Fix For: 1.0


The user should have the ability to run the multilayer perceptron (MLP) from the command line.

There are two modes for the MLP: training and labeling. The training mode takes the data as input and outputs the model; the labeling mode takes the model and unlabeled data as input and outputs the results.
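
A rough sketch of the two code paths is below; the MlpDriver class and the train/label method names are placeholders for illustration, not the actual implementation.

------------------------------------------------
// Illustrative driver skeleton only; class and method names are placeholders.
public class MlpDriver {

  public static void main(String[] args) {
    // In the real driver the mode would come from the parsed --mode option.
    String mode = args.length > 0 ? args[0] : "";
    if ("train".equals(mode)) {
      train();   // read the input data, write (or incrementally update) the model
    } else if ("label".equals(mode)) {
      label();   // read the model and unlabeled data, write the labeled results
    } else {
      System.err.println("Unknown mode '" + mode + "', expected 'train' or 'label'");
    }
  }

  private static void train() { /* training logic goes here */ }

  private static void label() { /* labeling logic goes here */ }
}
------------------------------------------------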

The parameters are as follows:
------------------------------------------------
--mode -mo                // train or label
--input -i                // location of the input data
--model -mo               // in training mode, the location to store the trained model (if a model already exists at that location, it is updated through incremental learning); in labeling mode, the location of the model to load
--output -o               // location to store the labeling results; only used in labeling mode
--layersize -ls           // comma-separated list of the number of neurons in each layer, including the input and output layers
--momentum -m             // momentum weight
--learningrate -l         // learning rate
--regularizationweight -r // regularization weight
--costfunction -cf        // the type of cost function, e.g. minus_squared
------------------------------------------------
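
Below is a rough sketch of how these options could be declared with Apache Commons CLI. Mahout drivers typically build their options through AbstractJob, so the parser choice, the MlpOptions class, and the -mod stand-in short flag for --model (the list above reuses -mo, which collides with --mode) are only assumptions for illustration, not the final interface.

------------------------------------------------
import org.apache.commons.cli.CommandLine;
import org.apache.commons.cli.CommandLineParser;
import org.apache.commons.cli.GnuParser;
import org.apache.commons.cli.Options;
import org.apache.commons.cli.ParseException;

public class MlpOptions {

  public static CommandLine parse(String[] args) throws ParseException {
    Options options = new Options();
    options.addOption("mo", "mode", true, "train or label");
    options.addOption("i", "input", true, "location of the input data");
    // "-mod" is a stand-in short flag, since the proposed -mo is already used by --mode.
    options.addOption("mod", "model", true, "model location (written in train mode, read in label mode)");
    options.addOption("o", "output", true, "location of the labeling results (label mode only)");
    options.addOption("ls", "layersize", true, "comma-separated neurons per layer, e.g. 5,3,1");
    options.addOption("m", "momentum", true, "momentum weight");
    options.addOption("l", "learningrate", true, "learning rate");
    options.addOption("r", "regularizationweight", true, "regularization weight");
    options.addOption("cf", "costfunction", true, "cost function type, e.g. minus_squared");

    CommandLineParser parser = new GnuParser();
    return parser.parse(options, args);
  }
}
------------------------------------------------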
For example, to train a 3-layer MLP (input, hidden, and output layers) with the minus_squared cost function, a learning rate of 0.1, a momentum of 0.1, and a regularization weight of 0.01, the command would be:

mlp -mo train -i /tmp/training-data.csv -o /tmp/model.model -ls 5,3,1 -l 0.1 -m 0.1 -r 0.01 -cf minus_squared

This command would read the training data from /tmp/training-data.csv and write the trained model to /tmp/model.model.

If a user needs to apply an existing model, the following command would be used:
mlp -mo label -i /tmp/unlabel-data.csv -m /tmp/model.model -o /tmp/label-result

Moreover, we should provide default values for any parameters the user does not specify.
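
For instance, defaults could be supplied when an option is absent, along the lines of the sketch below; the concrete default values are only placeholders, not something this issue decides.

------------------------------------------------
import org.apache.commons.cli.CommandLine;

public class MlpDefaults {

  // Fall back to a default when the user omits an option.
  // The concrete values below are placeholders, not settled by this issue.
  static double learningRate(CommandLine cmd) {
    return Double.parseDouble(cmd.getOptionValue("l", "0.5"));
  }

  static double momentum(CommandLine cmd) {
    return Double.parseDouble(cmd.getOptionValue("m", "0.1"));
  }

  static double regularizationWeight(CommandLine cmd) {
    return Double.parseDouble(cmd.getOptionValue("r", "0.0"));
  }

  static String costFunction(CommandLine cmd) {
    return cmd.getOptionValue("cf", "minus_squared");
  }
}
------------------------------------------------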


