Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2017/12/11 18:43:17 UTC

[GitHub] chsin opened a new issue #9028: How to use the C API to train a simple Softmax in pure C?

URL: https://github.com/apache/incubator-mxnet/issues/9028
 
 
   I want to train Softmax(Wx + b) using the C API defined here: https://github.com/apache/incubator-mxnet/blob/1.0.0/include/mxnet/c_api.h. I would like to know how to do this in pure C, and if it is not possible, why not. I see that the C API is implemented in C++ here:
   https://github.com/apache/incubator-mxnet/blob/1.0.0/src/c_api/c_api.cc
   So I am confused: why does the C API appear to expose training capabilities when there does not seem to be a way to train using only C?
   
   The code I want to translate from C++ to pure C is a simple test I wrote that learns a scaled identity matrix.
   ```c++
    /* context and hyperparameters (declared here so the excerpt is self-contained) */
    Context ctx_dev = Context::cpu();
    int batch_size = 3;
    int max_epoch = 100;
    float learning_rate = 0.1;
    std::string param_filename = "softmax_params.bin";

    /* define data */
    int data_count = 6;
    int num_fe = 2;
    int num_labels = 2;
   float data_vec[] = { 5, 2, 3, 1, 7, 3, 0, 2, 1, 3, 2, 5 };
   float label_vec[] = { 0, 0, 0, 1, 1, 1 };
   const float *dptr = &data_vec[0];
   const float *lptr = &label_vec[0];
   NDArray data_array = NDArray(Shape(data_count, num_fe), ctx_dev, false);
   NDArray label_array = NDArray(Shape(data_count), ctx_dev, false);
   data_array.SyncCopyFromCPU(dptr, data_count * num_fe);
   label_array.SyncCopyFromCPU(lptr, data_count);
   data_array.WaitToRead();
   label_array.WaitToRead();
   
   /*define the symbolic net*/
   Symbol data = Symbol::Variable("data");
   Symbol data_label = Symbol::Variable("data_label");
   Symbol fc1_w("fc1_w"), fc1_b("fc1_b");
   Symbol fc1 = FullyConnected("fc1", data, fc1_w, fc1_b, num_labels);
   Symbol net = SoftmaxOutput("softmax", fc1, data_label);
   
   /* define optimizer */
   Optimizer* opt = OptimizerRegistry::Find("ccsgd");
   opt->SetParam("momentum", 0.9)
      ->SetParam("rescale_grad", 1.0)
      ->SetParam("clip_gradient", 10)
      ->SetParam("lr", learning_rate)
      ->SetParam("wd", 1e-3);
   
   /* bind */
   std::map<std::string, NDArray> args_map;
   args_map["data"] = data_array.Slice(0, batch_size).Copy(ctx_dev);
   args_map["data_label"] = label_array.Slice(0, batch_size).Copy(ctx_dev);
   NDArray::WaitAll();
   net.InferArgsMap(ctx_dev, &args_map, args_map);
   Executor *exe = net.SimpleBind(ctx_dev, args_map);
   args_map = exe->arg_dict();
   auto arg_names = net.ListArguments();
   
   /* train */
   for (int iter = 0; iter < max_epoch; ++iter) {
     size_t start_index = 0;
     while (start_index < data_count) {
       if (start_index + batch_size > data_count) {
         start_index = data_count - batch_size;
       }
       args_map["data"] = data_array.Slice(start_index, start_index + batch_size).Copy(ctx_dev);
       args_map["data_label"] = label_array.Slice(start_index, start_index + batch_size).Copy(ctx_dev);
       start_index += batch_size;
       NDArray::WaitAll();
   
       exe->Forward(true);
       exe->Backward();
       // Update parameters
       for (size_t i = 0; i < arg_names.size(); ++i) {
         if (arg_names[i] == "data" || arg_names[i] == "data_label") continue;
         opt->Update(i, exe->arg_arrays[i], exe->grad_arrays[i]);
       }
     }
   }
   /*save the parameters*/
   auto save_args = args_map;
   /* we do not want to save the data and label */
   save_args.erase(save_args.find("data"));
   save_args.erase(save_args.find("data_label"));
   NDArray::Save(param_filename, save_args);
   
    /* release memory */
    delete exe;
    delete opt;
    MXNotifyShutdown();
   ```
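
   For reference, the computation itself does not need MXNet: the same model (softmax regression trained with plain SGD on the toy data above) fits in a page of self-contained C. This is only a sketch of the math I want the C API to perform, not of the MXNet C API:

   ```c
   #include <math.h>
   #include <stdio.h>

   #define N 6  /* samples  */
   #define D 2  /* features */
   #define K 2  /* classes  */

   int main(void) {
       /* same toy data as the C++ test above */
       const float X[N][D] = {{5,2},{3,1},{7,3},{0,2},{1,3},{2,5}};
       const int y[N] = {0,0,0,1,1,1};
       float W[K][D] = {{0}}, b[K] = {0};
       const float lr = 0.1f;

       for (int epoch = 0; epoch < 500; ++epoch) {
           for (int n = 0; n < N; ++n) {
               /* forward: logits z = Wx + b, then softmax probabilities */
               float z[K], p[K], zmax = -1e30f, sum = 0.0f;
               for (int k = 0; k < K; ++k) {
                   z[k] = b[k];
                   for (int d = 0; d < D; ++d) z[k] += W[k][d] * X[n][d];
                   if (z[k] > zmax) zmax = z[k];
               }
               for (int k = 0; k < K; ++k) { p[k] = expf(z[k] - zmax); sum += p[k]; }
               for (int k = 0; k < K; ++k) p[k] /= sum;
               /* backward: dL/dz = p - onehot(y), then plain SGD update */
               for (int k = 0; k < K; ++k) {
                   float g = p[k] - (k == y[n] ? 1.0f : 0.0f);
                   for (int d = 0; d < D; ++d) W[k][d] -= lr * g * X[n][d];
                   b[k] -= lr * g;
               }
           }
       }

       /* check training accuracy */
       int correct = 0;
       for (int n = 0; n < N; ++n) {
           float z0 = b[0] + W[0][0] * X[n][0] + W[0][1] * X[n][1];
           float z1 = b[1] + W[1][0] * X[n][0] + W[1][1] * X[n][1];
           if ((z1 > z0) == y[n]) ++correct;
       }
       printf("train accuracy: %d/%d\n", correct, N);
       return 0;
   }
   ```

   The data is linearly separable (class 1 whenever x1 > x0), so this reaches perfect training accuracy; the question is how to get MXNet to do the equivalent through its C API.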
   
   I am having a lot of difficulty translating this C++ code to pure C because I cannot find C equivalents for all of the C++ calls and structs. I also cannot find any example that trains using only C, even though there are examples that predict using only C. I think I have three main problems at the moment:
   
   1) While I can create the data, data_label, W, and b variables, I am unable to create the C API equivalent (SymbolHandle) of the network because I cannot get the FullyConnected and SoftmaxOutput symbols. I see that the symbol-related code is here:
   https://github.com/apache/incubator-mxnet/blob/1.0.0/include/mxnet/c_api.h#L835
   But actually getting FullyConnected and SoftmaxOutput seems to require a std::map for the opMap, which I cannot use since I am programming in pure C. I see that SoftmaxOutput is defined here:
   https://github.com/apache/incubator-mxnet/blob/master/src/operator/softmax_output-inl.h
   And it looks like there should be a way to use NativeOpInfo or NDArrayOpInfo (both defined in https://github.com/apache/incubator-mxnet/blob/1.0.0/include/mxnet/c_api.h#L97) for this, but I do not understand how.
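
   As far as I can tell, the opMap in the C++ wrapper is only a convenience cache: the C API exposes MXSymbolListAtomicSymbolCreators and MXSymbolGetAtomicSymbolName, and MXSymbolCreateAtomicSymbol takes its parameters as parallel key/value C-string arrays, so a plain strcmp lookup should work in pure C. A minimal sketch of that pattern (the Creator struct and the creator list here are stand-ins for illustration, not real MXNet handles):

   ```c
   #include <stddef.h>
   #include <stdio.h>
   #include <string.h>

   /* Stand-in for AtomicSymbolCreator handles; with MXNet you would fill
    * this table from MXSymbolListAtomicSymbolCreators and
    * MXSymbolGetAtomicSymbolName instead. */
   typedef struct { const char *name; } Creator;

   static const Creator creators[] = {
       {"FullyConnected"}, {"SoftmaxOutput"}, {"Activation"}
   };

   /* Pure-C replacement for the C++ opMap: linear search by name. */
   static const Creator *find_creator(const char *name) {
       for (size_t i = 0; i < sizeof creators / sizeof creators[0]; ++i)
           if (strcmp(creators[i].name, name) == 0) return &creators[i];
       return NULL;
   }

   int main(void) {
       /* MXSymbolCreateAtomicSymbol takes parameters as parallel key/value
        * string arrays, so no std::map is needed on the C side either. */
       const char *keys[] = {"num_hidden"};
       const char *vals[] = {"2"};
       const Creator *fc = find_creator("FullyConnected");
       printf("found=%s key0=%s val0=%s\n",
              fc ? fc->name : "none", keys[0], vals[0]);
       return 0;
   }
   ```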
   
   2) I am also unable to use the C API to call the optimizer defined here:
   https://github.com/apache/incubator-mxnet/blob/master/src/operator/optimizer_op-inl.h
   But it seems like there should be a way to do so, since there is a C API Executor interface here:
   https://github.com/apache/incubator-mxnet/blob/1.0.0/include/mxnet/c_api.h#L1226
   which provides the forward and backward passes you would run precisely in order to update the W matrix and the b vector.
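
   If the Optimizer class itself is not reachable from C, it seems I could apply the update rule by hand after the backward pass fills the gradient arrays. A sketch of one SGD-with-momentum step using the same hyperparameters as the ccsgd setup above (the function name and the exact order of clipping/decay are my own guesses, not MXNet's implementation):

   ```c
   #include <stdio.h>

   /* One SGD-with-momentum step, mirroring the "ccsgd" parameters used in
    * the C++ code (lr, momentum, wd, clip_gradient). In a pure-C program
    * this would run on the arrays obtained after MXExecutorBackward. */
   static void sgd_momentum_update(float *w, float *mom, const float *grad,
                                   int n, float lr, float momentum,
                                   float wd, float clip) {
       for (int i = 0; i < n; ++i) {
           float g = grad[i];
           if (g > clip)  g = clip;    /* clip_gradient */
           if (g < -clip) g = -clip;
           g += wd * w[i];             /* weight decay */
           mom[i] = momentum * mom[i] - lr * g;
           w[i] += mom[i];
       }
   }

   int main(void) {
       float w[2] = {1.0f, -1.0f}, mom[2] = {0}, grad[2] = {0.5f, 20.0f};
       sgd_momentum_update(w, mom, grad, 2, 0.1f, 0.9f, 1e-3f, 10.0f);
       printf("%.4f %.4f\n", w[0], w[1]);  /* second gradient gets clipped */
       return 0;
   }
   ```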
   
   3) I am unsure how to translate the NDArray portion to plain C, since the C++ code relies on a std::map for the args. I also see that the NDArray part of the C API
   https://github.com/apache/incubator-mxnet/blob/1.0.0/include/mxnet/c_api.h#L251
   has functions to invoke operators and get gradients. If there is no way to call FullyConnected or SoftmaxOutput directly, do these C API NDArray functions imply there is a way to create them here?
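
   For the args map itself, a fixed table of (name, handle) pairs looks like a workable pure-C replacement for std::map<std::string, NDArray>, since NDArrayHandle in c_api.h is just a void pointer. A self-contained sketch (the handles here are placeholder pointers, not real NDArrays):

   ```c
   #include <stdio.h>
   #include <string.h>

   /* NDArrayHandle is an opaque void* in c_api.h; aliased here so the
    * example compiles without MXNet headers. */
   typedef void *NDArrayHandle;

   typedef struct { const char *name; NDArrayHandle handle; } ArgEntry;

   /* Pure-C stand-in for args_map["name"]: linear search by name. */
   static NDArrayHandle find_arg(const ArgEntry *args, int n, const char *name) {
       for (int i = 0; i < n; ++i)
           if (strcmp(args[i].name, name) == 0) return args[i].handle;
       return NULL;
   }

   int main(void) {
       float fake_data = 1.0f, fake_label = 0.0f; /* placeholder storage */
       ArgEntry args[] = {
           {"data", &fake_data}, {"data_label", &fake_label},
           {"fc1_w", NULL}, {"fc1_b", NULL}
       };
       NDArrayHandle h = find_arg(args, 4, "data_label");
       printf("found data_label: %s\n", h ? "yes" : "no");
       /* skip data/data_label when updating, exactly like the C++ loop */
       for (int i = 0; i < 4; ++i) {
           if (!strcmp(args[i].name, "data") ||
               !strcmp(args[i].name, "data_label"))
               continue;
           printf("would update %s\n", args[i].name);
       }
       return 0;
   }
   ```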
   
   Thanks.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services