Posted to dev@singa.apache.org by "wangwei (JIRA)" <ji...@apache.org> on 2016/10/06 12:01:20 UTC

[jira] [Closed] (SINGA-210) Enable checkpoint and resume for v1.0

     [ https://issues.apache.org/jira/browse/SINGA-210?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

wangwei closed SINGA-210.
-------------------------
    Resolution: Fixed

> Enable checkpoint and resume for v1.0
> -------------------------------------
>
>                 Key: SINGA-210
>                 URL: https://issues.apache.org/jira/browse/SINGA-210
>             Project: Singa
>          Issue Type: New Feature
>            Reporter: wangwei
>
> This ticket adds code for dumping the model parameters to checkpoint files, which can be used for fine-tuning and deployment.
> The model parameters should be separated from the model definition, i.e., net construction. After creating the neural net, users either randomly initialize the layer parameters or load them from checkpoint files. In other words, we do not add a pair of serializing and parsing functions to the Layer class.
> We need to decide the format of the checkpoint file and how to write and read it:
> 1. The checkpoint file consists of the model parameters, which could be serialized as key-value pairs, where the key is the parameter name and the value is a protobuf object including the shape and values. Optionally, there could be a text file with the parameter meta info, e.g., name and shape, so that users can inspect the model parameters without parsing the binary checkpoint file.
> 2. The binary checkpoint file can be serialized using the Writer (SINGA-202) and loaded into memory using the Reader (SINGA-202).
> 3. A checkpoint utility class should be implemented for 1 and 2. Compatibility with Caffe checkpoint files may also be considered, to reuse models from the Caffe model zoo http://caffe.berkeleyvision.org/model_zoo.html.
> {code}
> class Checkpoint {
>   // <prefix>.model is the binary file of parameter key-value pairs;
>   // <prefix>.meta is the text file, one line per parameter.
>   Checkpoint(prefix, mode=[R|W]);
>   Read();  // read .model
>   ReadMeta();  // read .meta only
>   Get(key);  // return the value protobuf obj.
>   GetMeta(key);
>   Read(key);
>   Write(key, value);  // write to both .model and .meta files.
> };
> {code}
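
The interface quoted above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code that was committed for SINGA-210: `ParamValue` is a hypothetical stand-in for the protobuf value object, plain `std::fstream` I/O replaces the SINGA-202 Reader/Writer, and the length-prefixed record layout (host byte order) and the space-separated `.meta` line format are invented for the example. Parameter names are assumed to contain no whitespace.

```cpp
#include <cstdint>
#include <fstream>
#include <map>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the protobuf value object: a shape plus flat values.
struct ParamValue {
  std::vector<uint32_t> shape;
  std::vector<float> data;
};

class Checkpoint {
 public:
  enum Mode { kRead, kWrite };

  // <prefix>.model: binary key-value records; <prefix>.meta: one text line per parameter.
  Checkpoint(const std::string& prefix, Mode mode) : prefix_(prefix), mode_(mode) {
    if (mode_ == kWrite) {
      model_out_.open(prefix_ + ".model", std::ios::binary | std::ios::trunc);
      meta_out_.open(prefix_ + ".meta", std::ios::trunc);
      if (!model_out_ || !meta_out_) throw std::runtime_error("cannot open for writing");
    }
  }

  // Append one record to .model and one "name dim0 dim1 ..." line to .meta.
  void Write(const std::string& key, const ParamValue& v) {
    if (mode_ != kWrite) throw std::logic_error("checkpoint not opened for writing");
    PutU32(static_cast<uint32_t>(key.size()));
    model_out_.write(key.data(), key.size());
    PutU32(static_cast<uint32_t>(v.shape.size()));
    for (uint32_t d : v.shape) PutU32(d);
    PutU32(static_cast<uint32_t>(v.data.size()));
    model_out_.write(reinterpret_cast<const char*>(v.data.data()),
                     v.data.size() * sizeof(float));
    meta_out_ << key;
    for (uint32_t d : v.shape) meta_out_ << ' ' << d;
    meta_out_ << '\n';
  }

  // Load every record of .model into memory.
  void Read() {
    std::ifstream in(prefix_ + ".model", std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + prefix_ + ".model");
    uint32_t key_len = 0;
    while (GetU32(in, &key_len)) {
      std::string key(key_len, '\0');
      in.read(&key[0], key_len);
      ParamValue v;
      uint32_t n = 0;
      GetU32(in, &n);
      v.shape.resize(n);
      for (uint32_t& d : v.shape) GetU32(in, &d);
      GetU32(in, &n);
      v.data.resize(n);
      in.read(reinterpret_cast<char*>(v.data.data()), n * sizeof(float));
      if (!in) throw std::runtime_error("truncated checkpoint file");
      params_[key] = v;
    }
  }

  // Load only the parameter names and shapes from .meta.
  void ReadMeta() {
    std::ifstream in(prefix_ + ".meta");
    if (!in) throw std::runtime_error("cannot open " + prefix_ + ".meta");
    std::string line;
    while (std::getline(in, line)) {
      std::istringstream fields(line);
      std::string key;
      fields >> key;
      std::vector<uint32_t> shape;
      for (uint32_t d; fields >> d;) shape.push_back(d);
      if (!key.empty()) meta_[key] = shape;
    }
  }

  // Both getters throw std::out_of_range for an unknown key.
  const ParamValue& Get(const std::string& key) const { return params_.at(key); }
  const std::vector<uint32_t>& GetMeta(const std::string& key) const { return meta_.at(key); }

 private:
  void PutU32(uint32_t x) { model_out_.write(reinterpret_cast<const char*>(&x), sizeof(x)); }
  static bool GetU32(std::ifstream& in, uint32_t* x) {
    in.read(reinterpret_cast<char*>(x), sizeof(*x));
    return static_cast<bool>(in);
  }

  std::string prefix_;
  Mode mode_;
  std::ofstream model_out_, meta_out_;
  std::map<std::string, ParamValue> params_;
  std::map<std::string, std::vector<uint32_t>> meta_;
};
```

With this sketch, a caller would construct a `Checkpoint` in `kWrite` mode, call `Write` once per parameter, and let the object go out of scope so both files are flushed and closed. A second `Checkpoint` on the same prefix in `kRead` mode can then call `Read()` followed by `Get(key)`, or `ReadMeta()` followed by `GetMeta(key)` when only names and shapes are needed.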



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)