Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/09/20 00:45:43 UTC

[GitHub] stu1130 commented on a change in pull request #12606: Refine the documentation of im2rec

URL: https://github.com/apache/incubator-mxnet/pull/12606#discussion_r219006206
 
 

 ##########
 File path: docs/faq/recordio.md
 ##########
 @@ -6,35 +6,39 @@ RecordIO implements a file format for a sequence of records. We recommend storin
 * Packing data together allows continuous reading on the disk.
 * RecordIO has a simple way to partition, simplifying distributed setting. We provide an example later.
 
-We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how.
+We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how. Note that there is python version of [im2rec tool](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) and [example](https://mxnet.incubator.apache.org/tutorials/basic/data.html) using real-world data.
 
 ### Prerequisites
+
 Download the data. You don't need to resize the images manually. You can use ```im2rec``` to resize them automatically. For details, see the "Extension: Using Multiple Labels for a Single Image," later in this topic.
 
 ### Step 1. Make an Image List File
+
+* Note that the im2rec.py provide a param `--list` to generate the list for you but im2rec.cc doesn't support it.
+
 After you download the data, you need to make an image list file.  The format is:
 
 ```
 integer_image_index \t label_index \t path_to_image
 ```
 Typically, the program takes the list of names of all of the images, shuffles them, then separates them into two lists: a training filename list and a testing filename list. Write the list in the right format.
-
 This is an example file:
 
 ```bash
-95099  464     n04467665_17283.JPEG
-10025081        412     ILSVRC2010_val_00025082.JPEG
-74181   789     n01915811_2739.JPEG
-10035553        859     ILSVRC2010_val_00035554.JPEG
-10048727        929     ILSVRC2010_val_00048728.JPEG
-94028   924     n01980166_4956.JPEG
-1080682 650     n11807979_571.JPEG
-972457  633     n07723039_1627.JPEG
-7534    11      n01630670_4486.JPEG
-1191261 249     n12407079_5106.JPEG
+95099  464.000000     n04467665_17283.JPEG
 
 Review comment:
   If image.lst is generated by im2rec.py rather than written by hand, the labels will include the decimal point shown here. I think matching that output in the example would be less confusing for users.
   The label is a floating-point value because it can also be a regression target, e.g. 68.6 kg for a person's body weight.
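
   To illustrate the point, here is a minimal sketch of how one list line with a floating-point label could be formatted. The helper name `format_lst_line` is hypothetical and is not part of im2rec.py; the sketch only assumes the tab-separated `index \t label \t path` layout described in the doc.

```python
def format_lst_line(index, label, path):
    """Hypothetical helper: build one tab-separated image list line.

    The label is written as a float, so an integer class id such as 464
    and a regression target such as 68.6 share the same column format.
    """
    return "%d\t%f\t%s" % (index, float(label), path)


# Classification label: the integer class id is written with decimals.
print(format_lst_line(95099, 464, "n04467665_17283.JPEG"))

# Regression label (made-up file name, for illustration only).
print(format_lst_line(1, 68.6, "person_0001.JPEG"))
```

   With `%f` formatting, the class id 464 is written as `464.000000`, which matches the example line in the diff.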

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services