Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/09/20 04:30:22 UTC
[GitHub] sandeep-krishnamurthy closed pull request #12606: Refine the documentation of im2rec
URL: https://github.com/apache/incubator-mxnet/pull/12606
This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:
diff --git a/docs/faq/recordio.md b/docs/faq/recordio.md
index 10ab6c71d20..f61571882bd 100644
--- a/docs/faq/recordio.md
+++ b/docs/faq/recordio.md
@@ -6,35 +6,39 @@ RecordIO implements a file format for a sequence of records. We recommend storin
* Packing data together allows continuous reading on the disk.
* RecordIO has a simple way to partition, simplifying distributed setting. We provide an example later.
-We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how.
+We provide the [im2rec tool](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc) so you can create an Image RecordIO dataset by yourself. The following walkthrough shows you how. Note that there is python version of [im2rec tool](https://github.com/apache/incubator-mxnet/blob/master/tools/im2rec.py) and [example](https://mxnet.incubator.apache.org/tutorials/basic/data.html) using real-world data.
### Prerequisites
+
Download the data. You don't need to resize the images manually. You can use ```im2rec``` to resize them automatically. For details, see the "Extension: Using Multiple Labels for a Single Image," later in this topic.
### Step 1. Make an Image List File
+
+* Note that the im2rec.py provide a param `--list` to generate the list for you but im2rec.cc doesn't support it.
+
After you download the data, you need to make an image list file. The format is:
```
integer_image_index \t label_index \t path_to_image
```
Typically, the program takes the list of names of all of the images, shuffles them, then separates them into two lists: a training filename list and a testing filename list. Write the list in the right format.
-
This is an example file:
```bash
-95099 464 n04467665_17283.JPEG
-10025081 412 ILSVRC2010_val_00025082.JPEG
-74181 789 n01915811_2739.JPEG
-10035553 859 ILSVRC2010_val_00035554.JPEG
-10048727 929 ILSVRC2010_val_00048728.JPEG
-94028 924 n01980166_4956.JPEG
-1080682 650 n11807979_571.JPEG
-972457 633 n07723039_1627.JPEG
-7534 11 n01630670_4486.JPEG
-1191261 249 n12407079_5106.JPEG
+95099 464.000000 n04467665_17283.JPEG
+10025081 412.000000 ILSVRC2010_val_00025082.JPEG
+74181 789.000000 n01915811_2739.JPEG
+10035553 859.000000 ILSVRC2010_val_00035554.JPEG
+10048727 929.000000 ILSVRC2010_val_00048728.JPEG
+94028 924.000000 n01980166_4956.JPEG
+1080682 650.000000 n11807979_571.JPEG
+972457 633.000000 n07723039_1627.JPEG
+7534 11.000000 n01630670_4486.JPEG
+1191261 249.000000 n12407079_5106.JPEG
```
### Step 2. Create the Binary File
+
To generate a binary image, use `im2rec` in the tool folder. `im2rec` takes the path of the `_image list file_` you generated, the `_root path_` of the images, and the `_output file path_` as input. This process usually takes several hours, so be patient.
Sample command:
diff --git a/docs/tutorials/basic/data.md b/docs/tutorials/basic/data.md
index 0a5dd59c1ce..b5d0884f749 100644
--- a/docs/tutorials/basic/data.md
+++ b/docs/tutorials/basic/data.md
@@ -315,6 +315,8 @@ print(mx.recordio.unpack_img(s))
You can also convert raw images into *RecordIO* format using the ``im2rec.py`` utility script that is provided in the MXNet [src/tools](https://github.com/dmlc/mxnet/tree/master/tools) folder.
An example of how to use the script for converting to *RecordIO* format is shown in the `Image IO` section below.
+* Note that there is a C++ version of [im2rec](https://github.com/dmlc/mxnet/blob/master/tools/im2rec.cc), please refer to [here](https://mxnet.incubator.apache.org/faq/recordio.html) for more information.
+
## Image IO
In this section, we will learn how to preprocess and load image data in MXNet.
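
For reference, the image list format described in the `recordio.md` hunk above (`integer_image_index \t label_index \t path_to_image`, shuffled and split into a training list and a testing list) can be sketched in Python. This is a minimal illustration, not the code used by `im2rec.py --list`; the function name `make_image_lists`, the `train_ratio` and `seed` parameters, and the sample paths are all hypothetical.

```python
import random


def make_image_lists(samples, train_ratio=0.8, seed=0):
    """Build tab-separated list-file lines from (label, relative_path) pairs.

    Hypothetical helper: returns (train_lines, test_lines), where each line
    follows the format integer_image_index \t label_index \t path_to_image.
    """
    # Assign each image a stable integer index before shuffling.
    indexed = list(enumerate(samples))

    # Shuffle with a fixed seed so the split is reproducible.
    rng = random.Random(seed)
    rng.shuffle(indexed)

    # Split into a training portion and a testing portion.
    n_train = int(len(indexed) * train_ratio)

    def fmt(item):
        idx, (label, path) = item
        # Labels are written as floats, matching the updated example
        # in the diff (e.g. 464.000000).
        return "%d\t%f\t%s" % (idx, float(label), path)

    train_lines = [fmt(item) for item in indexed[:n_train]]
    test_lines = [fmt(item) for item in indexed[n_train:]]
    return train_lines, test_lines


if __name__ == "__main__":
    # Assumed sample data; the file names are taken from the example list.
    samples = [
        (464, "n04467665_17283.JPEG"),
        (412, "ILSVRC2010_val_00025082.JPEG"),
        (789, "n01915811_2739.JPEG"),
        (859, "ILSVRC2010_val_00035554.JPEG"),
        (929, "ILSVRC2010_val_00048728.JPEG"),
    ]
    train, test = make_image_lists(samples)
    print("\n".join(train))
```

Writing each returned list to its own file, one line per image, gives the training and testing list files that Step 2 takes as input.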