Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/01/31 20:47:13 UTC
[GitHub] piiswrong closed pull request #9457: Move get_images to test_utils.py, so it can be used from other tests. Fix documentation with recent im2rec changes.

URL: https://github.com/apache/incubator-mxnet/pull/9457

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/docs/faq/finetune.md b/docs/faq/finetune.md
index 533c3caf52..04244d15b0 100644
--- a/docs/faq/finetune.md
+++ b/docs/faq/finetune.md
@@ -61,8 +61,8 @@ for i in 256_ObjectCategories/*; do
     done
 done
 
-python ~/mxnet/tools/im2rec.py --list True --recursive True caltech-256-60-train caltech_256_train_60/
-python ~/mxnet/tools/im2rec.py --list True --recursive True caltech-256-60-val 256_ObjectCategories/
+python ~/mxnet/tools/im2rec.py --list --recursive caltech-256-60-train caltech_256_train_60/
+python ~/mxnet/tools/im2rec.py --list --recursive caltech-256-60-val 256_ObjectCategories/
 python ~/mxnet/tools/im2rec.py --resize 256 --quality 90 --num-thread 16 caltech-256-60-val 256_ObjectCategories/
 python ~/mxnet/tools/im2rec.py --resize 256 --quality 90 --num-thread 16 caltech-256-60-train caltech_256_train_60/
 ```
diff --git a/docs/tutorials/basic/data.md b/docs/tutorials/basic/data.md
index 60a7ec185b..0c3f93ab24 100644
--- a/docs/tutorials/basic/data.md
+++ b/docs/tutorials/basic/data.md
@@ -366,7 +366,7 @@ Now let's convert them into record io format using the `im2rec.py` utility scrip
 First, we need to make a list that contains all the image files and their categories:
 
 ```python
-os.system('python %s/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories'%os.environ['MXNET_HOME'])
+os.system('python %s/tools/im2rec.py --list --recursive --test-ratio=0.2 data/caltech data/101_ObjectCategories'%os.environ['MXNET_HOME'])
 ```
 
 The resulting list file (./data/caltech_train.lst) is in the format `index\t(one or more label)\tpath`. In this case, there is only one label for each image but you can modify the list to add in more for multi-label training.
@@ -375,7 +375,7 @@ Then we can use this list to create our record io file:
 
 
 ```python
-os.system("python %s/tools/im2rec.py --num-thread=4 --pass-through=1 data/caltech data/101_ObjectCategories"%os.environ['MXNET_HOME'])
+os.system("python %s/tools/im2rec.py --num-thread=4 --pass-through data/caltech data/101_ObjectCategories"%os.environ['MXNET_HOME'])
 ```
 
 The record io files are now saved at here (./data)
diff --git a/docs/tutorials/basic/image_io.md b/docs/tutorials/basic/image_io.md
index e6434257b7..8d60ee8fc0 100644
--- a/docs/tutorials/basic/image_io.md
+++ b/docs/tutorials/basic/image_io.md
@@ -55,7 +55,7 @@ contains all the image files and their categories:
 
 ```python
 assert(MXNET_HOME != '/scratch/mxnet'), "Please update your MXNet location"
-os.system('python %s/tools/im2rec.py --list=1 --recursive=1 --shuffle=1 --test-ratio=0.2 data/caltech data/101_ObjectCategories'%MXNET_HOME)
+os.system('python %s/tools/im2rec.py --list --recursive --test-ratio=0.2 data/caltech data/101_ObjectCategories'%MXNET_HOME)
 ```
 
 The resulting [list file](./data/caltech_train.lst) is in the format
@@ -66,7 +66,7 @@ Then we can use this list to create our record io file:
 
 
 ```python
-os.system("python %s/tools/im2rec.py --num-thread=4 --pass-through=1 data/caltech data/101_ObjectCategories"%MXNET_HOME)
+os.system("python %s/tools/im2rec.py --num-thread=4 --pass-through data/caltech data/101_ObjectCategories"%MXNET_HOME)
 ```
 
 The record io files are now saved in the "data" directory.
diff --git a/docs/tutorials/vision/large_scale_classification.md b/docs/tutorials/vision/large_scale_classification.md
index 1cf22708ef..12c8721a32 100644
--- a/docs/tutorials/vision/large_scale_classification.md
+++ b/docs/tutorials/vision/large_scale_classification.md
@@ -105,7 +105,7 @@ To create the recordIO files, we first create a list of images we want in the re
 
 ```
 mkdir -p train_meta
-python ${MXNET}/tools/im2rec.py --list True --chunks 8 --recursive True \
+python ${MXNET}/tools/im2rec.py --list --chunks 8 --recursive \
 train_meta/${NAME} ${ROOT}
 ```
 
@@ -127,7 +127,7 @@ We do similar preprocessing for the validation set.
 
 ```
 mkdir -p val_meta
-python ${MXNET}/tools/im2rec.py --list True --recursive True \
+python ${MXNET}/tools/im2rec.py --list --recursive \
 val_meta/${NAME} ${VAL_ROOT}
 python ${MXNET}/tools/im2rec.py --resize 480 --quality 90 \
 --num-thread 16 val_meta/${NAME} ${VAL_ROOT}
diff --git a/example/image-classification/README.md b/example/image-classification/README.md
index 8a64b5530a..27ee15f13f 100644
--- a/example/image-classification/README.md
+++ b/example/image-classification/README.md
@@ -68,7 +68,7 @@ We first prepare two `.lst` files, which consist of the labels and image paths
 can be used for generating `rec` files.
 
 ```bash
-python tools/im2rec.py --list True --recursive True --train-ratio 0.95 mydata img_data
+python tools/im2rec.py --list --recursive --train-ratio 0.95 mydata img_data
 ```
 
 Then we generate the `.rec` files. We resize the images such that the short edge
diff --git a/python/mxnet/test_utils.py b/python/mxnet/test_utils.py
index 6461904486..9f62dc6367 100644
--- a/python/mxnet/test_utils.py
+++ b/python/mxnet/test_utils.py
@@ -1417,6 +1417,34 @@ def download(url, fname=None, dirname=None, overwrite=False):
     logging.info("downloaded %s into %s successfully", url, fname)
     return fname
 
+def get_images(directory):
+    """Download test images into given directory as argument
+
+    Parameters
+    ----------
+    directory : str
+        path where the images are stored and decompressed
+
+    Returns
+    -------
+    list
+        A list of image paths
+    """
+    import tarfile
+    url = "http://data.mxnet.io/data/test_images.tar.gz"
+    logging.info("Downloading '%s' to '%s'", url, directory)
+    download(url, dirname=directory, overwrite=False)
+    fname = os.path.join(directory, url.split('/')[-1])
+    logging.info("decompressing '%s'", fname)
+    tar = tarfile.open(fname)
+    source_images = [os.path.join(directory, x.name) for x in tar.getmembers() if x.isfile()]
+    if len(source_images) < 1 or not os.path.isfile(source_images[0]):
+        # skip extracting if exists
+        tar.extractall(path=directory)
+    tar.close()
+    return source_images
+
+
 def get_mnist():
     """Download and load the MNIST dataset
 
diff --git a/tests/python/unittest/test_image.py b/tests/python/unittest/test_image.py
index 124c94c5eb..970d348d93 100644
--- a/tests/python/unittest/test_image.py
+++ b/tests/python/unittest/test_image.py
@@ -25,17 +25,6 @@
 
 from nose.tools import raises
 
-def _get_data(url, dirname):
-    import os, tarfile
-    download(url, dirname=dirname, overwrite=False)
-    fname = os.path.join(dirname, url.split('/')[-1])
-    tar = tarfile.open(fname)
-    source_images = [os.path.join(dirname, x.name) for x in tar.getmembers() if x.isfile()]
-    if len(source_images) < 1 or not os.path.isfile(source_images[0]):
-        # skip extracting if exists
-        tar.extractall(path=dirname)
-    tar.close()
-    return source_images
 
 def _generate_objects():
     num = np.random.randint(1, 10)
@@ -52,14 +41,15 @@ def _generate_objects():
 
 
 class TestImage(unittest.TestCase):
-    IMAGES_URL = "http://data.mxnet.io/data/test_images.tar.gz"
     IMAGES = []
     IMAGES_DIR = None
 
+    # Test fixture
+
     @classmethod
     def setupClass(cls):
         cls.IMAGES_DIR = tempfile.mkdtemp()
-        cls.IMAGES = _get_data(cls.IMAGES_URL, cls.IMAGES_DIR)
+        cls.IMAGES = get_images(cls.IMAGES_DIR)
         print("Loaded {} images".format(len(cls.IMAGES)))
 
     @classmethod
@@ -68,6 +58,8 @@ def teardownClass(cls):
             print("cleanup {}".format(cls.IMAGES_DIR))
             shutil.rmtree(cls.IMAGES_DIR)
 
+    # individual tests
+
     @raises(mx.base.MXNetError)
     def test_imread_not_found(self):
         x = mx.img.image.imread("/139810923jadjsajlskd.___adskj/blah.jpg")
diff --git a/tools/im2rec.py b/tools/im2rec.py
old mode 100644
new mode 100755
index 5547c534d8..ef3e3f3cf7
--- a/tools/im2rec.py
+++ b/tools/im2rec.py
@@ -1,3 +1,5 @@
+#!/usr/bin/env python3
+# -*- coding: utf-8 -*-
 # Licensed to the Apache Software Foundation (ASF) under one
 # or more contributor license agreements.  See the NOTICE file
 # distributed with this work for additional information
@@ -15,7 +17,6 @@
 # specific language governing permissions and limitations
 # under the License.
 
-# -*- coding: utf-8 -*-
 from __future__ import print_function
 import os
 import sys
@@ -229,7 +230,6 @@ def parse_args():
     cgroup.add_argument('--no-shuffle', dest='shuffle', action='store_false',
                         help='If this is passed, \
         im2rec will not randomize the image order in <prefix>.lst')
-
     rgroup = parser.add_argument_group('Options for creating database')
     rgroup.add_argument('--pass-through', action='store_true',
                         help='whether to skip transformation and save image as is')
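
Note on the documentation changes above: the docs drop the explicit values (`--list True`, `--recursive=1`, `--pass-through=1`) in favour of bare switches. The diff only shows `--pass-through` and `--no-shuffle` being declared with `store_true` / `store_false`; that `--list` and `--recursive` are declared the same way is an assumption inferred from the doc updates. The sketch below uses a hypothetical, trimmed-down parser (not the real im2rec.py option set) to show why the old invocation style stops working once a flag is a plain switch.

```python
import argparse


def build_parser():
    # Hypothetical parser for illustration only; it mirrors the flag style
    # seen in the diff, not the full set of im2rec.py options.
    parser = argparse.ArgumentParser(description="store_true flag sketch")
    parser.add_argument('prefix')
    parser.add_argument('root')
    # Assumption: --list and --recursive are boolean switches.
    parser.add_argument('--list', action='store_true')
    parser.add_argument('--recursive', action='store_true')
    # Shown in the diff: shuffling is on unless --no-shuffle is passed.
    parser.add_argument('--no-shuffle', dest='shuffle', action='store_false')
    return parser


parser = build_parser()

# New style: bare switches, followed by the two positional arguments.
new_style = parser.parse_args(['--list', '--recursive', 'mydata', 'img_data'])
print(new_style.list, new_style.recursive, new_style.shuffle)

# Old style: a switch takes no value, so each 'True' is read as a positional
# argument and the real prefix/root are left over, which argparse rejects.
old_style_rejected = False
try:
    parser.parse_args(['--list', 'True', '--recursive', 'True',
                       'mydata', 'img_data'])
except SystemExit:
    old_style_rejected = True
print("old style rejected:", old_style_rejected)
```

This would also explain why `--shuffle=1` is removed from the tutorials rather than rewritten: with a `--no-shuffle` switch, shuffling is the default.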

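Note on `get_images`: the helper extracts the archive only when the first file it lists is not already on disk, so repeated calls against the same directory skip re-extraction. The sketch below reproduces that check without any network access. It builds a small local archive instead of downloading `test_images.tar.gz`; the `extract_once` name and the sample file names are hypothetical, and a `with` block replaces the explicit `tar.close()` used in the patch.

```python
import os
import tarfile
import tempfile


def extract_once(archive, directory):
    """List the files in `archive`, extracting only if the first is missing."""
    with tarfile.open(archive) as tar:
        paths = [os.path.join(directory, member.name)
                 for member in tar.getmembers() if member.isfile()]
        if len(paths) < 1 or not os.path.isfile(paths[0]):
            # Same condition as in the patch: skip extracting if it exists.
            tar.extractall(path=directory)
    return paths


# Build a tiny stand-in archive (hypothetical contents).
workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "a.txt")
with open(sample, "w") as handle:
    handle.write("hello")
archive = os.path.join(workdir, "sample.tar.gz")
with tarfile.open(archive, "w:gz") as tar:
    tar.add(sample, arcname="images/a.txt")

target = os.path.join(workdir, "out")
os.makedirs(target)

first = extract_once(archive, target)   # extracts the archive
second = extract_once(archive, target)  # finds the file and skips extraction
print(first == second, os.path.isfile(first[0]))
```

Because only the first listed file is checked, a partially extracted directory would not be repaired by a second call; that is a property of the check itself, not something the sketch adds.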

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services