Posted to commits@tvm.apache.org by GitBox <gi...@apache.org> on 2022/04/06 21:05:51 UTC

[GitHub] [tvm] guberti opened a new pull request, #10921: Model training tutorial for microTVM

guberti opened a new pull request, #10921:
URL: https://github.com/apache/tvm/pull/10921

   This PR adds a new tutorial to the microTVM how_to gallery, showing how a MobileNet V1 model can be trained with transfer learning and modified to fit on embedded devices (in this case, the Arduino Nano 33 BLE).
   
   This tutorial was [originally designed as a Google Colab notebook](https://colab.research.google.com/drive/1JZJFJVM56N1C8DfGyKx6lJ-I0-nLa-Ar), but I'm in the process of modifying it so it works with the existing TVM tutorials. I'm also working on adding a button to let users open this tutorial in Google Colab (see #10706) - this new tutorial will also serve as a test bed for that feature.
   
   For now, the best way to view the tutorial is by going to https://colab.research.google.com/drive/1JZJFJVM56N1C8DfGyKx6lJ-I0-nLa-Ar.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r865205437


##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px
+#
+# Motivation
+# ----------
+# When building IoT devices, we often want them to **see and understand** the world around them.
+# This can take many forms, but often a device will want to know if a certain **kind of
+# object** is in its field of vision.
+#
+# For example, a security camera might look for **people**, so it can decide whether to save a video
+# to memory. A traffic light might look for **cars**, so it can judge which lights should change
+# first. Or a forest camera might look for a **kind of animal**, so researchers can estimate how
+# large the animal population is.
+#
+# To make these devices affordable, we would like them to need only a low-cost processor like the
+# `nRF52840 <https://www.nordicsemi.com/Products/nRF52840>`_ (costing five dollars each on Mouser) or the `RP2040 <https://www.raspberrypi.com/products/rp2040/>`_ (just $1.45 each!).
+#
+# These devices have very little memory (~250 KB RAM), meaning that no conventional edge AI
+# vision model (like MobileNet or EfficientNet) will be able to run. In this tutorial, we will
+# show how these models can be modified to work within this constraint. Then, we will use TVM
+# to compile and deploy one of them to an Arduino that uses one of these processors.
+#
+# Installing the Prerequisites
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+#
+# To run this tutorial, we will need Tensorflow and TFLite to train our model, pyserial and tlcpack
+# (a community build of TVM) to compile and test it, and imagemagick and curl to preprocess data.
+# We will also need to install the Arduino CLI and the mbed_nano package to test our model.
+#
+#     .. code-block:: bash
+#
+#       %%bash
+#       pip install -q tensorflow tflite pyserial
+#       pip install -q tlcpack-nightly -f https://tlcpack.ai/wheels
+#       apt-get -qq install imagemagick curl
+#
+#       # Install Arduino CLI and library for Nano 33 BLE
+#       curl -fsSL https://raw.githubusercontent.com/arduino/arduino-cli/master/install.sh | sh
+#       /content/bin/arduino-cli core update-index
+#       /content/bin/arduino-cli core install arduino:mbed_nano
+#
+# Using the GPU
+# ^^^^^^^^^^^^^
+#
+# This tutorial demonstrates training a neural network, which requires a lot of computing power
+# and will go much faster if you have a GPU. If you are viewing this tutorial on Google Colab, you
+# can enable a GPU by going to **Runtime->Change runtime type** and selecting "GPU" as the hardware
+# accelerator. If you are running locally, you can `follow Tensorflow's guide <https://www.tensorflow.org/guide/gpu>`_ instead.
+#
+# We can test our GPU installation with the following code:
+
+import tensorflow as tf
+
+if not tf.test.gpu_device_name():
+    print("No GPU was detected!")
+    print("Model training will take much longer (~30 minutes instead of ~5)")
+else:
+    print("GPU detected - you're good to go.")
+
+######################################################################
+# Choosing Our Work Dir
+# ^^^^^^^^^^^^^^^^^^^^^
+# We need to pick a directory where our image datasets, trained model, and eventual Arduino sketch
+# will all live. If running on Google Colab, we'll save everything in ``/root`` (aka ``~``) but you'll
+# probably want to store it elsewhere if running locally. Note that this variable only affects Python
+# scripts - you'll have to adjust the Bash commands too.
+
+import os
+
+FOLDER = "/root"
+# sphinx_gallery_start_ignore
+import tempfile
+
+FOLDER = tempfile.mkdtemp()
+# sphinx_gallery_end_ignore
+
+######################################################################
+# Downloading the Data
+# --------------------
+# Convolutional neural networks usually learn by looking at many images, along with labels telling
+# the network what those images are. To get these images, we'll need a publicly available dataset
+# with thousands of images of all sorts of objects and labels of what's in each image. We'll also
+# need a bunch of images that **aren't** of cars, as we're trying to distinguish these two classes.
+#
+# In this tutorial, we'll create a model to detect if an image contains a **car**, but you can use
+# whatever category you like! Just change the source URL below to one containing images of another
+# type of object.
+#
+# To get our car images, we'll be downloading the `Stanford Cars dataset <http://ai.stanford.edu/~jkrause/cars/car_dataset.html>`_,
+# which contains 16,185 full-color images of cars. We'll also need images of random things that
+# aren't cars, so we'll use the `COCO 2017 <https://cocodataset.org/#home>`_ validation set (it's
+# smaller, and thus faster to download, than the full training set; training on the full dataset
+# would yield better results). Note that there are some cars in the COCO 2017 dataset, but it's
+# a small enough fraction not to matter - just keep in mind that this will drive down our perceived
+# accuracy slightly.
+#
+# We could use the Tensorflow dataloader utilities, but we'll instead do it manually to make sure
+# it's easy to change the datasets being used. We'll end up with the following file hierarchy:
+#
+#     .. code-block::
+#
+#         /root
+#         ├── images
+#         │   ├── target
+#         │   │   ├── 000001.jpg
+#         │   │   │ ...
+#         │   │   └── 016185.jpg
+#         │   ├── target.tgz
+#         │   ├── random
+#         │   │   ├── 000000000139.jpg
+#         │   │   │ ...
+#         │   │   └── 000000581781.jpg
+#         │   └── random.zip
+#
+# We should also note that the Stanford Cars training set has ~8k images, while the COCO 2017
+# validation set has 5k images - it is not a 50/50 split! If we wanted to, we could weight these
+# classes differently during training to correct for this, but training will still work if we
+# ignore it. It should take about **2 minutes** to download the Stanford Cars images, while the
+# COCO 2017 validation set will take about **1 minute**.
+
+import os
+import shutil
+import urllib.request
+
+# Download datasets
+os.makedirs(f"{FOLDER}/images")
+urllib.request.urlretrieve(
+    "http://ai.stanford.edu/~jkrause/car196/cars_train.tgz", f"{FOLDER}/images/target.tgz"
+)
+urllib.request.urlretrieve(
+    "http://images.cocodataset.org/zips/val2017.zip", f"{FOLDER}/images/random.zip"
+)
+
+# Extract them and rename their folders
+shutil.unpack_archive(f"{FOLDER}/images/target.tgz", f"{FOLDER}/images")
+shutil.unpack_archive(f"{FOLDER}/images/random.zip", f"{FOLDER}/images")
+shutil.move(f"{FOLDER}/images/cars_train", f"{FOLDER}/images/target")
+shutil.move(f"{FOLDER}/images/val2017", f"{FOLDER}/images/random")
+
+######################################################################
+# Loading the Data
+# ----------------
+# Currently, our data is stored on-disk as JPG files of various sizes. To train with it, we'll have
+# to load the images into memory, resize them to be 64x64, and convert them to raw, uncompressed
+# data. Keras's ``image_dataset_from_directory`` will take care of most of this, though it loads
+# images such that each pixel value is a float from 0 to 255.
+#
+# We'll also need to load labels, though Keras will help with this. From our subdirectory structure,
+# it knows the images in ``/target`` are one class, and those in ``/random`` another. Setting
+# ``label_mode='categorical'`` tells Keras to convert these into **categorical labels** - a 2x1 vector
+# that's either ``[1, 0]`` for an object of our target class, or ``[0, 1]`` for anything else.
+# We'll also set ``shuffle=True`` to randomize the order of our examples.
+#
+# We will also **batch** the data - grouping samples into clumps to make our training go faster.
+# A ``batch_size`` of 32 is a decent choice.
+#
+# Lastly, in machine learning we generally want our inputs to be small numbers. We'll thus use a
+# ``Rescaling`` layer to change our images such that each pixel is a float between ``0.0`` and ``1.0``,
+# instead of ``0`` to ``255``. We need to be careful not to rescale our categorical labels though, so
+# we'll use a ``lambda`` function.
+
+IMAGE_SIZE = (64, 64, 3)
+unscaled_dataset = tf.keras.utils.image_dataset_from_directory(
+    f"{FOLDER}/images",
+    batch_size=32,
+    shuffle=True,
+    label_mode="categorical",
+    image_size=IMAGE_SIZE[0:2],
+)
+rescale = tf.keras.layers.Rescaling(scale=1.0 / 255)
+full_dataset = unscaled_dataset.map(lambda im, lbl: (rescale(im), lbl))
+
+######################################################################
+# What's Inside Our Dataset?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Before giving this data set to our neural network, we ought to give it a quick visual inspection.
+# Does the data look properly transformed? Do the labels seem appropriate? And what's our ratio of
+# objects to other stuff? We can display some examples from our datasets using ``matplotlib``:
+
+import matplotlib.pyplot as plt
+
+num_target_class = len(os.listdir(f"{FOLDER}/images/target/"))
+num_random_class = len(os.listdir(f"{FOLDER}/images/random/"))
+print(f"{FOLDER}/images/target contains {num_target_class} images")
+print(f"{FOLDER}/images/random contains {num_random_class} images")
+
+# Show some samples and their labels
+SAMPLES_TO_SHOW = 10
+plt.figure(figsize=(20, 10))
+for i, (image, label) in enumerate(unscaled_dataset.unbatch()):
+    if i >= SAMPLES_TO_SHOW:
+        break
+    ax = plt.subplot(1, SAMPLES_TO_SHOW, i + 1)
+    plt.imshow(image.numpy().astype("uint8"))
+    plt.title(list(label.numpy()))
+    plt.axis("off")
+
+######################################################################
+# Validating our Accuracy
+# ^^^^^^^^^^^^^^^^^^^^^^^
+# While developing our model, we'll often want to check how accurate it is (e.g. to see if it
+# improves during training). How do we do this? We could just train it on *all* of the data, and
+# then ask it to classify that same data. However, our model could cheat by just memorizing all of
+# the samples, which would make it *appear* to have very high accuracy, but perform very badly in
+# reality. In practice, this "memorizing" is called **overfitting**.
+#
+# To prevent this, we will set aside some of the data (we'll use 20%) as a **validation set**. Our
+# model will never be trained on validation data - we'll only use it to check our model's accuracy.
+
+num_batches = len(full_dataset)
+train_dataset = full_dataset.take(int(num_batches * 0.8))
+validation_dataset = full_dataset.skip(len(train_dataset))
+
+######################################################################
+# Choosing Our Model
+# ------------------
+# In the past decade, `convolutional neural networks <https://en.wikipedia.org/wiki/Convolutional_neural_network>`_ have been widely
+# adopted for image classification tasks. State-of-the-art models like `EfficientNet V2 <https://arxiv.org/abs/2104.00298>`_ are able
+# to perform image classification better than even humans! Unfortunately, these models have tens of
+# millions of parameters, and thus won't fit on cheap security camera computers.
+#
+# Our applications generally don't need perfect accuracy - 90% is good enough. We can thus use the
+# older and smaller MobileNet V1 architecture. But this *still* won't be small enough - by default,
+# MobileNet V1 with 224x224 inputs and depth 1.0 takes ~50 MB just to **store**. To reduce the size
+# of the model, there are three knobs we can turn. First, we can reduce the size of the input images
+# from 224x224 to 96x96 or 64x64, and Keras makes it easy to do this. We can also reduce the **depth**

Review Comment:
   Fixed! I also now refer to it correctly as "alpha".





[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r865211999


##########
tests/scripts/ci.py:
##########
@@ -267,7 +267,7 @@ def docs(
             "tlcpack-sphinx-addon==0.2.1",
             "synr==0.5.0",
             "image==1.5.33",
-            "sphinx-gallery==0.4.0",
+            "git+https://github.com/guberti/sphinx-gallery.git@ipynb-include-bash",

Review Comment:
   No, this is based on the latest sphinx-gallery version. You're right that this could potentially cause problems, but I haven't run into any. We should definitely verify nothing breaks, and if there is an issue we can discuss backporting.





[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r865195926


##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px

Review Comment:
   Yea, 600px is pretty big. Fixed.





[GitHub] [tvm] guberti commented on pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on PR #10921:
URL: https://github.com/apache/tvm/pull/10921#issuecomment-1117785901

   > thanks @guberti, few questions
   
   Thanks for taking a look! Addressed your comments and merged `main` into this branch to fix the tests.




[GitHub] [tvm] areusch commented on pull request #10921: [docs] Google Colab compatible microTVM model training tutorial

Posted by GitBox <gi...@apache.org>.
areusch commented on PR #10921:
URL: https://github.com/apache/tvm/pull/10921#issuecomment-1115225788

   @guberti could you try pushing an empty commit? i think retriggering somehow didn't pick up the right Jenkinsfile changes.




[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r883956625


##########
apps/microtvm/pyproject.toml:
##########
@@ -129,7 +129,7 @@ importer-tflite = ["tflite", "tensorflow", "tensorflow-estimator"]
 autodocsumm = "^0.1"
 black = "^19.10b0"
 sphinx = "^3.0"
-sphinx-gallery = "^0.8"
+sphinx-gallery = { git = "https://github.com/sphinx-gallery/sphinx-gallery.git", branch = "master" }

Review Comment:
   Agreed, fixed.





[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r864420869


##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb

Review Comment:
   Nope! It looks like a hash, but is not. It does not change.





[GitHub] [tvm] areusch commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
areusch commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r883730195


##########
apps/microtvm/pyproject.toml:
##########
@@ -129,7 +129,7 @@ importer-tflite = ["tflite", "tensorflow", "tensorflow-estimator"]
 autodocsumm = "^0.1"
 black = "^19.10b0"
 sphinx = "^3.0"
-sphinx-gallery = "^0.8"
+sphinx-gallery = { git = "https://github.com/sphinx-gallery/sphinx-gallery.git", branch = "master" }

Review Comment:
   Let's at least pin to a deterministic revision so we don't have to handle churn in Sphinx-gallery in our ci rebuild process



##########
tests/scripts/task_python_docs.sh:
##########
@@ -84,6 +84,7 @@ IGNORED_WARNINGS=(
     'autotvm:Cannot find config for target=llvm -keys=cpu -link-params=0'
     'autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.'
     'autotvm:Cannot find config for target=cuda -keys=cuda,gpu'
+    'absl:For model inputs containing unsupported operations'

Review Comment:
   Can you link me to an example log line?





[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r883958414


##########
tests/scripts/task_python_docs.sh:
##########
@@ -84,6 +84,7 @@ IGNORED_WARNINGS=(
     'autotvm:Cannot find config for target=llvm -keys=cpu -link-params=0'
     'autotvm:One or more operators have not been tuned. Please tune your model for better performance. Use DEBUG logging level to see more details.'
     'autotvm:Cannot find config for target=cuda -keys=cuda,gpu'
+    'absl:For model inputs containing unsupported operations'

Review Comment:
   The full error is:
   ```
   WARNING:absl:For model inputs containing unsupported operations which cannot be quantized, the `inference_input_type` attribute will default to the original type.
   ```
   This warning also occurs in the official TensorFlow tutorial - https://www.tensorflow.org/lite/performance/post_training_integer_quant.





[GitHub] [tvm] guberti commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
guberti commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r865206005


##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px
+#
+# Motivation
+# ----------
+# When building IoT devices, we often want them to **see and understand** the world around them.
+# This can take many forms, but often a device will want to know if a certain **kind of
+# object** is in its field of vision.
+#
+# For example, a security camera might look for **people**, so it can decide whether to save a video
+# to memory. A traffic light might look for **cars**, so it can judge which lights should change
+# first. Or a forest camera might look for a **kind of animal**, so researchers can estimate how
+# large the animal population is.
+#
+# To make these devices affordable, we would like them to need only a low-cost processor like the
+# `nRF52840 <https://www.nordicsemi.com/Products/nRF52840>`_ (costing five dollars each on Mouser) or the `RP2040 <https://www.raspberrypi.com/products/rp2040/>`_ (just $1.45 each!).
+#
+# These devices have very little memory (~250 KB RAM), meaning that no conventional edge AI
+# vision model (like MobileNet or EfficientNet) will be able to run. In this tutorial, we will
+# show how these models can be modified to work within this constraint. Then, we will use TVM
+# to compile and deploy one of them to an Arduino that uses one of these processors.
+#
+# Installing the Prerequisites
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+#
+# To run this tutorial, we will need Tensorflow and TFLite to train our model, pyserial and tlcpack
+# (a community build of TVM) to compile and test it, and imagemagick and curl to preprocess data.
+# We will also need to install the Arduino CLI and the mbed_nano package to test our model.
+#
+#     .. code-block:: bash
+#
+#       %%bash
+#       pip install -q tensorflow tflite pyserial
+#       pip install -q tlcpack-nightly -f https://tlcpack.ai/wheels
+#       apt-get -qq install imagemagick curl
+#
+#       # Install Arduino CLI and library for Nano 33 BLE
+#       curl -fsSL https://raw.githubusercontent.com/arduino/arduino-cli/master/install.sh | sh
+#       /content/bin/arduino-cli core update-index
+#       /content/bin/arduino-cli core install arduino:mbed_nano
+#
+# Using the GPU
+# ^^^^^^^^^^^^^
+#
+# This tutorial demonstrates training a neural network, which requires a lot of computing power
+# and will go much faster if you have a GPU. If you are viewing this tutorial on Google Colab, you
+# can enable a GPU by going to **Runtime->Change runtime type** and selecting "GPU" as the hardware
+# accelerator. If you are running locally, you can `follow Tensorflow's guide <https://www.tensorflow.org/guide/gpu>`_ instead.
+#
+# We can test our GPU installation with the following code:
+
+import tensorflow as tf
+
+if not tf.test.gpu_device_name():
+    print("No GPU was detected!")
+    print("Model training will take much longer (~30 minutes instead of ~5)")
+else:
+    print("GPU detected - you're good to go.")
+
+######################################################################
+# Choosing Our Work Dir
+# ^^^^^^^^^^^^^^^^^^^^^
+# We need to pick a directory where our image datasets, trained model, and eventual Arduino sketch
+# will all live. If running on Google Colab, we'll save everything in ``/root`` (aka ``~``) but you'll
+# probably want to store it elsewhere if running locally. Note that this variable only affects Python
+# scripts - you'll have to adjust the Bash commands too.
+
+import os
+
+FOLDER = "/root"
+# sphinx_gallery_start_ignore
+import tempfile
+
+FOLDER = tempfile.mkdtemp()
+# sphinx_gallery_end_ignore
+
+######################################################################
+# Downloading the Data
+# --------------------
+# Convolutional neural networks usually learn by looking at many images, along with labels telling
+# the network what those images are. To get these images, we'll need a publicly available dataset
+# with thousands of images of all sorts of objects and labels of what's in each image. We'll also
+# need a bunch of images that **aren't** of cars, as we're trying to distinguish these two classes.
+#
+# In this tutorial, we'll create a model to detect if an image contains a **car**, but you can use
+# whatever category you like! Just change the source URL below to one containing images of another
+# type of object.
+#
+# To get our car images, we'll be downloading the `Stanford Cars dataset <http://ai.stanford.edu/~jkrause/cars/car_dataset.html>`_,
+# which contains 16,185 full-color images of cars. We'll also need images of random things that
+# aren't cars, so we'll use the `COCO 2017 <https://cocodataset.org/#home>`_ validation set (it's
+# smaller, and thus faster to download, than the full training set; training on the full dataset
+# would yield better results). Note that there are some cars in the COCO 2017 dataset, but it's
+# a small enough fraction not to matter - just keep in mind that this will drive down our perceived
+# accuracy slightly.
+#
+# We could use the Tensorflow dataloader utilities, but we'll instead do it manually to make sure
+# it's easy to change the datasets being used. We'll end up with the following file hierarchy:
+#
+#     .. code-block::
+#
+#         /root
+#         ├── images
+#         │   ├── target
+#         │   │   ├── 000001.jpg
+#         │   │   │ ...
+#         │   │   └── 016185.jpg
+#         │   ├── target.tgz
+#         │   ├── random
+#         │   │   ├── 000000000139.jpg
+#         │   │   │ ...
+#         │   │   └── 000000581781.jpg
+#         │   └── random.zip
+#
+# We should also note that the Stanford Cars training set has ~8k images, while the COCO 2017
+# validation set has 5k images - it is not a 50/50 split! If we wanted to, we could weight these
+# classes differently during training to correct for this, but training will still work if we
+# ignore it. It should take about **2 minutes** to download the Stanford Cars images, while the
+# COCO 2017 validation set will take about **1 minute**.
+
+import os
+import shutil
+import urllib.request
+
+# Download datasets
+os.makedirs(f"{FOLDER}/images")
+urllib.request.urlretrieve(
+    "http://ai.stanford.edu/~jkrause/car196/cars_train.tgz", f"{FOLDER}/images/target.tgz"
+)
+urllib.request.urlretrieve(
+    "http://images.cocodataset.org/zips/val2017.zip", f"{FOLDER}/images/random.zip"
+)
+
+# Extract them and rename their folders
+shutil.unpack_archive(f"{FOLDER}/images/target.tgz", f"{FOLDER}/images")
+shutil.unpack_archive(f"{FOLDER}/images/random.zip", f"{FOLDER}/images")
+shutil.move(f"{FOLDER}/images/cars_train", f"{FOLDER}/images/target")
+shutil.move(f"{FOLDER}/images/val2017", f"{FOLDER}/images/random")
+
+######################################################################
+# Loading the Data
+# ----------------
+# Currently, our data is stored on-disk as JPG files of various sizes. To train with it, we'll have
+# to load the images into memory, resize them to be 64x64, and convert them to raw, uncompressed
+# data. Keras's ``image_dataset_from_directory`` will take care of most of this, though it loads
+# images such that each pixel value is a float from 0 to 255.
+#
+# We'll also need to load labels, though Keras will help with this. From our subdirectory structure,
+# it knows the images in ``/target`` are one class, and those in ``/random`` another. Setting
+# ``label_mode='categorical'`` tells Keras to convert these into **categorical labels** - a 2x1 vector
+# that's either ``[1, 0]`` for an object of our target class, or ``[0, 1]`` for anything else.
+# We'll also set ``shuffle=True`` to randomize the order of our examples.
+#
+# We will also **batch** the data - grouping samples into clumps to make our training go faster.
+# A ``batch_size`` of 32 is a decent choice.
+#
+# Lastly, in machine learning we generally want our inputs to be small numbers. We'll thus use a
+# ``Rescaling`` layer to change our images such that each pixel is a float between ``0.0`` and ``1.0``,
+# instead of ``0`` to ``255``. We need to be careful not to rescale our categorical labels though, so
+# we'll use a ``lambda`` function.
+
+IMAGE_SIZE = (64, 64, 3)
+unscaled_dataset = tf.keras.utils.image_dataset_from_directory(
+    f"{FOLDER}/images",
+    batch_size=32,
+    shuffle=True,
+    label_mode="categorical",
+    image_size=IMAGE_SIZE[0:2],
+)
+rescale = tf.keras.layers.Rescaling(scale=1.0 / 255)
+full_dataset = unscaled_dataset.map(lambda im, lbl: (rescale(im), lbl))
+
+######################################################################
+# What's Inside Our Dataset?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Before giving this data set to our neural network, we ought to give it a quick visual inspection.
+# Does the data look properly transformed? Do the labels seem appropriate? And what's our ratio of
+# objects to other stuff? We can display some examples from our datasets using ``matplotlib``:
+
+import matplotlib.pyplot as plt
+
+num_target_class = len(os.listdir(f"{FOLDER}/images/target/"))
+num_random_class = len(os.listdir(f"{FOLDER}/images/random/"))
+print(f"{FOLDER}/images/target contains {num_target_class} images")
+print(f"{FOLDER}/images/random contains {num_random_class} images")
+
+# Show some samples and their labels
+SAMPLES_TO_SHOW = 10
+plt.figure(figsize=(20, 10))
+for i, (image, label) in enumerate(unscaled_dataset.unbatch()):
+    if i >= SAMPLES_TO_SHOW:
+        break
+    ax = plt.subplot(1, SAMPLES_TO_SHOW, i + 1)
+    plt.imshow(image.numpy().astype("uint8"))
+    plt.title(list(label.numpy()))
+    plt.axis("off")
+
+######################################################################
+# Validating our Accuracy
+# ^^^^^^^^^^^^^^^^^^^^^^^
+# While developing our model, we'll often want to check how accurate it is (e.g. to see if it
+# improves during training). How do we do this? We could just train it on *all* of the data, and
+# then ask it to classify that same data. However, our model could cheat by just memorizing all of
+# the samples, which would make it *appear* to have very high accuracy, but perform very badly in
+# reality. In practice, this "memorizing" is called **overfitting**.
+#
+# To prevent this, we will set aside some of the data (we'll use 20%) as a **validation set**. Our
+# model will never be trained on validation data - we'll only use it to check our model's accuracy.
+
+num_batches = len(full_dataset)
+train_dataset = full_dataset.take(int(num_batches * 0.8))
+validation_dataset = full_dataset.skip(len(train_dataset))
+
+######################################################################
+# Choosing Our Model
+# ------------------
+# In the past decade, `convolutional neural networks <https://en.wikipedia.org/wiki/Convolutional_neural_network>`_ have been widely
+# adopted for image classification tasks. State-of-the-art models like `EfficientNet V2 <https://arxiv.org/abs/2104.00298>`_ are able
+# to perform image classification better than even humans! Unfortunately, these models have tens of
+# millions of parameters, and thus won't fit on cheap security camera computers.
+#
+# Our applications generally don't need perfect accuracy - 90% is good enough. We can thus use the
+# older and smaller MobileNet V1 architecture. But this *still* won't be small enough - by default,
+# MobileNet V1 with 224x224 inputs and depth 1.0 takes ~50 MB just to **store**. To reduce the size
+# of the model, there are three knobs we can turn. First, we can reduce the size of the input images
+# from 224x224 to 96x96 or 64x64, and Keras makes it easy to do this. We can also reduce the **depth**
+# of the model, from 1.0 to 0.25. And if we were really strapped for space, we could reduce the
+# number of **channels** by making our model take grayscale images instead of RGB ones.
+#
+# In this tutorial, we will use an RGB 64x64 input image and 0.25 depth scale. This is not quite
+# ideal, but it allows the finished model to fit in 192 KB of RAM, while still letting us perform
+# transfer learning using the official Tensorflow source models (if we used depth scale <0.25 or
+# a grayscale input, we wouldn't be able to do this).
+#
+# What is Transfer Learning?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Deep learning has `dominated image classification <https://paperswithcode.com/sota/image-classification-on-imagenet>`_ for a long time,
+# but training neural networks takes a lot of time. When a neural network is trained "from scratch",
+# its parameters start out randomly initialized, forcing it to learn very slowly how to tell images
+# apart.
+#
+# With transfer learning, we instead start with a neural network that's **already** good at a
+# specific task. In this example, that task is classifying images from `the ImageNet database <https://www.image-net.org/>`_. This
+# means the network already has some object detection capabilities, and is likely closer to what you
+# want than a random model would be.
+#
+# This works especially well with image processing neural networks like MobileNet. In practice, it
+# turns out the convolutional layers of the model (i.e. the first 90% of the layers) are used for
+# identifying low-level features like lines and shapes - only the last few fully connected layers
+# are used to determine how those shapes make up the objects the network is trying to detect.
+#
+# We can take advantage of this by starting training with a MobileNet model that was trained on
+# ImageNet, and already knows how to identify those lines and shapes. We can then just remove the
+# last few layers from this pretrained model, and add our own final layers. We'll then train this
+# combined model for a few epochs on our cars vs non-cars dataset, to fine-tune the first layers
+# and train the last layers from scratch.
+#
+# Source MobileNets for transfer learning have been `pretrained by the Tensorflow folks <https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md>`_, so we
+# can just download the one closest to what we want (the 128x128 input model with 0.25 depth scale).
+
+os.makedirs(f"{FOLDER}/models")
+WEIGHTS_PATH = f"{FOLDER}/models/mobilenet_2_5_128_tf.h5"
+urllib.request.urlretrieve(
+    "https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_2_5_128_tf.h5",
+    WEIGHTS_PATH,
+)
+
+pretrained = tf.keras.applications.MobileNet(
+    input_shape=IMAGE_SIZE, weights=WEIGHTS_PATH, alpha=0.25
+)
+
+######################################################################
+# Modifying Our Network
+# ^^^^^^^^^^^^^^^^^^^^^
+# As mentioned above, our pretrained model is designed to classify the 1,000 ImageNet categories,
+# but we want to convert it to classify cars. Since only the bottom few layers are task-specific,
+# we'll **cut off the last five layers** of our original model. In their place we'll build our own
+# "tail" to the model by performing respape, dropout, flatten, and softmax operations.
+
+model = tf.keras.models.Sequential()
+
+model.add(tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE))
+model.add(tf.keras.Model(inputs=pretrained.inputs, outputs=pretrained.layers[-5].output))
+
+model.add(tf.keras.layers.Reshape((-1,)))
+model.add(tf.keras.layers.Dropout(0.1))
+model.add(tf.keras.layers.Flatten())
+model.add(tf.keras.layers.Dense(2, activation="softmax"))
+
+######################################################################
+# Training Our Network
+# ^^^^^^^^^^^^^^^^^^^^
+# When training neural networks, we must set a parameter called the **learning rate** that controls
+# how fast our network learns. It must be set carefully - too slow, and our network will take
+# forever to train; too fast, and our network won't be able to learn some fine details. Generally
+# for Adam (the optimizer we're using), ``0.001`` is a pretty good learning rate (and is what's
+# recommended in the `original paper <https://arxiv.org/abs/1412.6980>`_). However, in this case
+# ``0.0005`` seems to work a little better.
+#
+# We'll also pass the validation set from earlier to ``model.fit``. This will evaluate our model on
+# the validation data after each training epoch, letting us track its improvement. Once training is
+# finished, the model should have a validation accuracy around ``0.98`` (meaning it was right 98% of
+# the time on our validation set).
+
+model.compile(
+    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
+    loss="categorical_crossentropy",
+    metrics=["accuracy"],
+)
+model.fit(train_dataset, validation_data=validation_dataset, epochs=3, verbose=2)
+
+######################################################################
+# Quantization
+# ------------
+# We've done a decent job of reducing our model's size so far - changing the input dimension,
+# along with removing the bottom layers reduced the model to just 219k parameters. However, each of
+# these parameters is a ``float32`` that takes four bytes, so our model will take up almost one MB!
+#
+# Additionally, it might be the case that our hardware doesn't have built-in support for floating
+# point numbers. While most high-memory Arduinos (like the Nano 33 BLE) do have hardware support,
+# some others (like the Arduino Due) do not. On any boards *without* dedicated hardware support,
+# floating point multiplication will be extremely slow.
+#
+# To address both issues we will **quantize** the model - representing the weights as eight bit
+# integers. It's more complex than just rounding, though - to get the best performance, TensorFlow
+# tracks how each neuron in our model activates, so we can figure out how to best represent them
+# while staying relatively faithful to the original model.
+#
+# We will help TensorFlow do this by creating a representative dataset - a subset of the original
+# that is used for tracking how those neurons activate. We'll then pass this into a ``TFLiteConverter``
+# (Keras itself does not have quantization support) with an ``Optimize`` flag to tell TFLite to perform
+# the conversion. By default, TFLite keeps the inputs and outputs of our model as floats, so we must
+# explicitly tell it to avoid this behavior.
+
+converter = tf.lite.TFLiteConverter.from_keras_model(model)
+
+
+def representative_dataset():
+    for image_batch, label_batch in full_dataset.take(10):
+        yield [image_batch]
+
+
+converter.optimizations = [tf.lite.Optimize.DEFAULT]
+converter.representative_dataset = representative_dataset
+converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
+converter.inference_input_type = tf.uint8
+converter.inference_output_type = tf.uint8
+
+quantized_model = converter.convert()
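+
+# Optional sanity check (not part of the original tutorial): converter.convert()
+# returns the serialized flatbuffer as bytes, so we can print its size - it should
+# be a small fraction of the ~1 MB float32 model.
+print(f"Quantized model is {len(quantized_model)} bytes")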
+
+######################################################################
+# Download the Model if Desired
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+# We've now got a finished model that you can use locally or in other tutorials (try autotuning
+# this model or viewing it on `https://netron.app/ <https://netron.app/>`_). But before we do
+# those things, we'll have to write it to a file (``quantized.tflite``). If you're running this
+# tutorial on Google Colab, you'll have to uncomment the last two lines to download the file
+# after writing it.
+
+QUANTIZED_MODEL_PATH = f"{FOLDER}/models/quantized.tflite"
+with open(QUANTIZED_MODEL_PATH, "wb") as f:
+    f.write(quantized_model)
+# from google.colab import files
+# files.download(QUANTIZED_MODEL_PATH)
+
+######################################################################
+# Compiling With TVM For Arduino
+# ------------------------------
+# TensorFlow has a built-in framework for deploying to microcontrollers - `TFLite Micro <https://www.tensorflow.org/lite/microcontrollers>`_. However,
+# it's poorly supported by development boards, and does not support autotuning. We will use Apache
+# TVM instead.
+#
+# TVM can be used either with its command line interface (``tvmc``) or with its Python interface. The
+# Python interface is fully-featured and more stable, so we'll use it here.
+#
+# TVM is an optimizing compiler, and optimizations to our model are performed in stages via
+# **intermediate representations**. The first of these is `Relay <https://arxiv.org/abs/1810.00952>`_, a high-level intermediate
+# representation emphasizing portability. The conversion from ``.tflite`` to Relay is done without any
+# knowledge of our "end goal" - the fact we intend to run this model on an Arduino.
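+#
+# As a rough sketch of that first step (the input name and dtype below are assumptions
+# for illustration - they must match whatever your quantized ``.tflite`` model declares):
+#
+#     .. code-block:: python
+#
+#       import tflite
+#       from tvm import relay
+#
+#       # Parse the flatbuffer we produced during quantization
+#       tflite_model = tflite.Model.GetRootAsModel(quantized_model, 0)
+#
+#       # Convert to Relay, telling TVM the input name, shape, and dtype
+#       mod, params = relay.frontend.from_tflite(
+#           tflite_model,
+#           shape_dict={"input_1": (1, 64, 64, 3)},
+#           dtype_dict={"input_1": "uint8"},
+#       )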
+#
+# Choosing an Arduino Board
+# ^^^^^^^^^^^^^^^^^^^^^^^^^
+# Next, we'll have to decide exactly which Arduino board to use. The Arduino sketch that we
+# ultimately generate should be compatible with any board, but knowing which board we are using in
+# advance allows TVM to adjust its compilation strategy to get better performance.
+#
+# There is one catch - we need enough **memory** (flash and RAM) to be able to run our model. We
+# won't ever be able to run a complex vision model like a MobileNet on an Arduino Uno - that board
+# only has 2 kB of RAM and 32 kB of flash! Our model has ~200,000 parameters, so there is just no
+# way it could fit.
+#
+# For this tutorial, we will use the Nano 33 BLE, which has 1 MB of flash memory and 256 KB of RAM.
+# However, any other Arduino with those specs or better should also work.
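+#
+# As a sketch of how this choice feeds into compilation (the identifiers below are
+# assumptions for illustration - check the microTVM docs for the exact options):
+#
+#     .. code-block:: python
+#
+#       import tvm
+#
+#       # The Nano 33 BLE is built around the nRF52840 SoC; telling TVM this lets it
+#       # tailor code generation to that Cortex-M4 core.
+#       TARGET = tvm.target.target.micro("nrf52840")
+#       BOARD = "nano33ble"  # used later when generating the Arduino project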
+#
+# Generating our project
+# ^^^^^^^^^^^^^^^^^^^^^^
+# Next, we'll compile the model to TVM's MLF (machine learning format) intermediate representation,

Review Comment:
   Fixed





[GitHub] [tvm] areusch merged pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
areusch merged PR #10921:
URL: https://github.com/apache/tvm/pull/10921






[GitHub] [tvm] guberti commented on pull request #10921: Model training tutorial for microTVM

Posted by GitBox <gi...@apache.org>.
guberti commented on PR #10921:
URL: https://github.com/apache/tvm/pull/10921#issuecomment-1099903109

   # How does "Open in Colab" work?
   
   Google Colab has a feature where `.ipynb` files can be loaded directly from GitHub via a URL. It just so happens that TVM's doc files are already hosted at https://github.com/apache/tvm-site/tree/asf-site, so we only need to point Google Colab to that. Since this tutorial is not yet merged, the "Open in Colab" button is just a link to the following address:
   
   https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
   
   # What had to be changed to get this working?
   
   In theory, nothing about Sphinx Gallery would have to be changed to make this work. However, in practice the generation of `.ipynb` files is extremely buggy, so the output Python notebooks do not work anyway. I made two pull requests to add features to Sphinx Gallery to make this possible:
   
   - https://github.com/sphinx-gallery/sphinx-gallery/pull/940
   - https://github.com/sphinx-gallery/sphinx-gallery/pull/941
   
   If these features are merged into `sphinx-gallery`, then we only need to update the version of `sphinx-gallery` to make opening tutorials in Colab possible.




[GitHub] [tvm] driazati commented on pull request #10921: [WIP] Proof of concept for Google Colab compatible tutorials

Posted by GitBox <gi...@apache.org>.
driazati commented on PR #10921:
URL: https://github.com/apache/tvm/pull/10921#issuecomment-1100261970

   This is fantastic! Colab will be great to have in the docs. If the sphinx-gallery PRs don’t get merged in a timely manner we can apply your patches and build it from source in the Docker images so we don’t need to wait




[GitHub] [tvm] areusch commented on pull request #10921: [docs] Google Colab compatible microTVM model training tutorial

Posted by GitBox <gi...@apache.org>.
areusch commented on PR #10921:
URL: https://github.com/apache/tvm/pull/10921#issuecomment-1113775853

   retriggered now that https://github.com/apache/tvm/pull/11164 has landed




[GitHub] [tvm] areusch commented on a diff in pull request #10921: [docs] microTVM model training tutorial with Colab support

Posted by GitBox <gi...@apache.org>.
areusch commented on code in PR #10921:
URL: https://github.com/apache/tvm/pull/10921#discussion_r863213004


##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb

Review Comment:
   does this URL need to get updated each time the tutorial is changed?



##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px
+#
+# Motivation
+# ----------
+# When building IoT devices, we often want them to **see and understand** the world around them.
+# This can take many forms, but often a device will want to know if a certain **kind of
+# object** is in its field of vision.
+#
+# For example, a security camera might look for **people**, so it can decide whether to save a video
+# to memory. A traffic light might look for **cars**, so it can judge which lights should change
+# first. Or a forest camera might look for a **kind of animal**, so it can help estimate how large
+# the animal population is.
+#
+# To make these devices affordable, we would like them to need only a low-cost processor like the
+# `nRF52840 <https://www.nordicsemi.com/Products/nRF52840>`_ (costing five dollars each on Mouser) or the `RP2040 <https://www.raspberrypi.com/products/rp2040/>`_ (just $1.45 each!).
+#
+# These devices have very little memory (~250 KB RAM), meaning that no conventional edge AI
+# vision model (like MobileNet or EfficientNet) will be able to run. In this tutorial, we will
+# show how these models can be modified to work around this requirement. Then, we will use TVM
+# to compile and deploy the modified model for an Arduino that uses one of these processors.
+#
+# Installing the Prerequisites
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+#
+# To run this tutorial, we will need Tensorflow and TFLite to train our model, pyserial and tlcpack
+# (a community build of TVM) to compile and test it, and imagemagick and curl to preprocess data.
+# We will also need to install the Arduino CLI and the mbed_nano package to test our model.
+#
+#     .. code-block:: bash
+#
+#       %%bash
+#       pip install -q tensorflow tflite pyserial
+#       pip install -q tlcpack-nightly -f https://tlcpack.ai/wheels
+#       apt-get -qq install imagemagick curl
+#
+#       # Install Arduino CLI and library for Nano 33 BLE
+#       curl -fsSL https://raw.githubusercontent.com/arduino/arduino-cli/master/install.sh | sh
+#       /content/bin/arduino-cli core update-index
+#       /content/bin/arduino-cli core install arduino:mbed_nano
+#
+# Using the GPU
+# ^^^^^^^^^^^^^
+#
+# This tutorial demonstrates training a neural network, which requires a lot of computing power
+# and will go much faster if you have a GPU. If you are viewing this tutorial on Google Colab, you
+# can enable a GPU by going to **Runtime->Change runtime type** and selecting "GPU" as the hardware
+# accelerator. If you are running locally, you can `follow Tensorflow's guide <https://www.tensorflow.org/guide/gpu>`_ instead.
+#
+# We can test our GPU installation with the following code:
+
+import tensorflow as tf
+
+if not tf.test.gpu_device_name():
+    print("No GPU was detected!")
+    print("Model training will take much longer (~30 minutes instead of ~5)")
+else:
+    print("GPU detected - you're good to go.")
+
+######################################################################
+# Choosing Our Work Dir
+# ^^^^^^^^^^^^^^^^^^^^^
+# We need to pick a directory where our image datasets, trained model, and eventual Arduino sketch
+# will all live. If running on Google Colab, we'll save everything in ``/root`` (aka ``~``) but you'll
+# probably want to store it elsewhere if running locally. Note that this variable only affects Python
+# scripts - you'll have to adjust the Bash commands too.
+
+import os
+
+FOLDER = "/root"
+# sphinx_gallery_start_ignore
+import tempfile
+
+FOLDER = tempfile.mkdtemp()
+# sphinx_gallery_end_ignore
+
+######################################################################
+# Downloading the Data
+# --------------------
+# Convolutional neural networks usually learn by looking at many images, along with labels telling
+# the network what those images are. To get these images, we'll need a publicly available dataset
+# with thousands of images of all sorts of objects and labels of what's in each image. We'll also
+# need a bunch of images that **aren't** of cars, as we're trying to distinguish these two classes.
+#
+# In this tutorial, we'll create a model to detect if an image contains a **car**, but you can use
+# whatever category you like! Just change the source URL below to one containing images of another
+# type of object.
+#
+# To get our car images, we'll be downloading the `Stanford Cars dataset <http://ai.stanford.edu/~jkrause/cars/car_dataset.html>`_,
+# which contains 16,185 full color images of cars. We'll also need images of random things that
+# aren't cars, so we'll use the `COCO 2017 <https://cocodataset.org/#home>`_ validation set (it's
+# smaller, and thus faster to download, than the full training set, though training on the full
+# set would yield better results). Note that there are some cars in the COCO 2017 data set, but it's
+# a small enough fraction not to matter - just keep in mind that this will drive down our perceived
+# accuracy slightly.
+#
+# We could use the Tensorflow dataloader utilities, but we'll instead do it manually to make sure
+# it's easy to change the datasets being used. We'll end up with the following file hierarchy:
+#
+#     .. code-block::
+#
+#         /root
+#         ├── images
+#         │   ├── target
+#         │   │   ├── 000001.jpg
+#         │   │   │ ...
+#         │   │   └── 016185.jpg
+#         │   ├── target.tgz
+#         │   ├── random
+#         │   │   ├── 000000000139.jpg
+#         │   │   │ ...
+#         │   │   └── 000000581781.jpg
+#         │   └── random.zip
+#
+# We should also note that the Stanford Cars training set has ~8k images, while the COCO 2017
+# validation set has 5k images - it is not a 50/50 split! If we wanted to, we could weight these
+# classes differently
+# during training to correct for this, but training will still work if we ignore it. It should
+# take about **2 minutes** to download the Stanford Cars, while COCO 2017 validation will take
+# **1 minute**.
+
+import os
+import shutil
+import urllib.request
+
+# Download datasets
+os.makedirs(f"{FOLDER}/images")
+urllib.request.urlretrieve(
+    "http://ai.stanford.edu/~jkrause/car196/cars_train.tgz", f"{FOLDER}/images/target.tgz"
+)
+urllib.request.urlretrieve(
+    "http://images.cocodataset.org/zips/val2017.zip", f"{FOLDER}/images/random.zip"
+)
+
+# Extract them and rename their folders
+shutil.unpack_archive(f"{FOLDER}/images/target.tgz", f"{FOLDER}/images")
+shutil.unpack_archive(f"{FOLDER}/images/random.zip", f"{FOLDER}/images")
+shutil.move(f"{FOLDER}/images/cars_train", f"{FOLDER}/images/target")
+shutil.move(f"{FOLDER}/images/val2017", f"{FOLDER}/images/random")
+
+######################################################################
+# Loading the Data
+# ----------------
+# Currently, our data is stored on-disk as JPG files of various sizes. To train with it, we'll have
+# to load the images into memory, resize them to be 64x64, and convert them to raw, uncompressed
+# data. Keras's ``image_dataset_from_directory`` will take care of most of this, though it loads
+# images such that each pixel value is a float from 0 to 255.
+#
+# We'll also need to load labels, though Keras will help with this. From our subdirectory structure,
+# it knows the images in ``/target`` are one class, and those in ``/random`` another. Setting
+# ``label_mode='categorical'`` tells Keras to convert these into **categorical labels** - a 2x1 vector
+# that's either ``[1, 0]`` for an object of our target class, or ``[0, 1]`` for anything else.
+# We'll also set ``shuffle=True`` to randomize the order of our examples.
+#
+# We will also **batch** the data - grouping samples into clumps to make our training go faster.
+# Setting ``batch_size = 32`` is a decent number.
+#
+# Lastly, in machine learning we generally want our inputs to be small numbers. We'll thus use a
+# ``Rescaling`` layer to change our images such that each pixel is a float between ``0.0`` and ``1.0``,
+# instead of ``0`` to ``255``. We need to be careful not to rescale our categorical labels though, so
+# we'll use a ``lambda`` function.
+
+IMAGE_SIZE = (64, 64, 3)
+unscaled_dataset = tf.keras.utils.image_dataset_from_directory(
+    f"{FOLDER}/images",
+    batch_size=32,
+    shuffle=True,
+    label_mode="categorical",
+    image_size=IMAGE_SIZE[0:2],
+)
+rescale = tf.keras.layers.Rescaling(scale=1.0 / 255)
+full_dataset = unscaled_dataset.map(lambda im, lbl: (rescale(im), lbl))
+
+######################################################################
+# What's Inside Our Dataset?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Before giving this data set to our neural network, we ought to give it a quick visual inspection.
+# Does the data look properly transformed? Do the labels seem appropriate? And what's our ratio of
+# objects to other stuff? We can display some examples from our datasets using ``matplotlib``:
+
+import matplotlib.pyplot as plt
+
+num_target_class = len(os.listdir(f"{FOLDER}/images/target/"))
+num_random_class = len(os.listdir(f"{FOLDER}/images/random/"))
+print(f"{FOLDER}/images/target contains {num_target_class} images")
+print(f"{FOLDER}/images/random contains {num_random_class} images")
+
+# Show some samples and their labels
+SAMPLES_TO_SHOW = 10
+plt.figure(figsize=(20, 10))
+for i, (image, label) in enumerate(unscaled_dataset.unbatch()):
+    if i >= SAMPLES_TO_SHOW:
+        break
+    ax = plt.subplot(1, SAMPLES_TO_SHOW, i + 1)
+    plt.imshow(image.numpy().astype("uint8"))
+    plt.title(list(label.numpy()))
+    plt.axis("off")
+
+######################################################################
+# Validating our Accuracy
+# ^^^^^^^^^^^^^^^^^^^^^^^
+# While developing our model, we'll often want to check how accurate it is (e.g. to see if it
+# improves during training). How do we do this? We could just train it on *all* of the data, and
+# then ask it to classify that same data. However, our model could cheat by just memorizing all of
+# the samples, which would make it *appear* to have very high accuracy, but perform very badly in
+# reality. In practice, this "memorizing" is called **overfitting**.
+#
+# To prevent this, we will set aside some of the data (we'll use 20%) as a **validation set**. Our
+# model will never be trained on validation data - we'll only use it to check our model's accuracy.
+
+num_batches = len(full_dataset)
+train_dataset = full_dataset.take(int(num_batches * 0.8))
+validation_dataset = full_dataset.skip(len(train_dataset))
+
+######################################################################
+# Choosing Our Model
+# ------------------
+# In the past decade, `convolutional neural networks <https://en.wikipedia.org/wiki/Convolutional_neural_network>`_ have been widely
+# adopted for image classification tasks. State-of-the-art models like `EfficientNet V2 <https://arxiv.org/abs/2104.00298>`_ are able
+# to perform image classification better than even humans! Unfortunately, these models have tens of
+# millions of parameters, and thus won't fit on cheap security camera computers.
+#
+# Our applications generally don't need perfect accuracy - 90% is good enough. We can thus use the
+# older and smaller MobileNet V1 architecture. But this *still* won't be small enough - by default,
+# MobileNet V1 with 224x224 inputs and depth 1.0 takes ~50 MB to just **store**. To reduce the size
+# of the model, there are three knobs we can turn. First, we can reduce the size of the input images
+# from 224x224 to 96x96 or 64x64, and Keras makes it easy to do this. We can also reduce the **depth**
+# of the model, from 1.0 to 0.25. And if we were really strapped for space, we could reduce the
+# number of **channels** by making our model take grayscale images instead of RGB ones.
+#
+# In this tutorial, we will use an RGB 64x64 input image and 0.25 depth scale. This is not quite
+# ideal, but it allows the finished model to fit in 192 KB of RAM, while still letting us perform
+# transfer learning using the official Tensorflow source models (if we used depth scale <0.25 or
+# a grayscale input, we wouldn't be able to do this).
+#
+# What is Transfer Learning?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Deep learning has `dominated image classification <https://paperswithcode.com/sota/image-classification-on-imagenet>`_ for a long time,
+# but training neural networks takes a lot of time. When a neural network is trained "from scratch",
+# its parameters start out randomly initialized, forcing it to learn very slowly how to tell images
+# apart.
+#
+# With transfer learning, we instead start with a neural network that's **already** good at a
+# specific task. In this example, that task is classifying images from `the ImageNet database <https://www.image-net.org/>`_. This
+# means the network already has some object detection capabilities, and is likely closer to what you
+# want than a random model would be.
+#
+# This works especially well with image processing neural networks like MobileNet. In practice, it
+# turns out the convolutional layers of the model (i.e. the first 90% of the layers) are used for
+# identifying low-level features like lines and shapes - only the last few fully connected layers
+# are used to determine how those shapes make up the objects the network is trying to detect.
+#
+# We can take advantage of this by starting training with a MobileNet model that was trained on
+# ImageNet, and already knows how to identify those lines and shapes. We can then just remove the
+# last few layers from this pretrained model, and add our own final layers. We'll then train this
+# conglomerate model for a few epochs on our cars vs non-cars dataset, to fine tune the first layers
+# and train from scratch the last layers.
+#
+# Source MobileNets for transfer learning have been `pretrained by the Tensorflow folks <https://github.com/tensorflow/models/blob/master/research/slim/nets/mobilenet_v1.md>`_, so we
+# can just download the one closest to what we want (the 128x128 input model with 0.25 depth scale).
+
+os.makedirs(f"{FOLDER}/models")
+WEIGHTS_PATH = f"{FOLDER}/models/mobilenet_2_5_128_tf.h5"
+urllib.request.urlretrieve(
+    "https://storage.googleapis.com/tensorflow/keras-applications/mobilenet/mobilenet_2_5_128_tf.h5",
+    WEIGHTS_PATH,
+)
+
+pretrained = tf.keras.applications.MobileNet(
+    input_shape=IMAGE_SIZE, weights=WEIGHTS_PATH, alpha=0.25
+)
+
+######################################################################
+# Modifying Our Network
+# ^^^^^^^^^^^^^^^^^^^^^
+# As mentioned above, our pretrained model is designed to classify the 1,000 ImageNet categories,
+# but we want to convert it to classify cars. Since only the bottom few layers are task-specific,
+# we'll **cut off the last five layers** of our original model. In their place we'll build our own
+# "tail" to the model by performing respape, dropout, flatten, and softmax operations.
+
+model = tf.keras.models.Sequential()
+
+model.add(tf.keras.layers.InputLayer(input_shape=IMAGE_SIZE))
+model.add(tf.keras.Model(inputs=pretrained.inputs, outputs=pretrained.layers[-5].output))
+
+model.add(tf.keras.layers.Reshape((-1,)))
+model.add(tf.keras.layers.Dropout(0.1))
+model.add(tf.keras.layers.Flatten())
+model.add(tf.keras.layers.Dense(2, activation="softmax"))
+
+######################################################################
+# Training Our Network
+# ^^^^^^^^^^^^^^^^^^^^
+# When training neural networks, we must set a parameter called the **learning rate** that controls
+# how fast our network learns. It must be set carefully - too low, and our network will take
+# forever to train; too high, and our network won't be able to learn some fine details. Generally
+# for Adam (the optimizer we're using), ``0.001`` is a pretty good learning rate (and is what's
+# recommended in the `original paper <https://arxiv.org/abs/1412.6980>`_). However, in this case
+# ``0.0005`` seems to work a little better.
+#
+# We'll also pass the validation set from earlier to ``model.fit``. This will evaluate how good our
+# model is each time we train it, and let us track how our model is improving. Once training is
+# finished, the model should have a validation accuracy around ``0.98`` (meaning it was right 98% of
+# the time on our validation set).
+
+model.compile(
+    optimizer=tf.keras.optimizers.Adam(learning_rate=0.0005),
+    loss="categorical_crossentropy",
+    metrics=["accuracy"],
+)
+model.fit(train_dataset, validation_data=validation_dataset, epochs=3, verbose=2)
+
+######################################################################
+# Quantization
+# ------------
+# We've done a decent job of reducing our model's size so far - changing the input dimension,
+# along with removing the bottom layers reduced the model to just 219k parameters. However, each of
+# these parameters is a ``float32`` that takes four bytes, so our model will take up almost one MB!
+#
+# Additionally, it might be the case that our hardware doesn't have built-in support for floating
+# point numbers. While most high-memory Arduinos (like the Nano 33 BLE) do have hardware support,
+# some others (like the Arduino Due) do not. On any boards *without* dedicated hardware support,
+# floating point multiplication will be extremely slow.
+#
+# To address both issues we will **quantize** the model - representing the weights as eight bit
+# integers. It's more complex than just rounding, though - to get the best performance, TensorFlow
+# tracks how each neuron in our model activates, so we can figure out how best to represent them
+# while staying relatively faithful to the original model.
+#
+# We will help TensorFlow do this by creating a representative dataset - a subset of the original
+# that is used for tracking how those neurons activate. We'll then pass this into a ``TFLiteConverter``
+# (Keras itself does not have quantization support) with an ``Optimize`` flag to tell TFLite to perform
+# the conversion. By default, TFLite keeps the inputs and outputs of our model as floats, so we must
+# explicitly tell it to avoid this behavior.
+
+converter = tf.lite.TFLiteConverter.from_keras_model(model)
+
+
+def representative_dataset():
+    for image_batch, label_batch in full_dataset.take(10):
+        yield [image_batch]
+
+
+converter.optimizations = [tf.lite.Optimize.DEFAULT]
+converter.representative_dataset = representative_dataset
+converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
+converter.inference_input_type = tf.uint8
+converter.inference_output_type = tf.uint8
+
+quantized_model = converter.convert()
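+
+# Optionally, we can check how much smaller quantization made our model. converter.convert()
+# returns a bytes object, so its length is the model's size on disk - it should now be roughly
+# a quarter of the float32 model's size.
+print(f"Quantized model is {len(quantized_model) / 1024:.1f} KB")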
+
+######################################################################
+# Download the Model if Desired
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+# We've now got a finished model that you can use locally or in other tutorials (try autotuning
+# this model or viewing it on `https://netron.app/ <https://netron.app/>`_). But before we do
+# those things, we'll have to write it to a file (``quantized.tflite``). If you're running this
+# tutorial on Google Colab, you'll have to uncomment the last two lines to download the file
+# after writing it.
+
+QUANTIZED_MODEL_PATH = f"{FOLDER}/models/quantized.tflite"
+with open(QUANTIZED_MODEL_PATH, "wb") as f:
+    f.write(quantized_model)
+# from google.colab import files
+# files.download(QUANTIZED_MODEL_PATH)
+
+######################################################################
+# Compiling With TVM For Arduino
+# ------------------------------
+# Tensorflow has a built-in framework for deploying to microcontrollers - `TFLite Micro <https://www.tensorflow.org/lite/microcontrollers>`_. However,
+# it's poorly supported by development boards, and does not support autotuning. We will use Apache
+# TVM instead.
+#
+# TVM can be used either with its command line interface (``tvmc``) or with its Python interface. The
+# Python interface is fully-featured and more stable, so we'll use it here.
+#
+# TVM is an optimizing compiler, and optimizations to our model are performed in stages via
+# **intermediate representations**. The first of these is `Relay <https://arxiv.org/abs/1810.00952>`_, a high-level intermediate
+# representation emphasizing portability. The conversion from ``.tflite`` to Relay is done without any
+# knowledge of our "end goal" - the fact we intend to run this model on an Arduino.
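+#
+# As a rough sketch (the input tensor name, shapes, and variable names below are assumptions for
+# illustration, not values taken from this model), importing the quantized ``.tflite`` flatbuffer
+# into Relay might look something like this:
+#
+#     .. code-block:: python
+#
+#       import tflite
+#       from tvm import relay
+#
+#       # Parse the flatbuffer produced by the TFLiteConverter above
+#       tfl_model = tflite.Model.GetRootAsModel(quantized_model, 0)
+#
+#       # Convert to Relay; "input_1" is a placeholder - use your model's actual input name
+#       mod, params = relay.frontend.from_tflite(
+#           tfl_model,
+#           shape_dict={"input_1": (1, 64, 64, 3)},
+#           dtype_dict={"input_1": "uint8"},
+#       )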
+#
+# Choosing an Arduino Board
+# ^^^^^^^^^^^^^^^^^^^^^^^^^
+# Next, we'll have to decide exactly which Arduino board to use. The Arduino sketch that we
+# ultimately generate should be compatible with any board, but knowing which board we are using in
+# advance allows TVM to adjust its compilation strategy to get better performance.
+#
+# There is one catch - we need enough **memory** (flash and RAM) to be able to run our model. We
+# won't ever be able to run a complex vision model like a MobileNet on an Arduino Uno - that board
+# only has 2 kB of RAM and 32 kB of flash! Our model has ~200,000 parameters, so there is just no
+# way it could fit.
+#
+# For this tutorial, we will use the Nano 33 BLE, which has 1 MB of flash memory and 256 KB of RAM.
+# However, any other Arduino with those specs or better should also work.
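+#
+# As a hedged sketch (exact API names can vary between TVM versions), the board choice is
+# typically expressed as a microTVM compilation target for the board's microcontroller - for
+# the Nano 33 BLE, that is the nRF52840:
+#
+#     .. code-block:: python
+#
+#       import tvm
+#
+#       # Ask TVM for a compilation target tuned for the nRF52840 microcontroller
+#       TARGET = tvm.target.target.micro("nrf52840")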
+#
+# Generating our project
+# ^^^^^^^^^^^^^^^^^^^^^^
+# Next, we'll compile the model to TVM's MLF (machine learning format) intermediate representation,

Review Comment:
   Model Library Format



##########
tests/scripts/ci.py:
##########
@@ -267,7 +267,7 @@ def docs(
             "tlcpack-sphinx-addon==0.2.1",
             "synr==0.5.0",
             "image==1.5.33",
-            "sphinx-gallery==0.4.0",
+            "git+https://github.com/guberti/sphinx-gallery.git@ipynb-include-bash",

Review Comment:
   should we update docker/install scripts too if we're going to go this route? also, is this based off 0.4.0? may need to backport or verify it won't break anything to update.



##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px

Review Comment:
   should we consider shrinking this button? i don't want it to be inconspicuous, but it's pretty big right now.



##########
gallery/how_to/work_with_microtvm/micro_train.py:
##########
@@ -0,0 +1,638 @@
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+"""
+.. _microtvm-train-arduino:
+
+Training Vision Models for microTVM on Arduino
+==============================================
+**Author**: `Gavin Uberti <https://github.com/guberti>`_
+
+This tutorial shows how MobileNetV1 models can be trained
+to fit on embedded devices, and how those models can be
+deployed to Arduino using TVM.
+"""
+
+######################################################################
+# .. note::
+#
+#   This tutorial is best viewed as a Jupyter Notebook. You can download and run it locally
+#   using the link at the bottom of this page, or open it online for free using Google Colab.
+#   Click the icon below to open in Google Colab.
+#
+# .. image:: https://raw.githubusercontent.com/guberti/web-data/micro-train-tutorial-data/images/utilities/colab_button.png
+#      :align: center
+#      :target: https://colab.research.google.com/github/guberti/tvm-site/blob/asf-site/docs/_downloads/a7c7ea4b5017ae70db1f51dd8e6dcd82/micro_train.ipynb
+#      :width: 600px
+#
+# Motivation
+# ----------
+# When building IoT devices, we often want them to **see and understand** the world around them.
+# This can take many forms, but often a device will want to know if a certain **kind of
+# object** is in its field of vision.
+#
+# For example, a security camera might look for **people**, so it can decide whether to save a video
+# to memory. A traffic light might look for **cars**, so it can judge which lights should change
+# first. Or a forest camera might look for a **kind of animal**, so it can help estimate how large
+# the animal population is.
+#
+# To make these devices affordable, we would like them to need only a low-cost processor like the
+# `nRF52840 <https://www.nordicsemi.com/Products/nRF52840>`_ (costing five dollars each on Mouser) or the `RP2040 <https://www.raspberrypi.com/products/rp2040/>`_ (just $1.45 each!).
+#
+# These devices have very little memory (~250 KB RAM), meaning that no conventional edge AI
+# vision model (like MobileNet or EfficientNet) will be able to run. In this tutorial, we will
+# show how these models can be modified to work around this requirement. Then, we will use TVM
+# to compile and deploy the modified model for an Arduino that uses one of these processors.
+#
+# Installing the Prerequisites
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+#
+# To run this tutorial, we will need Tensorflow and TFLite to train our model, pyserial and tlcpack
+# (a community build of TVM) to compile and test it, and imagemagick and curl to preprocess data.
+# We will also need to install the Arduino CLI and the mbed_nano package to test our model.
+#
+#     .. code-block:: bash
+#
+#       %%bash
+#       pip install -q tensorflow tflite pyserial
+#       pip install -q tlcpack-nightly -f https://tlcpack.ai/wheels
+#       apt-get -qq install imagemagick curl
+#
+#       # Install Arduino CLI and library for Nano 33 BLE
+#       curl -fsSL https://raw.githubusercontent.com/arduino/arduino-cli/master/install.sh | sh
+#       /content/bin/arduino-cli core update-index
+#       /content/bin/arduino-cli core install arduino:mbed_nano
+#
+# Using the GPU
+# ^^^^^^^^^^^^^
+#
+# This tutorial demonstrates training a neural network, which requires a lot of computing power
+# and will go much faster if you have a GPU. If you are viewing this tutorial on Google Colab, you
+# can enable a GPU by going to **Runtime->Change runtime type** and selecting "GPU" as the hardware
+# accelerator. If you are running locally, you can `follow Tensorflow's guide <https://www.tensorflow.org/guide/gpu>`_ instead.
+#
+# We can test our GPU installation with the following code:
+
+import tensorflow as tf
+
+if not tf.test.gpu_device_name():
+    print("No GPU was detected!")
+    print("Model training will take much longer (~30 minutes instead of ~5)")
+else:
+    print("GPU detected - you're good to go.")
+
+######################################################################
+# Choosing Our Work Dir
+# ^^^^^^^^^^^^^^^^^^^^^
+# We need to pick a directory where our image datasets, trained model, and eventual Arduino sketch
+# will all live. If running on Google Colab, we'll save everything in ``/root`` (aka ``~``) but you'll
+# probably want to store it elsewhere if running locally. Note that this variable only affects Python
+# scripts - you'll have to adjust the Bash commands too.
+
+import os
+
+FOLDER = "/root"
+# sphinx_gallery_start_ignore
+import tempfile
+
+FOLDER = tempfile.mkdtemp()
+# sphinx_gallery_end_ignore
+
+######################################################################
+# Downloading the Data
+# --------------------
+# Convolutional neural networks usually learn by looking at many images, along with labels telling
+# the network what those images are. To get these images, we'll need a publicly available dataset
+# with thousands of images of all sorts of objects and labels of what's in each image. We'll also
+# need a bunch of images that **aren't** of cars, as we're trying to distinguish these two classes.
+#
+# In this tutorial, we'll create a model to detect if an image contains a **car**, but you can use
+# whatever category you like! Just change the source URL below to one containing images of another
+# type of object.
+#
+# To get our car images, we'll be downloading the `Stanford Cars dataset <http://ai.stanford.edu/~jkrause/cars/car_dataset.html>`_,
+# which contains 16,185 full color images of cars. We'll also need images of random things that
+# aren't cars, so we'll use the `COCO 2017 <https://cocodataset.org/#home>`_ validation set (it's
+# smaller, and thus faster to download, than the full training set, though training on the full
+# set would yield better results). Note that there are some cars in the COCO 2017 data set, but it's
+# a small enough fraction not to matter - just keep in mind that this will drive down our perceived
+# accuracy slightly.
+#
+# We could use the Tensorflow dataloader utilities, but we'll instead do it manually to make sure
+# it's easy to change the datasets being used. We'll end up with the following file hierarchy:
+#
+#     .. code-block::
+#
+#         /root
+#         ├── images
+#         │   ├── target
+#         │   │   ├── 000001.jpg
+#         │   │   │ ...
+#         │   │   └── 016185.jpg
+#         │   ├── target.tgz
+#         │   ├── random
+#         │   │   ├── 000000000139.jpg
+#         │   │   │ ...
+#         │   │   └── 000000581781.jpg
+#         │   └── random.zip
+#
+# We should also note that the Stanford Cars training set has ~8k images, while the COCO 2017
+# validation set has 5k images - it is not a 50/50 split! If we wanted to, we could weight these
+# classes differently
+# during training to correct for this, but training will still work if we ignore it. It should
+# take about **2 minutes** to download the Stanford Cars, while COCO 2017 validation will take
+# **1 minute**.
+
+import os
+import shutil
+import urllib.request
+
+# Download datasets
+os.makedirs(f"{FOLDER}/images")
+urllib.request.urlretrieve(
+    "http://ai.stanford.edu/~jkrause/car196/cars_train.tgz", f"{FOLDER}/images/target.tgz"
+)
+urllib.request.urlretrieve(
+    "http://images.cocodataset.org/zips/val2017.zip", f"{FOLDER}/images/random.zip"
+)
+
+# Extract them and rename their folders
+shutil.unpack_archive(f"{FOLDER}/images/target.tgz", f"{FOLDER}/images")
+shutil.unpack_archive(f"{FOLDER}/images/random.zip", f"{FOLDER}/images")
+shutil.move(f"{FOLDER}/images/cars_train", f"{FOLDER}/images/target")
+shutil.move(f"{FOLDER}/images/val2017", f"{FOLDER}/images/random")
+
+######################################################################
+# Loading the Data
+# ----------------
+# Currently, our data is stored on-disk as JPG files of various sizes. To train with it, we'll have
+# to load the images into memory, resize them to be 64x64, and convert them to raw, uncompressed
+# data. Keras's ``image_dataset_from_directory`` will take care of most of this, though it loads
+# images such that each pixel value is a float from 0 to 255.
+#
+# We'll also need to load labels, though Keras will help with this. From our subdirectory structure,
+# it knows the images in ``/target`` are one class, and those in ``/random`` another. Setting
+# ``label_mode='categorical'`` tells Keras to convert these into **categorical labels** - a 2x1 vector
+# that's either ``[1, 0]`` for an object of our target class, or ``[0, 1]`` for anything else.
+# We'll also set ``shuffle=True`` to randomize the order of our examples.
+#
+# We will also **batch** the data - grouping samples into clumps to make our training go faster.
+# Setting ``batch_size = 32`` is a decent number.
+#
+# Lastly, in machine learning we generally want our inputs to be small numbers. We'll thus use a
+# ``Rescaling`` layer to change our images such that each pixel is a float between ``0.0`` and ``1.0``,
+# instead of ``0`` to ``255``. We need to be careful not to rescale our categorical labels though, so
+# we'll use a ``lambda`` function.
+
+IMAGE_SIZE = (64, 64, 3)
+unscaled_dataset = tf.keras.utils.image_dataset_from_directory(
+    f"{FOLDER}/images",
+    batch_size=32,
+    shuffle=True,
+    label_mode="categorical",
+    image_size=IMAGE_SIZE[0:2],
+)
+rescale = tf.keras.layers.Rescaling(scale=1.0 / 255)
+full_dataset = unscaled_dataset.map(lambda im, lbl: (rescale(im), lbl))
+
+######################################################################
+# What's Inside Our Dataset?
+# ^^^^^^^^^^^^^^^^^^^^^^^^^^
+# Before giving this data set to our neural network, we ought to give it a quick visual inspection.
+# Does the data look properly transformed? Do the labels seem appropriate? And what's our ratio of
+# objects to other stuff? We can display some examples from our datasets using ``matplotlib``:
+
+import matplotlib.pyplot as plt
+
+num_target_class = len(os.listdir(f"{FOLDER}/images/target/"))
+num_random_class = len(os.listdir(f"{FOLDER}/images/random/"))
+print(f"{FOLDER}/images/target contains {num_target_class} images")
+print(f"{FOLDER}/images/random contains {num_random_class} images")
+
+# Show some samples and their labels
+SAMPLES_TO_SHOW = 10
+plt.figure(figsize=(20, 10))
+for i, (image, label) in enumerate(unscaled_dataset.unbatch()):
+    if i >= SAMPLES_TO_SHOW:
+        break
+    ax = plt.subplot(1, SAMPLES_TO_SHOW, i + 1)
+    plt.imshow(image.numpy().astype("uint8"))
+    plt.title(list(label.numpy()))
+    plt.axis("off")
+
+######################################################################
+# Validating our Accuracy
+# ^^^^^^^^^^^^^^^^^^^^^^^
+# While developing our model, we'll often want to check how accurate it is (e.g. to see if it
+# improves during training). How do we do this? We could just train it on *all* of the data, and
+# then ask it to classify that same data. However, our model could cheat by just memorizing all of
+# the samples, which would make it *appear* to have very high accuracy, but perform very badly in
+# reality. In practice, this "memorizing" is called **overfitting**.
+#
+# To prevent this, we will set aside some of the data (we'll use 20%) as a **validation set**. Our
+# model will never be trained on validation data - we'll only use it to check our model's accuracy.
+
+num_batches = len(full_dataset)
+train_dataset = full_dataset.take(int(num_batches * 0.8))
+validation_dataset = full_dataset.skip(len(train_dataset))
+
+######################################################################
+# Choosing Our Model
+# ------------------
+# In the past decade, `convolutional neural networks <https://en.wikipedia.org/wiki/Convolutional_neural_network>`_ have been widely
+# adopted for image classification tasks. State-of-the-art models like `EfficientNet V2 <https://arxiv.org/abs/2104.00298>`_ are able
+# to perform image classification better than even humans! Unfortunately, these models have tens of
+# millions of parameters, and thus won't fit on cheap security camera computers.
+#
+# Our applications generally don't need perfect accuracy - 90% is good enough. We can thus use the
+# older and smaller MobileNet V1 architecture. But this *still* won't be small enough - by default,
+# MobileNet V1 with 224x224 inputs and depth 1.0 takes ~50 MB to just **store**. To reduce the size
+# of the model, there are three knobs we can turn. First, we can reduce the size of the input images
+# from 224x224 to 96x96 or 64x64, and Keras makes it easy to do this. We can also reduce the **depth**

Review Comment:
   perhaps elaborate on "depth" here (could just link somewhere)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@tvm.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org