Posted to commits@druid.apache.org by vo...@apache.org on 2023/05/22 21:29:44 UTC

[druid] branch 26.0.0 updated: Docs: Tutorial for streaming ingestion using Kafka + Docker file to use with Jupyter tutorials (#13984) (#14289)

This is an automated email from the ASF dual-hosted git repository.

vogievetsky pushed a commit to branch 26.0.0
in repository https://gitbox.apache.org/repos/asf/druid.git


The following commit(s) were added to refs/heads/26.0.0 by this push:
     new 855e576e87 Docs: Tutorial for streaming ingestion using Kafka + Docker file to use with Jupyter tutorials (#13984) (#14289)
855e576e87 is described below

commit 855e576e87096e08c4c85a7fc53584a7801e402b
Author: Victoria Lim <vt...@users.noreply.github.com>
AuthorDate: Mon May 22 14:29:37 2023 -0700

    Docs: Tutorial for streaming ingestion using Kafka + Docker file to use with Jupyter tutorials (#13984) (#14289)
---
 .gitignore                                         |   3 +-
 docs/tutorials/tutorial-jupyter-docker.md          | 201 ++++++
 docs/tutorials/tutorial-jupyter-index.md           |  67 +-
 .../jupyter-notebooks/0-START-HERE.ipynb           |  25 +-
 examples/quickstart/jupyter-notebooks/Dockerfile   |  65 ++
 examples/quickstart/jupyter-notebooks/README.md    |  74 +-
 .../jupyter-notebooks/docker-jupyter/README.md     |  60 ++
 .../docker-jupyter/docker-compose-local.yaml       | 172 +++++
 .../docker-jupyter/docker-compose.yaml             | 170 +++++
 .../jupyter-notebooks/docker-jupyter/environment   |  56 ++
 .../docker-jupyter/kafka_docker_config.json        |  90 +++
 .../docker-jupyter/tutorial-jupyter-docker.zip     | Bin 0 -> 2939 bytes
 .../jupyter-notebooks/kafka-tutorial.ipynb         | 782 +++++++++++++++++++++
 website/sidebars.json                              |   1 +
 14 files changed, 1635 insertions(+), 131 deletions(-)

diff --git a/.gitignore b/.gitignore
index 31b2f9dd1e..a60eb68173 100644
--- a/.gitignore
+++ b/.gitignore
@@ -33,9 +33,10 @@ integration-tests/gen-scripts/
 **/.ipython/
 **/.jupyter/
 **/.local/
+**/druidapi.egg-info/
+examples/quickstart/jupyter-notebooks/docker-jupyter/notebooks
 
 # ignore NetBeans IDE specific files
 nbproject
 nbactions.xml
 nb-configuration.xml
-
diff --git a/docs/tutorials/tutorial-jupyter-docker.md b/docs/tutorials/tutorial-jupyter-docker.md
new file mode 100644
index 0000000000..b5aa939db8
--- /dev/null
+++ b/docs/tutorials/tutorial-jupyter-docker.md
@@ -0,0 +1,201 @@
+---
+id: tutorial-jupyter-docker
+title: "Docker for Jupyter Notebook tutorials"
+sidebar_label: "Docker for tutorials"
+---
+
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+
+Apache Druid provides a custom Jupyter container that contains the prerequisites
+for all Jupyter-based Druid tutorials, as well as all of the tutorials themselves.
+You can run the Jupyter container, as well as containers for Druid and Apache Kafka,
+using the Docker Compose file provided in the Druid GitHub repository.
+
+You can run the following combination of applications:
+* [Jupyter only](#start-only-the-jupyter-container)
+* [Jupyter and Druid](#start-jupyter-and-druid)
+* [Jupyter, Druid, and Kafka](#start-jupyter-druid-and-kafka)
+
+## Prerequisites
+
+Jupyter in Docker requires that you have **Docker** and **Docker Compose**.
+We recommend installing these through [Docker Desktop](https://docs.docker.com/desktop/).
+
+## Launch the Docker containers
+
+You run Docker Compose to launch Jupyter and optionally Druid or Kafka.
+Docker Compose references the configuration in `docker-compose.yaml`.
+Running Druid in Docker also requires the `environment` file, which
+sets the configuration properties for the Druid services.
+To get started, download both `docker-compose.yaml` and `environment` from
+[`tutorial-jupyter-docker.zip`](https://github.com/apache/druid/blob/master/examples/quickstart/jupyter-notebooks/docker-jupyter/tutorial-jupyter-docker.zip).
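+
+For example, from a terminal (the raw-download URL below is an assumption based on the repository path above):
+
+```bash
+wget https://github.com/apache/druid/raw/master/examples/quickstart/jupyter-notebooks/docker-jupyter/tutorial-jupyter-docker.zip
+unzip tutorial-jupyter-docker.zip
+```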
+
+Alternatively, you can clone the [Apache Druid repo](https://github.com/apache/druid) and
+access the files in `druid/examples/quickstart/jupyter-notebooks/docker-jupyter`.
+
+### Start only the Jupyter container
+
+If you already have Druid running locally, you can run only the Jupyter container to complete the tutorials.
+In the same directory as `docker-compose.yaml`, start the application:
+
+```bash
+docker compose --profile jupyter up -d
+```
+
+The Docker Compose file assigns `8889` for the Jupyter port.
+You can override the port number by setting the `JUPYTER_PORT` environment variable before starting the Docker application.
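+For example, assuming the default Compose file, the following serves Jupyter on port `8890` instead:
+
+```bash
+JUPYTER_PORT=8890 docker compose --profile jupyter up -d
+```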
+
+### Start Jupyter and Druid
+
+Running Druid in Docker requires the `environment` file as well as an environment variable named `DRUID_VERSION`,
+which determines the version of Druid to use. The Druid version references the Docker tag to pull from the
+[Apache Druid Docker Hub](https://hub.docker.com/r/apache/druid/tags).
+
+In the same directory as `docker-compose.yaml` and `environment`, start the application:
+
+```bash
+DRUID_VERSION={{DRUIDVERSION}} docker compose --profile druid-jupyter up -d
+```
+
+### Start Jupyter, Druid, and Kafka
+
+Running Druid in Docker requires the `environment` file as well as the `DRUID_VERSION` environment variable.
+
+In the same directory as `docker-compose.yaml` and `environment`, start the application:
+
+```bash
+DRUID_VERSION={{DRUIDVERSION}} docker compose --profile all-services up -d
+```
+
+### Update image from Docker Hub
+
+If you already have a local cache of the Jupyter image, you can update the image before running the application using the following command:
+
+```bash
+docker compose pull jupyter
+```
+
+### Use locally built image
+
+The default Docker Compose file pulls the custom Jupyter Notebook image from a third-party Docker Hub repository.
+If you prefer to build the image locally from the official source, do the following:
+1. Clone the Apache Druid repository.
+2. Navigate to `examples/quickstart/jupyter-notebooks/docker-jupyter`.
+3. Start the services using `-f docker-compose-local.yaml` in the `docker compose` command. For example:
+
+```bash
+DRUID_VERSION={{DRUIDVERSION}} docker compose --profile all-services -f docker-compose-local.yaml up -d
+```
+
+## Access Jupyter-based tutorials
+
+The following steps show you how to access the Jupyter notebook tutorials from the Docker container.
+At startup, Docker creates and mounts a volume to persist data from the container to your local machine,
+so that you can save work you complete within the Docker container.
+
+1. Navigate to the notebooks at http://localhost:8889.
+   > If you set `JUPYTER_PORT` to another port number, replace `8889` with the value of the Jupyter port.
+
+2. Select a tutorial. If you don't plan to save your changes, you can use the notebook directly as is. Otherwise, continue to the next step.
+
+3. Optional: To save a local copy of your tutorial work,
+select **File > Save as...** from the navigation menu. Then enter `work/<notebook name>.ipynb`.
+If the notebook still displays as read only, you may need to refresh the page in your browser.
+Access the saved files in the `notebooks` folder in your local working directory.
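+
+   For example, you can list the saved notebooks from your host machine:
+
+   ```bash
+   ls ./notebooks
+   ```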
+
+## View the Druid web console
+
+To access the Druid web console in Docker, go to http://localhost:8888/unified-console.html.
+Use the web console to view datasources and ingestion tasks that you create in the tutorials.
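+
+To confirm the Router is reachable before opening the console, you can check its status endpoint:
+
+```bash
+curl http://localhost:8888/status
+```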
+
+## Stop Docker containers
+
+Shut down the Docker application using the following command. The `-v` flag also removes
+the named volumes declared in the Compose file, which deletes any data stored in them:
+
+```bash
+docker compose down -v
+```
+
+## Tutorial setup without using Docker
+
+To use the Jupyter Notebook-based tutorials without using Docker, do the following:
+
+1. Clone the Apache Druid repo, or download the [tutorials](tutorial-jupyter-index.md#tutorials)
+as well as the [Python client for Druid](tutorial-jupyter-index.md#python-api-for-druid).
+
+2. Install the prerequisite Python packages with the following commands:
+
+   ```bash
+   # Install requests
+   pip install requests
+   ```
+
+   ```bash
+   # Install JupyterLab
+   pip install jupyterlab
+   
+   # Install Jupyter Notebook
+   pip install notebook
+   ```
+
+   Individual notebooks may list additional packages you need to install to complete the tutorial.
+
+3. In your Druid source repo, install `druidapi` with the following commands:
+
+   ```bash
+   cd examples/quickstart/jupyter-notebooks/druidapi
+   pip install .
+   ```
+
+4. In the same directory as the tutorials, start Jupyter using either JupyterLab or Jupyter Notebook:
+   ```bash
+   # Start JupyterLab on port 3001
+   jupyter lab --port 3001
+
+   # Start Jupyter Notebook on port 3001
+   jupyter notebook --port 3001
+   ```
+
+5. Start Druid. You can use the [Quickstart (local)](./index.md) instance. The tutorials
+   assume that you are using the quickstart, so no authentication or authorization
+   is expected unless explicitly mentioned.
+
+   If you contribute to Druid and work with Druid integration tests, you can use a test cluster.
+   The following commands assume you have an environment variable, `DRUID_DEV`, that points to your Druid source repo.
+ 
+   ```bash
+   cd $DRUID_DEV
+   ./it.sh build
+   ./it.sh image
+   ./it.sh up <category>
+   ```
+ 
+   Replace `<category>` with one of the available integration test categories. See the integration
+   test `README.md` for details.
+
+You should now be able to access and complete the tutorials.
+
+## Learn more
+
+See the following topics for more information:
+* [Jupyter Notebook tutorials](tutorial-jupyter-index.md) for the available Jupyter Notebook-based tutorials for Druid
+* [Tutorial: Run with Docker](docker.md) for running Druid from a Docker container
+
diff --git a/docs/tutorials/tutorial-jupyter-index.md b/docs/tutorials/tutorial-jupyter-index.md
index d77e0d42b3..d7f401cae5 100644
--- a/docs/tutorials/tutorial-jupyter-index.md
+++ b/docs/tutorials/tutorial-jupyter-index.md
@@ -32,67 +32,34 @@ the Druid API to complete the tutorial.
 
 ## Prerequisites
 
-Make sure you meet the following requirements before starting the Jupyter-based tutorials:
+The simplest way to get started is to use Docker. In this case, you only need to set up Docker Desktop.
+For more information, see [Docker for Jupyter Notebook tutorials](tutorial-jupyter-docker.md).
 
-- Python 3.7 or later
-
-- The `requests` package for Python. For example, you can install it with the following command:
-
-   ```bash
-   pip3 install requests
-   ```
-
-- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid
-  and Jupyter both try to use port `8888`, so start Jupyter on a different port.
-
-
-  - Install JupyterLab or Notebook:
+Otherwise, you can install the prerequisites on your own. Here's what you need:
 
-    ```bash
-    # Install JupyterLab
-    pip3 install jupyterlab
-    # Install Jupyter Notebook
-    pip3 install notebook
-    ```
-  - Start Jupyter using either JupyterLab
-    ```bash
-    # Start JupyterLab on port 3001
-    jupyter lab --port 3001
-    ```
-
-    Or using Jupyter Notebook
-    ```bash
-    # Start Jupyter Notebook on port 3001
-    jupyter notebook --port 3001
-    ```
-
-- An available Druid instance. You can use the [Quickstart (local)](./index.md) instance. The tutorials
-  assume that you are using the quickstart, so no authentication or authorization
-  is expected unless explicitly mentioned.
-
-  If you contribute to Druid, and work with Druid integration tests, can use a test cluster.
-  Assume you have an environment variable, `DRUID_DEV`, which identifies your Druid source repo.
-
-  ```bash
-  cd $DRUID_DEV
-  ./it.sh build
-  ./it.sh image
-  ./it.sh up <category>
-  ```
+- An available Druid instance.
+- Python 3.7 or later
+- JupyterLab (recommended) or Jupyter Notebook running on a non-default port.
+By default, Druid and Jupyter both try to use port `8888`, so start Jupyter on a different port.
+- The `requests` Python package
+- The `druidapi` Python package
 
-  Replace `<category>` with one of the available integration test categories. See the integration
-  test `README.md` for details.
+For setup instructions, see [Tutorial setup without using Docker](tutorial-jupyter-docker.md#tutorial-setup-without-using-docker).
+Individual tutorials may require additional Python packages, such as for visualization or streaming ingestion.
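+
+For example, a notebook that visualizes data or streams from Kafka might need packages such as the following (this list mirrors what the tutorial Docker image installs):
+
+```bash
+pip install pandas seaborn bokeh kafka-python
+```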
 
-## Simple Druid API
+## Python API for Druid
 
+The `druidapi` Python package is a REST API client for Druid.
 One of the notebooks shows how to use the Druid REST API. The others focus on other
 topics and use a simple set of Python wrappers around the underlying REST API. The
 wrappers reside in the `druidapi` package within the notebooks directory. While the package
 can be used in any Python program, the key purpose, at present, is to support these
-notebooks. See the [Introduction to the Druid Python API]
-(https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/python-api-tutorial.ipynb)
+notebooks. See
+[Introduction to the Druid Python API](https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/python-api-tutorial.ipynb)
 for an overview of the Python API.
 
+The `druidapi` package is already installed in the custom Jupyter Docker container for Druid tutorials.
+
 ## Tutorials
 
 The notebooks are located in the [apache/druid repo](https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/). You can either clone the repo or download the notebooks you want individually.
diff --git a/examples/quickstart/jupyter-notebooks/0-START-HERE.ipynb b/examples/quickstart/jupyter-notebooks/0-START-HERE.ipynb
index fe4a30a551..5e74fa71c1 100644
--- a/examples/quickstart/jupyter-notebooks/0-START-HERE.ipynb
+++ b/examples/quickstart/jupyter-notebooks/0-START-HERE.ipynb
@@ -41,24 +41,27 @@
    "source": [
     "## Prerequisites\n",
     "\n",
-    "To get this far, you've installed Python 3 and Jupyter Notebook. Make sure you meet the following requirements before starting the Jupyter-based tutorials:\n",
-    "\n",
-    "- The `requests` package for Python. For example, you can install it with the following command:\n",
-    "\n",
-    "   ```bash\n",
-    "   pip install requests\n",
-    "   ````\n",
-    "\n",
-    "- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid\n",
-    "  and Jupyter both try to use port `8888`, so start Jupyter on a different port.\n",
+    "Before starting the Jupyter-based tutorials, make sure you meet the requirements listed in this section.\n",
+    "The simplest way to get started is to use Docker. In this case, you only need to set up Docker Desktop.\n",
+    "For more information, see [Docker for Jupyter Notebook tutorials](https://druid.apache.org/docs/latest/tutorials/tutorial-jupyter-docker.html).\n",
     "\n",
+    "Otherwise, you need the following:\n",
     "- An available Druid instance. You can use the local quickstart configuration\n",
     "  described in [Quickstart](https://druid.apache.org/docs/latest/tutorials/index.html).\n",
     "  The tutorials assume that you are using the quickstart, so no authentication or authorization\n",
     "  is expected unless explicitly mentioned.\n",
+    "- Python 3.7 or later\n",
+    "- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid\n",
+    "  and Jupyter both try to use port `8888`, so start Jupyter on a different port.\n",
+    "- The `requests` Python package\n",
+    "- The `druidapi` Python package\n",
+    "\n",
+    "For setup instructions, see [Tutorial setup without using Docker](https://druid.apache.org/docs/latest/tutorials/tutorial-jupyter-docker.html#tutorial-setup-without-using-docker).\n",
+    "Individual tutorials may require additional Python packages, such as for visualization or streaming ingestion.\n",
     "\n",
     "## Simple Druid API\n",
     "\n",
+    "The `druidapi` Python package is a REST API for Druid.\n",
     "One of the notebooks shows how to use the Druid REST API. The others focus on other\n",
     "topics and use a simple set of Python wrappers around the underlying REST API. The\n",
     "wrappers reside in the `druidapi` package within this directory. While the package\n",
@@ -148,7 +151,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.6"
+   "version": "3.9.5"
   }
  },
  "nbformat": 4,
diff --git a/examples/quickstart/jupyter-notebooks/Dockerfile b/examples/quickstart/jupyter-notebooks/Dockerfile
new file mode 100644
index 0000000000..492a4da9c1
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/Dockerfile
@@ -0,0 +1,65 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+# -------------------------------------------------------------
+# This Dockerfile creates a custom Docker image for Jupyter
+# to use with the Apache Druid Jupyter notebook tutorials.
+# Build using `docker build -t imply/druid-notebook:latest .`
+# -------------------------------------------------------------
+
+# Use the Jupyter base notebook as the base image
+# Copyright (c) Project Jupyter Contributors.
+# Distributed under the terms of the 3-Clause BSD License.
+FROM jupyter/base-notebook
+
+# Set the container working directory
+WORKDIR /home/jovyan
+
+# Install required Python packages
+RUN pip install requests
+RUN pip install pandas
+RUN pip install numpy
+RUN pip install seaborn
+RUN pip install bokeh
+RUN pip install kafka-python
+RUN pip install sortedcontainers
+
+# Install druidapi client from apache/druid
+# Local install requires sudo privileges 
+USER root
+ADD druidapi /home/jovyan/druidapi
+WORKDIR /home/jovyan/druidapi
+RUN pip install .
+WORKDIR /home/jovyan
+
+# Import data generator and configuration file
+# Change permissions to allow import (requires sudo privileges)
+# WIP -- change to apache repo
+ADD https://raw.githubusercontent.com/shallada/druid/data-generator/examples/quickstart/jupyter-notebooks/data-generator/DruidDataDriver.py .
+ADD docker-jupyter/kafka_docker_config.json .
+RUN chmod 664 DruidDataDriver.py
+RUN chmod 664 kafka_docker_config.json
+USER jovyan
+
+# Copy the Jupyter notebook tutorials from the
+# build directory to the image working directory
+COPY ./*ipynb .
+
+# Add location of the data generator to PYTHONPATH
+ENV PYTHONPATH "${PYTHONPATH}:/home/jovyan"
+
diff --git a/examples/quickstart/jupyter-notebooks/README.md b/examples/quickstart/jupyter-notebooks/README.md
index 826ae5df34..361908c131 100644
--- a/examples/quickstart/jupyter-notebooks/README.md
+++ b/examples/quickstart/jupyter-notebooks/README.md
@@ -1,12 +1,5 @@
 # Jupyter Notebook tutorials for Druid
 
-If you are reading this in Jupyter, switch over to the [0-START-HERE](0-START-HERE.ipynb)
-notebook instead.
-
-<!-- This README, the "0-START-HERE" notebook, and the tutorial-jupyter-index.md file in
-docs/tutorials share a lot of the same content. If you make a change in one place, update
-the other too. -->
-
 <!--
   ~ Licensed to the Apache Software Foundation (ASF) under one
   ~ or more contributor license agreements.  See the NOTICE file
@@ -26,70 +19,13 @@ the other too. -->
   ~ under the License.
   -->
 
+If you are reading this in Jupyter, switch over to the [0-START-HERE](0-START-HERE.ipynb)
+notebook instead.
+
 You can try out the Druid APIs using the Jupyter Notebook-based tutorials. These
 tutorials provide snippets of Python code that you can use to run calls against
 the Druid API to complete the tutorial.
 
-## Prerequisites
-
-Make sure you meet the following requirements before starting the Jupyter-based tutorials:
-
-- Python 3
-
-- The `requests` package for Python. For example, you can install it with the following command:
-
-  ```bash
-  pip install requests
-  ```
-
-- JupyterLab (recommended) or Jupyter Notebook running on a non-default port. By default, Druid
-  and Jupyter both try to use port `8888`, so start Jupyter on a different port.
-
-  - Install JupyterLab or Notebook:
-
-    ```bash
-    # Install JupyterLab
-    pip install jupyterlab
-    # Install Jupyter Notebook
-    pip install notebook
-    ```
-  - Start Jupyter using either JupyterLab
-    ```bash
-    # Start JupyterLab on port 3001
-    jupyter lab --port 3001
-    ```
-
-    Or using Jupyter Notebook
-    ```bash
-    # Start Jupyter Notebook on port 3001
-    jupyter notebook --port 3001
-    ```
-
-- The Python API client for Druid. Clone the Druid repo if you haven't already.
-Go to your Druid source repo and install `druidapi` with the following commands:
-
-  ```bash
-  cd examples/quickstart/jupyter-notebooks/druidapi
-  pip install .
-  ```
-
-- An available Druid instance. You can use the [quickstart deployment](https://druid.apache.org/docs/latest/tutorials/index.html).
-  The tutorials assume that you are using the quickstart, so no authentication or authorization
-  is expected unless explicitly mentioned.
-
-  If you contribute to Druid, and work with Druid integration tests, can use a test cluster.
-  Assume you have an environment variable, `DRUID_DEV`, which identifies your Druid source repo.
-
-  ```bash
-  cd $DRUID_DEV
-  ./it.sh build
-  ./it.sh image
-  ./it.sh up <category>
-  ```
-
-  Replace `<catagory>` with one of the available integration test categories. See the integration 
-  test `README.md` for details.
-
-## Continue in Jupyter
+For information on prerequisites and getting started with the Jupyter-based tutorials,
+see [Jupyter Notebook tutorials](../../../docs/tutorials/tutorial-jupyter-index.md).
 
-Start Jupyter (see above) and navigate to the "0-START-HERE" notebook for more information.
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/README.md b/examples/quickstart/jupyter-notebooks/docker-jupyter/README.md
new file mode 100644
index 0000000000..028eb1f9b2
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/README.md
@@ -0,0 +1,60 @@
+<!--
+  ~ Licensed to the Apache Software Foundation (ASF) under one
+  ~ or more contributor license agreements.  See the NOTICE file
+  ~ distributed with this work for additional information
+  ~ regarding copyright ownership.  The ASF licenses this file
+  ~ to you under the Apache License, Version 2.0 (the
+  ~ "License"); you may not use this file except in compliance
+  ~ with the License.  You may obtain a copy of the License at
+  ~
+  ~   http://www.apache.org/licenses/LICENSE-2.0
+  ~
+  ~ Unless required by applicable law or agreed to in writing,
+  ~ software distributed under the License is distributed on an
+  ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  ~ KIND, either express or implied.  See the License for the
+  ~ specific language governing permissions and limitations
+  ~ under the License.
+  -->
+
+# Jupyter in Docker
+
+For details on getting started with Jupyter in Docker,
+see [Docker for Jupyter Notebook tutorials](../../../../docs/tutorials/tutorial-jupyter-docker.md).
+
+## Contributing
+
+### Rebuild Jupyter image
+
+You may want to update the Jupyter image to access new or updated tutorial notebooks,
+include new Python packages, or update configuration files.
+
+To build the custom Jupyter image locally:
+
+1. Clone the Druid repo if you haven't already.
+2. Navigate to `examples/quickstart/jupyter-notebooks` in your Druid source repo.
+3. Edit the image definition in `Dockerfile`.
+4. Navigate to the `docker-jupyter` directory.
+5. Generate the new build using the following command:
+
+   ```shell
+   DRUID_VERSION=25.0.0 docker compose --profile all-services -f docker-compose-local.yaml up -d --build
+   ```
+
+   You can change the value of `DRUID_VERSION` or use a different profile from the Docker Compose file.
+
+### Update Docker Compose
+
+The Docker Compose file defines a multi-container application that allows you to run
+the custom Jupyter Notebook container, Apache Druid, and Apache Kafka.
+
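+To see which services a given profile enables, you can run, for example (requires Compose v2):
+
+```bash
+DRUID_VERSION=25.0.0 docker compose --profile all-services config --services
+```
+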
+Any changes to `docker-compose.yaml` should also be made to `docker-compose-local.yaml`
+and vice versa. These files should be identical except that `docker-compose.yaml`
+contains an `image` attribute while `docker-compose-local.yaml` contains a `build` subsection.
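+
+As a quick sanity check that the two files stay in sync, you can compare them:
+
+```bash
+diff docker-compose.yaml docker-compose-local.yaml
+```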
+
+If you update `docker-compose.yaml`, recreate the ZIP file using the following command:
+
+```bash
+zip tutorial-jupyter-docker.zip docker-compose.yaml environment
+```
+
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml
new file mode 100644
index 0000000000..9fb241deb8
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose-local.yaml
@@ -0,0 +1,172 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+---
+version: "2.2"
+
+volumes:
+  metadata_data: {}
+  middle_var: {}
+  historical_var: {}
+  broker_var: {}
+  coordinator_var: {}
+  router_var: {}
+  druid_shared: {}
+
+
+services:
+  postgres:
+    image: postgres:latest
+    container_name: postgres
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - metadata_data:/var/lib/postgresql/data
+    environment:
+      - POSTGRES_PASSWORD=FoolishPassword
+      - POSTGRES_USER=druid
+      - POSTGRES_DB=druid
+
+  # Need 3.5 or later for container nodes
+  zookeeper:
+    image: zookeeper:latest
+    container_name: zookeeper
+    profiles: ["druid-jupyter", "all-services"]
+    ports:
+      - "2181:2181"
+    environment:
+      - ZOO_MY_ID=1
+      - ALLOW_ANONYMOUS_LOGIN=yes
+  
+  kafka:
+    image: bitnami/kafka:latest
+    container_name: kafka-broker
+    profiles: ["all-services"]
+    ports:
+    # To learn about configuring Kafka for access across networks see
+    # https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
+      - "9092:9092"
+    depends_on:
+      - zookeeper
+    environment:
+      - KAFKA_BROKER_ID=1
+      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
+      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
+      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
+      - ALLOW_PLAINTEXT_LISTENER=yes
+
+  coordinator:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: coordinator
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - coordinator_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+    ports:
+      - "8081:8081"
+    command:
+      - coordinator
+    env_file:
+      - environment
+
+  broker:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: broker
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - broker_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8082:8082"
+    command:
+      - broker
+    env_file:
+      - environment
+
+  historical:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: historical
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - historical_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8083:8083"
+    command:
+      - historical
+    env_file:
+      - environment
+
+  middlemanager:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: middlemanager
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - middle_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8091:8091"
+      - "8100-8105:8100-8105"
+    command:
+      - middleManager
+    env_file:
+      - environment
+
+  router:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: router
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - router_var:/opt/druid/var
+    depends_on:
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8888:8888"
+    command:
+      - router
+    env_file:
+      - environment
+
+  jupyter:
+    build:
+      context: ..
+      dockerfile: Dockerfile
+    container_name: jupyter
+    profiles: ["jupyter", "all-services"]
+    environment:
+      DOCKER_STACKS_JUPYTER_CMD: "notebook"
+      NOTEBOOK_ARGS: "--NotebookApp.token=''"
+    ports:
+      - "${JUPYTER_PORT:-8889}:8888"
+    volumes:
+      - ./notebooks:/home/jovyan/work
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml
new file mode 100644
index 0000000000..d9e95c085b
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/docker-compose.yaml
@@ -0,0 +1,170 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+---
+version: "2.2"
+
+volumes:
+  metadata_data: {}
+  middle_var: {}
+  historical_var: {}
+  broker_var: {}
+  coordinator_var: {}
+  router_var: {}
+  druid_shared: {}
+
+
+services:
+  postgres:
+    image: postgres:latest
+    container_name: postgres
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - metadata_data:/var/lib/postgresql/data
+    environment:
+      - POSTGRES_PASSWORD=FoolishPassword
+      - POSTGRES_USER=druid
+      - POSTGRES_DB=druid
+
+  # Need 3.5 or later for container nodes
+  zookeeper:
+    image: zookeeper:latest
+    container_name: zookeeper
+    profiles: ["druid-jupyter", "all-services"]
+    ports:
+      - "2181:2181"
+    environment:
+      - ZOO_MY_ID=1
+      - ALLOW_ANONYMOUS_LOGIN=yes
+  
+  kafka:
+    image: bitnami/kafka:latest
+    container_name: kafka-broker
+    profiles: ["all-services"]
+    ports:
+    # To learn about configuring Kafka for access across networks see
+    # https://www.confluent.io/blog/kafka-client-cannot-connect-to-broker-on-aws-on-docker-etc/
+      - "9092:9092"
+    depends_on:
+      - zookeeper
+    environment:
+      - KAFKA_BROKER_ID=1
+      - KAFKA_CFG_LISTENERS=PLAINTEXT://:9092
+      - KAFKA_CFG_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092
+      - KAFKA_CFG_ZOOKEEPER_CONNECT=zookeeper:2181
+      - ALLOW_PLAINTEXT_LISTENER=yes
+
+  coordinator:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: coordinator
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - coordinator_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+    ports:
+      - "8081:8081"
+    command:
+      - coordinator
+    env_file:
+      - environment
+
+  broker:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: broker
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - broker_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8082:8082"
+    command:
+      - broker
+    env_file:
+      - environment
+
+  historical:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: historical
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - historical_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8083:8083"
+    command:
+      - historical
+    env_file:
+      - environment
+
+  middlemanager:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: middlemanager
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - druid_shared:/opt/shared
+      - middle_var:/opt/druid/var
+    depends_on: 
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8091:8091"
+      - "8100-8105:8100-8105"
+    command:
+      - middleManager
+    env_file:
+      - environment
+
+  router:
+    image: apache/druid:${DRUID_VERSION}
+    container_name: router
+    profiles: ["druid-jupyter", "all-services"]
+    volumes:
+      - router_var:/opt/druid/var
+    depends_on:
+      - zookeeper
+      - postgres
+      - coordinator
+    ports:
+      - "8888:8888"
+    command:
+      - router
+    env_file:
+      - environment
+
+  jupyter:
+    image: imply/druid-notebook:latest
+    container_name: jupyter
+    profiles: ["jupyter", "all-services"]
+    environment:
+      DOCKER_STACKS_JUPYTER_CMD: "notebook"
+      NOTEBOOK_ARGS: "--NotebookApp.token=''"
+    ports:
+      - "${JUPYTER_PORT:-8889}:8888"
+    volumes:
+      - ./notebooks:/home/jovyan/work
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/environment b/examples/quickstart/jupyter-notebooks/docker-jupyter/environment
new file mode 100644
index 0000000000..c63a5c0e88
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/environment
@@ -0,0 +1,56 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+# Java tuning
+#DRUID_XMX=1g
+#DRUID_XMS=1g
+#DRUID_MAXNEWSIZE=250m
+#DRUID_NEWSIZE=250m
+#DRUID_MAXDIRECTMEMORYSIZE=6172m
+DRUID_SINGLE_NODE_CONF=micro-quickstart
+
+druid_emitter_logging_logLevel=debug
+
+druid_extensions_loadList=["druid-histogram", "druid-datasketches", "druid-lookups-cached-global", "postgresql-metadata-storage", "druid-multi-stage-query", "druid-kafka-indexing-service"]
+
+druid_zk_service_host=zookeeper
+
+druid_metadata_storage_host=
+druid_metadata_storage_type=postgresql
+druid_metadata_storage_connector_connectURI=jdbc:postgresql://postgres:5432/druid
+druid_metadata_storage_connector_user=druid
+druid_metadata_storage_connector_password=FoolishPassword
+
+druid_coordinator_balancer_strategy=cachingCost
+
+druid_indexer_runner_javaOptsArray=["-server", "-Xmx1g", "-Xms1g", "-XX:MaxDirectMemorySize=3g", "-Duser.timezone=UTC", "-Dfile.encoding=UTF-8", "-Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager"]
+druid_indexer_fork_property_druid_processing_buffer_sizeBytes=256MiB
+
+
+
+druid_storage_type=local
+druid_storage_storageDirectory=/opt/shared/segments
+druid_indexer_logs_type=file
+druid_indexer_logs_directory=/opt/shared/indexing-logs
+
+druid_processing_numThreads=2
+druid_processing_numMergeBuffers=2
+
+DRUID_LOG4J=<?xml version="1.0" encoding="UTF-8" ?><Configuration status="WARN"><Appenders><Console name="Console" target="SYSTEM_OUT"><PatternLayout pattern="%d{ISO8601} %p [%t] %c - %m%n"/></Console></Appenders><Loggers><Root level="info"><AppenderRef ref="Console"/></Root><Logger name="org.apache.druid.jetty.RequestLog" additivity="false" level="DEBUG"><AppenderRef ref="Console"/></Logger></Loggers></Configuration>
+
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/kafka_docker_config.json b/examples/quickstart/jupyter-notebooks/docker-jupyter/kafka_docker_config.json
new file mode 100644
index 0000000000..2add8f3fa1
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/docker-jupyter/kafka_docker_config.json
@@ -0,0 +1,90 @@
+{
+  "target": {
+    "type": "kafka",
+    "endpoint": "kafka:9092",
+    "topic": "social_media"
+  },
+  "emitters": [
+    {
+      "name": "example_record_1",
+      "dimensions": [
+        {
+          "type": "enum",
+          "name": "username",
+          "values": ["willow", "mia", "leon", "milton", "miette", "gus", "jojo", "rocket"],
+          "cardinality_distribution": {
+            "type": "uniform",
+            "min": 0,
+            "max": 7
+          }
+        },
+        {
+          "type": "string",
+          "name": "post_title",
+          "length_distribution": {"type": "uniform", "min": 1, "max": 140},
+          "cardinality": 0,
+          "chars": "abcdefghijklmnopqrstuvwxyz0123456789_ABCDEFGHIJKLMNOPQRSTUVWXYZ!';:,."
+        },
+        {
+          "type": "int",
+          "name": "views",
+          "distribution": {
+            "type": "exponential",
+            "mean": 10000
+          },
+          "cardinality": 0
+        },
+        {
+          "type": "int",
+          "name": "upvotes",
+          "distribution": {
+            "type": "normal",
+            "mean": 70,
+            "stddev": 20
+          },
+          "cardinality": 0
+        },
+        {
+          "type": "int",
+          "name": "comments",
+          "distribution": {
+            "type": "normal",
+            "mean": 10,
+            "stddev": 5
+          },
+          "cardinality": 0
+        },
+        {
+          "type": "enum",
+          "name": "edited",
+          "values": ["True","False"],
+          "cardinality_distribution": {
+            "type": "uniform",
+            "min": 0,
+            "max": 1
+          }
+        }
+      ]
+    }
+  ],
+  "interarrival": {
+    "type": "constant",
+    "value": 1
+  },
+  "states": [
+    {
+      "name": "state_1",
+      "emitter": "example_record_1",
+      "delay": {
+        "type": "constant",
+        "value": 1
+      },
+      "transitions": [
+        {
+          "next": "state_1",
+          "probability": 1.0
+        }
+      ]
+    }
+  ]
+}
diff --git a/examples/quickstart/jupyter-notebooks/docker-jupyter/tutorial-jupyter-docker.zip b/examples/quickstart/jupyter-notebooks/docker-jupyter/tutorial-jupyter-docker.zip
new file mode 100644
index 0000000000..4a3c02e4c4
Binary files /dev/null and b/examples/quickstart/jupyter-notebooks/docker-jupyter/tutorial-jupyter-docker.zip differ
diff --git a/examples/quickstart/jupyter-notebooks/kafka-tutorial.ipynb b/examples/quickstart/jupyter-notebooks/kafka-tutorial.ipynb
new file mode 100644
index 0000000000..9ab6ce1681
--- /dev/null
+++ b/examples/quickstart/jupyter-notebooks/kafka-tutorial.ipynb
@@ -0,0 +1,782 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Tutorial: Ingest and query data from Apache Kafka\n",
+    "\n",
+    "<!--\n",
+    "  ~ Licensed to the Apache Software Foundation (ASF) under one\n",
+    "  ~ or more contributor license agreements.  See the NOTICE file\n",
+    "  ~ distributed with this work for additional information\n",
+    "  ~ regarding copyright ownership.  The ASF licenses this file\n",
+    "  ~ to you under the Apache License, Version 2.0 (the\n",
+    "  ~ \"License\"); you may not use this file except in compliance\n",
+    "  ~ with the License.  You may obtain a copy of the License at\n",
+    "  ~\n",
+    "  ~   http://www.apache.org/licenses/LICENSE-2.0\n",
+    "  ~\n",
+    "  ~ Unless required by applicable law or agreed to in writing,\n",
+    "  ~ software distributed under the License is distributed on an\n",
+    "  ~ \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
+    "  ~ KIND, either express or implied.  See the License for the\n",
+    "  ~ specific language governing permissions and limitations\n",
+    "  ~ under the License.\n",
+    "  -->\n",
+    "\n",
+    "This tutorial introduces you to streaming ingestion in Apache Druid using the Apache Kafka event streaming platform.\n",
+    "Follow along to learn how to create and load data into a Kafka topic, start ingesting data from the topic into Druid, and query results over time. This tutorial assumes you have a basic understanding of Druid ingestion, querying, and API requests."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Table of contents\n",
+    "\n",
+    "* [Prerequisites](#Prerequisites)\n",
+    "* [Load Druid API client](#Load-Druid-API-client)\n",
+    "* [Create Kafka topic](#Create-Kafka-topic)\n",
+    "* [Load data into Kafka topic](#Load-data-into-Kafka-topic)\n",
+    "* [Start Druid ingestion](#Start-Druid-ingestion)\n",
+    "* [Query Druid datasource and visualize query results](#Query-Druid-datasource-and-visualize-query-results)\n",
+    "* [Learn more](#Learn-more)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Prerequisites\n",
+    "\n",
+    "Launch this tutorial and all prerequisites using the `all-services` profile of the Docker Compose file for Jupyter-based Druid tutorials. For more information, see [Docker for Jupyter Notebook tutorials](https://druid.apache.org/docs/latest/tutorials/tutorial-jupyter-docker.html).\n",
+    "\n",
+    "Otherwise, you need the following:\n",
+    "* A running Druid instance.\n",
+    "   * Update the `druid_host` variable to point to your Router endpoint. For example, `druid_host = \"http://localhost:8888\"`.\n",
+    "* A running Kafka cluster.\n",
+    "   * Update the Kafka bootstrap servers to point to your servers. For example, `bootstrap_servers=[\"localhost:9092\"]`.\n",
+    "* The following Python packages:\n",
+    "   * `druidapi`, a Python client for Apache Druid\n",
+    "   * `DruidDataDriver`, a data generator\n",
+    "   * `kafka`, a Python client for Apache Kafka\n",
+    "   * `pandas`, `matplotlib`, and `seaborn` for data visualization\n"
+   ]
+  },
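+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "If you run without Docker, here is a rough install sketch for the pip-installable packages above (`druidapi` and `DruidDataDriver` ship in the Druid repo rather than on PyPI):\n",
+    "\n",
+    "```bash\n",
+    "pip install kafka-python pandas matplotlib seaborn\n",
+    "```"
+   ]
+  },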
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load Druid API client"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To start the tutorial, run the following cell. It imports the required Python packages and defines a variable for the Druid client, and another for the SQL client used to run SQL commands."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "\n",
+       "<style>\n",
+       "  .druid table {\n",
+       "    border: 1px solid black;\n",
+       "    border-collapse: collapse;\n",
+       "  }\n",
+       "\n",
+       "  .druid th, .druid td {\n",
+       "    padding: 4px 1em ;\n",
+       "    text-align: left;\n",
+       "  }\n",
+       "\n",
+       "  td.druid-right, th.druid-right {\n",
+       "    text-align: right;\n",
+       "  }\n",
+       "\n",
+       "  td.druid-center, th.druid-center {\n",
+       "    text-align: center;\n",
+       "  }\n",
+       "\n",
+       "  .druid .druid-left {\n",
+       "    text-align: left;\n",
+       "  }\n",
+       "\n",
+       "  .druid-alert {\n",
+       "    font-weight: bold;\n",
+       "  }\n",
+       "\n",
+       "  .druid-error {\n",
+       "    color: red;\n",
+       "  }\n",
+       "</style>\n"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "import druidapi\n",
+    "import json\n",
+    "\n",
+    "# druid_host is the hostname and port for your Druid deployment. \n",
+    "# In a distributed environment, you can point to other Druid services.\n",
+    "# In this tutorial, you'll use the Router service as the `druid_host`.\n",
+    "druid_host = \"http://router:8888\"\n",
+    "\n",
+    "druid = druidapi.jupyter_client(druid_host)\n",
+    "display = druid.display\n",
+    "sql_client = druid.sql\n",
+    "\n",
+    "# Create a rest client for native JSON ingestion for streaming data\n",
+    "rest_client = druidapi.rest.DruidRestClient(\"http://coordinator:8081\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Create Kafka topic"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This notebook relies on the Python client for the Apache Kafka. Import the Kafka producer and consumer modules, then create a Kafka client. You use the Kafka producer to create and publish records to a new topic named `social_media`."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from kafka import KafkaProducer\n",
+    "from kafka import KafkaConsumer\n",
+    "\n",
+    "# Kafka runs on kafka:9092 in multi-container tutorial application\n",
+    "producer = KafkaProducer(bootstrap_servers='kafka:9092')\n",
+    "topic_name = \"social_media\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Create the `social_media` topic and send a sample event. The `send()` command returns a metadata descriptor for the record."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<kafka.producer.future.FutureRecordMetadata at 0x7f5f65344610>"
+      ]
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "event = {\n",
+    "    \"__time\": \"2023-01-03T16:40:21.501\",\n",
+    "    \"username\": \"willow\",\n",
+    "    \"post_title\": \"This title is required\",\n",
+    "    \"views\": 15284,\n",
+    "    \"upvotes\": 124,\n",
+    "    \"comments\": 21,\n",
+    "    \"edited\": \"True\"\n",
+    "}\n",
+    "\n",
+    "producer.send(topic_name, json.dumps(event).encode('utf-8'))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "To verify that the Kafka topic stored the event, create a consumer client to read records from the Kafka cluster, and get the next (only) message:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{\"__time\": \"2023-01-03T16:40:21.501\", \"username\": \"willow\", \"post_title\": \"This title is required\", \"views\": 15284, \"upvotes\": 124, \"comments\": 21, \"edited\": \"True\"}\n"
+     ]
+    }
+   ],
+   "source": [
+    "consumer = KafkaConsumer(topic_name, bootstrap_servers=['kafka:9092'], auto_offset_reset='earliest',\n",
+    "     enable_auto_commit=True)\n",
+    "\n",
+    "print(next(consumer).value.decode('utf-8'))"
+   ]
+  },
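+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Alternatively, you can inspect the topic from a terminal on the host. This sketch assumes the Bitnami Kafka image layout and the `kafka-broker` container name from this tutorial's Compose file:\n",
+    "\n",
+    "```bash\n",
+    "docker exec kafka-broker /opt/bitnami/kafka/bin/kafka-console-consumer.sh \\\n",
+    "  --bootstrap-server localhost:9092 --topic social_media --from-beginning --max-messages 1\n",
+    "```"
+   ]
+  },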
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Load data into Kafka topic"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Instead of manually creating events to send to the Kafka topic, use a data generator to simulate a continuous data stream. This tutorial makes use of Druid Data Driver to simulate a continuous data stream into the `social_media` Kafka topic. To learn more about the Druid Data Driver, see the Druid Summit talk, [Generating Time centric Data for Apache Druid](https://www.youtube.com/watch?v=3zAOeLe3iAo).\n",
+    "\n",
+    "In this notebook, you use a background process to continuously load data into the Kafka topic.\n",
+    "This allows you to keep executing commands in this notebook while data is constantly being streamed into the topic."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Run the following cells to load sample data into the `social_media` Kafka topic:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import multiprocessing as mp\n",
+    "from datetime import datetime\n",
+    "import DruidDataDriver"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "def run_driver():\n",
+    "    DruidDataDriver.simulate(\"kafka_docker_config.json\", None, None, \"REAL\", datetime.now())\n",
+    "        \n",
+    "mp.set_start_method('fork')\n",
+    "ps = mp.Process(target=run_driver)\n",
+    "ps.start()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Start Druid ingestion"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Now that you have a new Kafka topic and data being streamed into the topic, you ingest the data into Druid by submitting a Kafka ingestion spec.\n",
+    "The ingestion spec describes the following:\n",
+    "* where to source the data to ingest (in `spec > ioConfig`),\n",
+    "* the datasource to ingest data into (in `spec > dataSchema > dataSource`), and\n",
+    "* what the data looks like (in `spec > dataSchema > dimensionsSpec`).\n",
+    "\n",
+    "Other properties control how Druid aggregates and stores data. For more information, see the Druid documenation:\n",
+    "* [Apache Kafka ingestion](https://druid.apache.org/docs/latest/development/extensions-core/kafka-ingestion.html)\n",
+    "* [Ingestion spec reference](https://druid.apache.org/docs/latest/ingestion/ingestion-spec.html)\n",
+    "\n",
+    "Run the following cells to define and view the Kafka ingestion spec."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "kafka_ingestion_spec = \"{\\\"type\\\": \\\"kafka\\\",\\\"spec\\\": {\\\"ioConfig\\\": {\\\"type\\\": \\\"kafka\\\",\\\"consumerProperties\\\": {\\\"bootstrap.servers\\\": \\\"kafka:9092\\\"},\\\"topic\\\": \\\"social_media\\\",\\\"inputFormat\\\": {\\\"type\\\": \\\"json\\\"},\\\"useEarliestOffset\\\": true},\\\"tuningConfig\\\": {\\\"type\\\": \\\"kafka\\\"},\\\"dataSchema\\\": {\\\"dataSource\\\": \\\"social_media\\\",\\\"timestampSpec\\\": {\\\"column\\\": \\\"__time\\\",\\\"for [...]
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "{\n",
+      "    \"type\": \"kafka\",\n",
+      "    \"spec\": {\n",
+      "        \"ioConfig\": {\n",
+      "            \"type\": \"kafka\",\n",
+      "            \"consumerProperties\": {\n",
+      "                \"bootstrap.servers\": \"kafka:9092\"\n",
+      "            },\n",
+      "            \"topic\": \"social_media\",\n",
+      "            \"inputFormat\": {\n",
+      "                \"type\": \"json\"\n",
+      "            },\n",
+      "            \"useEarliestOffset\": true\n",
+      "        },\n",
+      "        \"tuningConfig\": {\n",
+      "            \"type\": \"kafka\"\n",
+      "        },\n",
+      "        \"dataSchema\": {\n",
+      "            \"dataSource\": \"social_media\",\n",
+      "            \"timestampSpec\": {\n",
+      "                \"column\": \"__time\",\n",
+      "                \"format\": \"iso\"\n",
+      "            },\n",
+      "            \"dimensionsSpec\": {\n",
+      "                \"dimensions\": [\n",
+      "                    \"username\",\n",
+      "                    \"post_title\",\n",
+      "                    {\n",
+      "                        \"type\": \"long\",\n",
+      "                        \"name\": \"views\"\n",
+      "                    },\n",
+      "                    {\n",
+      "                        \"type\": \"long\",\n",
+      "                        \"name\": \"upvotes\"\n",
+      "                    },\n",
+      "                    {\n",
+      "                        \"type\": \"long\",\n",
+      "                        \"name\": \"comments\"\n",
+      "                    },\n",
+      "                    \"edited\"\n",
+      "                ]\n",
+      "            },\n",
+      "            \"granularitySpec\": {\n",
+      "                \"queryGranularity\": \"none\",\n",
+      "                \"rollup\": false,\n",
+      "                \"segmentGranularity\": \"hour\"\n",
+      "            }\n",
+      "        }\n",
+      "    }\n",
+      "}\n"
+     ]
+    }
+   ],
+   "source": [
+    "print(json.dumps(json.loads(kafka_ingestion_spec), indent=4))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Send the spec to Druid to start the streaming ingestion from Kafka:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 9,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<Response [200]>"
+      ]
+     },
+     "execution_count": 9,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "headers = {\n",
+    "  'Content-Type': 'application/json'\n",
+    "}\n",
+    "\n",
+    "rest_client.post(\"/druid/indexer/v1/supervisor\", kafka_ingestion_spec, headers=headers)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "A `200` response indicates that the request was successful. You can view the running ingestion task and the new datasource in the web console at http://localhost:8888/unified-console.html."
+   ]
+  },
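+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "As a sketch, you can also list the running supervisors from a terminal; in this tutorial's Docker setup, the Router proxies this API:\n",
+    "\n",
+    "```bash\n",
+    "curl http://localhost:8888/druid/indexer/v1/supervisor\n",
+    "```"
+   ]
+  },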
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Query Druid datasource and visualize query results"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You can now query the new datasource called `social_media`. In this section, you also visualize query results using the Matplotlib and Seaborn visualization libraries. Run the following cell import these packages."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 10,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import matplotlib\n",
+    "import matplotlib.pyplot as plt\n",
+    "import seaborn as sns"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Run a simple query to view a subset of rows from the new datasource:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 11,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div class=\"druid\"><table>\n",
+       "<tr><th>__time</th><th>username</th><th>post_title</th><th>views</th><th>upvotes</th><th>comments</th><th>edited</th></tr>\n",
+       "<tr><td>2023-01-03T16:40:21.501Z</td><td>willow</td><td>This title is required</td><td>15284</td><td>124</td><td>21</td><td>True</td></tr>\n",
+       "<tr><td>2023-05-02T23:34:54.451Z</td><td>gus</td><td>3y4hkmd1!&#x27;Er4;</td><td>4031</td><td>93</td><td>15</td><td>False</td></tr>\n",
+       "<tr><td>2023-05-02T23:34:55.454Z</td><td>mia</td><td>m62u53:D9s2bOvnY_VM9vjtZ&#x27;MyDLvQ7_xGodAP:ZNTXM6cFAt,_jrxBVBeRILLvAF9Z!jM9YNN;3ErV5eGbE_TFQS</td><td>16060</td><td>84</td><td>8</td><td>True</td></tr>\n",
+       "<tr><td>2023-05-02T23:34:55.455Z</td><td>jojo</td><td>rAmeAJrjs;FBj:zy2MwoGh_P_SowlLTfp6zhX55xqogH.,1DC2xY_x2T;M_Vcu3QWaz650u;Roa</td><td>14313</td><td>65</td><td>7</td><td>False</td></tr>\n",
+       "<tr><td>2023-05-02T23:34:56.456Z</td><td>willow</td><td>3bHB,iJdE;sedTDA,1dKGDAZL!qdsvO_tv.4Jrq7fa.KWcHPD&#x27;TB_5nnvsf9EgtnN8tGeeA0MjKc30iubJ:D&#x27;l7pHNihWpFz8K&#x27;46q!vJs</td><td>4237</td><td>112</td><td>3</td><td>True</td></tr>\n",
+       "</table></div>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "sql = '''\n",
+    "SELECT * FROM social_media LIMIT 5\n",
+    "'''\n",
+    "display.sql(sql)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "In this social media scenario, each incoming event represents a post on social media, for which you collect the timestamp, username, and post metadata. You are interested in analyzing the total number of upvotes for all posts, compared between users. Preview this data with the following query:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div class=\"druid\"><table>\n",
+       "<tr><th>num_posts</th><th>total_upvotes</th><th>username</th></tr>\n",
+       "<tr><td>155</td><td>10985</td><td>willow</td></tr>\n",
+       "<tr><td>161</td><td>11223</td><td>gus</td></tr>\n",
+       "<tr><td>164</td><td>11456</td><td>leon</td></tr>\n",
+       "<tr><td>173</td><td>12098</td><td>jojo</td></tr>\n",
+       "<tr><td>176</td><td>12175</td><td>mia</td></tr>\n",
+       "<tr><td>177</td><td>11998</td><td>milton</td></tr>\n",
+       "<tr><td>185</td><td>13256</td><td>miette</td></tr>\n",
+       "<tr><td>188</td><td>13360</td><td>rocket</td></tr>\n",
+       "</table></div>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "sql = '''\n",
+    "SELECT\n",
+    "  COUNT(post_title) as num_posts,\n",
+    "  SUM(upvotes) as total_upvotes,\n",
+    "  username\n",
+    "FROM social_media\n",
+    "GROUP BY username\n",
+    "ORDER BY num_posts\n",
+    "'''\n",
+    "\n",
+    "response = sql_client.sql_query(sql)\n",
+    "response.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Visualize the total number of upvotes per user using a line plot. You sort the results by username before plotting because the order of users may vary as new results arrive."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 13,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAk0AAAHMCAYAAADI/py4AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAACRN0lEQVR4nOzdd3iTZfcH8O+T7j3ppNAySwdtAQtF9p7KEEVl+JMhvjJERJYyVBRRnCjI60AcrwqyQaBskLJbSiktUArdmzZd6UjO7480sWE2Je2TpOdzXb20z/M0OUlpcnKf+z63QEQExhhjjDH2UBKxA2CMMcYYMwScNDHGGGOM1QEnTYwxxhhjdcBJE2OMMcZYHXDSxBhjjDFWB5w0McYYY4zVASdNjDHGGGN1YCp2AMZCoVAgIyMDdnZ2EARB7HAYY4wxVgdEhOLiYnh5eUEiefhYEidNOpKRkQEfHx+xw2CM [...]
+      "text/plain": [
+       "<Figure size 640x480 with 1 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "df = pd.DataFrame(response.json)\n",
+    "df = df.sort_values('username')\n",
+    "\n",
+    "df.plot(x='username', y='total_upvotes', marker='o')\n",
+    "plt.xticks(rotation=45, ha='right')\n",
+    "plt.ylabel(\"Total number of upvotes\")\n",
+    "plt.gca().get_legend().remove()\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "The total number of upvotes likely depends on the total number of posts created per user. To better assess the relative impact per user, you compare the total number of upvotes (line plot) with the total number of posts."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "<matplotlib.legend.Legend at 0x7f5f18400310>"
+      ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
+    },
+    {
+     "data": {
+      "image/png": "iVBORw0KGgoAAAANSUhEUgAAA1cAAAHMCAYAAAA5/FJZAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAADE60lEQVR4nOzdd3hUdfb48feUTHrvgYSEFhJKQhGkCQSkKS6uFZDq6voVdDH2/UqxLLYFQWHhZ2ddXbEgX9dFpIgiSJESBQIBQiABUkmvk8zc3x9hBmICpExyJ8l5Pc88j5m5c++ZRJI5c87nfDSKoigIIYQQQgghhGgSrdoBCCGEEEIIIURbIMmVEEIIIYQQQtiAJFdCCCGEEEIIYQOSXAkhhBBCCCGEDUhyJYQQQgghhBA2IMmVEEIIIYQQQtiAJFdCCCGEEEIIYQN6tQNoK6qqqjh06BCBgYFotZKzCiGEEK2B [...]
+      "text/plain": [
+       "<Figure size 640x480 with 2 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "matplotlib.rc_file_defaults()\n",
+    "ax1 = sns.set_style(style=None, rc=None )\n",
+    "\n",
+    "fig, ax1 = plt.subplots()\n",
+    "plt.xticks(rotation=45, ha='right')\n",
+    "\n",
+    "\n",
+    "sns.lineplot(\n",
+    "    data=df, x='username', y='total_upvotes',\n",
+    "    marker='o', ax=ax1, label=\"Sum of upvotes\")\n",
+    "ax1.get_legend().remove()\n",
+    "\n",
+    "ax2 = ax1.twinx()\n",
+    "sns.barplot(data=df, x='username', y='num_posts',\n",
+    "            order=df['username'], alpha=0.5, ax=ax2, log=True,\n",
+    "            color=\"orange\", label=\"Number of posts\")\n",
+    "\n",
+    "\n",
+    "# ask matplotlib for the plotted objects and their labels\n",
+    "lines, labels = ax1.get_legend_handles_labels()\n",
+    "lines2, labels2 = ax2.get_legend_handles_labels()\n",
+    "ax2.legend(lines + lines2, labels + labels2, bbox_to_anchor=(1.55, 1))"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You should see a correlation between total number of upvotes and total number of posts. In order to track user impact on a more equal footing, normalize the total number of upvotes relative to the total number of posts, and plot the result:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAkAAAAHMCAYAAAA9ABcIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAACLeElEQVR4nO3dd1xT5/cH8E/YIFNkyhQRRFFQHDjqVlyto1ZbrVpXbbXO2or9ujocbW3VDm1t1VpbW2frqHvvCU5EBGSDiuxNcn5/8MstKaAEEi5Jzvv1yktzc3PvuQGSk+c5z/NIiIjAGGOMMaZD9MQOgDHGGGOsrnECxBhjjDGdwwkQY4wxxnQOJ0CMMcYY0zmcADHGGGNM53ACxBhjjDGdwwkQY4wxxnSOgdgB1EcymQzJycmwsLCARCIROxzGGGOMVQMRIScnB87OztDTe34bDydAlUhOToarq6vYYTDGGGOs [...]
+      "text/plain": [
+       "<Figure size 640x480 with 1 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "df['upvotes_normalized'] = df['total_upvotes']/df['num_posts']\n",
+    "\n",
+    "df.plot(x='username', y='upvotes_normalized', marker='o', color='green')\n",
+    "plt.xticks(rotation=45, ha='right')\n",
+    "plt.ylabel(\"Number of upvotes (normalized)\")\n",
+    "plt.gca().get_legend().remove()\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "You've been working with data taken at a single snapshot in time from when you ran the last query. Run the same query again, and store the output in `response2`, which you will compare with the previous results:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 16,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/html": [
+       "<div class=\"druid\"><table>\n",
+       "<tr><th>num_posts</th><th>total_upvotes</th><th>username</th></tr>\n",
+       "<tr><td>404</td><td>28166</td><td>willow</td></tr>\n",
+       "<tr><td>418</td><td>29413</td><td>jojo</td></tr>\n",
+       "<tr><td>419</td><td>29202</td><td>mia</td></tr>\n",
+       "<tr><td>419</td><td>29456</td><td>miette</td></tr>\n",
+       "<tr><td>428</td><td>29472</td><td>gus</td></tr>\n",
+       "<tr><td>433</td><td>30160</td><td>milton</td></tr>\n",
+       "<tr><td>440</td><td>31212</td><td>leon</td></tr>\n",
+       "<tr><td>443</td><td>31063</td><td>rocket</td></tr>\n",
+       "</table></div>"
+      ],
+      "text/plain": [
+       "<IPython.core.display.HTML object>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "response2 = sql_client.sql_query(sql)\n",
+    "response2.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "Normalizing the data also helps you evaluate trends over time more consistently on the same plot axes. Plot the normalized data again, this time alongside the results from the previous snapshot:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 17,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAkAAAAHMCAYAAAA9ABcIAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMCwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy88F64QAAAACXBIWXMAAA9hAAAPYQGoP6dpAAC6DklEQVR4nOzdd3iTZffA8W+SbroodNKWsilQoOxV9ihLkCUqAoKoiALixNetPxDfVwW3Iut9FZSpAlKWjLJX2WWVQgdtoYXuneT3R22ktkDTJk3Sns915ZI+ffI8J0ibk/s+97kVWq1WixBCCCFEDaI0dQBCCCGEEFVNEiAhhBBC1DiSAAkhhBCixpEESAghhBA1jiRAQgghhKhxJAESQgghRI0jCZAQQgghahwrUwdgjjQaDTdu3MDJyQmFQmHqcIQQQghRDlqtloyMDHx8fFAq7z/GIwlQGW7cuIGfn5+pwxBC [...]
+      "text/plain": [
+       "<Figure size 640x480 with 1 Axes>"
+      ]
+     },
+     "metadata": {},
+     "output_type": "display_data"
+    }
+   ],
+   "source": [
+    "df2 = pd.DataFrame(response2.json)\n",
+    "df2 = df2.sort_values('username')\n",
+    "df2['upvotes_normalized'] = df2['total_upvotes']/df2['num_posts']\n",
+    "\n",
+    "ax = df.plot(x='username', y='upvotes_normalized', marker='o', color='green', label=\"Time 1\")\n",
+    "df2.plot(x='username', y='upvotes_normalized', marker='o', color='purple', ax=ax, label=\"Time 2\")\n",
+    "plt.xticks(rotation=45, ha='right')\n",
+    "plt.ylabel(\"Number of upvotes (normalized)\")\n",
+    "plt.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "This plot shows how some users maintain relatively consistent social media impact between the two query snapshots, whereas other users grow or decline in their influence.\n",
+    "\n",
+    "## Learn more\n",
+    "\n",
+    "This tutorial showed you how to create a Kafka topic using a Python client for Kafka, send a simulated stream of data to Kafka using a data generator, and query and visualize results over time. For more information, see the following resources:\n",
+    "\n",
+    "* [Apache Kafka ingestion](https://druid.apache.org/docs/latest/development/extensions-core/kafka-ingestion.html)\n",
+    "* [Querying data](https://druid.apache.org/docs/latest/tutorials/tutorial-query.html)\n",
+    "* [Tutorial: Run with Docker](https://druid.apache.org/docs/latest/tutorials/docker.html)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.8"
+  },
+  "vscode": {
+   "interpreter": {
+    "hash": "a4289e5b8bae5973a6609d90f7bc464162478362b9a770893a3c5c597b0b36e7"
+   }
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
diff --git a/website/sidebars.json b/website/sidebars.json
index fbb6bf0866..f1ab145c04 100644
--- a/website/sidebars.json
+++ b/website/sidebars.json
@@ -27,6 +27,7 @@
       "tutorials/tutorial-sql-query-view",
       "tutorials/tutorial-unnest-arrays",
       "tutorials/tutorial-jupyter-index",
+      "tutorials/tutorial-jupyter-docker",
       "tutorials/tutorial-jdbc"
     ],
     "Design": [

