Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/10/18 01:42:43 UTC

[GitHub] [pulsar] cdbartholomew commented on issue #5392: Error open RocksDB database when 'Set up a standalone Pulsar in Docker'

cdbartholomew commented on issue #5392: Error open RocksDB database when 'Set up a standalone Pulsar in Docker'
URL: https://github.com/apache/pulsar/issues/5392#issuecomment-543444568
 
 
   This is a Docker configuration issue. The command in the documentation:
   
   ```
   $ docker run -it \
     -p 6650:6650 \
     -p 8080:8080 \
     -v "$PWD/data:/pulsar/data".ToLower() \
     apachepulsar/pulsar:2.4.1 \
     bin/pulsar standalone
   ```
   
   is faulty. Here's why.
   
   When the pulsar Docker image is built, it defines two volumes:
   
   ```
   VOLUME  ["/pulsar/conf", "/pulsar/data"]
   ```
   This means that at run time the image expects an externally mounted volume at each of those two paths. When you do "docker run" without specifying any storage, Docker automatically creates anonymous volumes for them. In the "docker inspect" output for the container, they look something like this:
   
   ```
           "Mounts": [
               {
                   "Type": "volume",
                   "Name": "c59b3a5ad98eafc8efcbe6de1f63b1a31693564996a8dcc6de60f02867015a51",
                   "Source": "/var/lib/docker/volumes/c59b3a5ad98eafc8efcbe6de1f63b1a31693564996a8dcc6de60f02867015a51/_data",
                   "Destination": "/pulsar/conf",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               },
               {
                   "Type": "volume",
                   "Name": "b7600b9314de9bf8ed573ef697fa012b9d2574074fbea45c2831b3e0a854da30",
                   "Source": "/var/lib/docker/volumes/b7600b9314de9bf8ed573ef697fa012b9d2574074fbea45c2831b3e0a854da30/_data",
                   "Destination": "/pulsar/data",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               }
           ],
   ```
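
   To check this on your own machine, something like the following should work. It is only a sketch: the function names and the container name "pulsar-test" are made up for illustration, and it assumes a Docker CLI that supports Go-template output via --format.

   ```shell
   # Hypothetical helpers; "pulsar-test" below is an assumed container name.

   # Print the volumes declared by an image's VOLUME instruction.
   image_volumes() {
     docker image inspect --format '{{ json .Config.Volumes }}' "$1"
   }

   # Print the mounts Docker actually attached to a container.
   container_mounts() {
     docker inspect --format '{{ json .Mounts }}' "$1"
   }

   # Usage (requires a running Docker daemon):
   #   image_volumes apachepulsar/pulsar:2.4.1
   #   container_mounts pulsar-test
   ```

   The first helper shows what the image asks for, the second shows what a given container ended up with, so comparing the two makes a mismatch easy to spot.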
   
   The problem is the -v option in the command. It tries to create a bind mount at the same container path as one of the pre-specified volume mount points (/pulsar/data), which puts two mounts on the same path. Docker doesn't barf on this (for some reason), but it obviously makes the file system behave strangely, causing RocksDB to barf.
   
   PR #3918 mentions adding the ToLower() function to the -v option to "fix" the issue. This doesn't fix the issue at all. It avoids the error because it ends up mangling the -v option so that it mounts a volume at /pulsar/data.ToLower(). This doesn't collide with the pre-defined path, so RocksDB works, but it doesn't work as intended: Pulsar is configured to expect its data in the /pulsar/data directory, so /pulsar/data.ToLower() is never used.
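
   A small illustration of the mangling, assuming the ".ToLower()" text reaches Docker literally as described. The host path /home/me/data is hypothetical, and the split below is plain shell string handling, not Docker's own parser:

   ```shell
   # Assumed literal -v argument after the shell passes ".ToLower()" through
   # unchanged; the host path /home/me/data is hypothetical.
   spec='/home/me/data:/pulsar/data.ToLower()'

   # This two-field spec has the form host-path:container-path,
   # so split it at the last colon.
   host_path="${spec%:*}"
   container_path="${spec##*:}"

   printf 'host path:      %s\n' "$host_path"
   printf 'container path: %s\n' "$container_path"
   ```

   The container path comes out as /pulsar/data.ToLower(), which is not /pulsar/data, so the bind mount lands somewhere Pulsar never looks.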
   
   The fix for this is simple: instead of messing around with bind mounts, give it a volume mount like it expects. And since we presumably want the data to persist between "docker run" commands, we just have to give the volume a name.
   
   ```
   docker run -it \
   -p 6650:6650 -p 8080:8080  \
   --mount source=pulsardata,target=/pulsar/data \
   apachepulsar/pulsar:2.4.1 \
   bin/pulsar standalone
   ```
   
   Here is what docker inspect gives: 
   
   ```
           "Mounts": [
               {
                   "Type": "volume",
                   "Name": "6a41856b98b36f81d1bf1d1d196a59a026c82dd12aa50cee5bdf82295e7f670c",
                   "Source": "/var/lib/docker/volumes/6a41856b98b36f81d1bf1d1d196a59a026c82dd12aa50cee5bdf82295e7f670c/_data",
                   "Destination": "/pulsar/conf",
                   "Driver": "local",
                   "Mode": "",
                   "RW": true,
                   "Propagation": ""
               },
               {
                   "Type": "volume",
                   "Name": "pulsardata",
                   "Source": "/var/lib/docker/volumes/pulsardata/_data",
                   "Destination": "/pulsar/data",
                   "Driver": "local",
                   "Mode": "z",
                   "RW": true,
                   "Propagation": ""
               }
           ],
   ```
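
   If you want to manage the named volume yourself, a sketch along these lines may help. The function names are made up for illustration; the volume name "pulsardata" is the one from the command above.

   ```shell
   # Hypothetical helpers around the "pulsardata" volume name used above.

   # Create the named volume up front (docker run also creates it on demand).
   create_data_volume() {
     docker volume create pulsardata
   }

   # Show where Docker stores the volume on the host.
   inspect_data_volume() {
     docker volume inspect pulsardata
   }

   # Delete the volume and the data in it, once no container uses it.
   remove_data_volume() {
     docker volume rm pulsardata
   }
   ```

   Removing the container does not remove a named volume, so the last helper is the one to reach for when you want a clean start.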
   
   
   The other advantage of using a named Docker volume is that we don't have to mess around with path specifications, so there is no need for the $PWD variable, which is defined in PowerShell but not in Command Prompt (CMD). I am able to run the above command using either PowerShell or CMD, and it works reliably for me on Docker Desktop for Windows.
   
   Since the image expects the config data to be persisted, we should probably specify a name for that volume too, like this:
   
   ```
   docker run -it \
   -p 6650:6650 -p 8080:8080  \
   --mount source=pulsardata,target=/pulsar/data \
   --mount source=pulsarconf,target=/pulsar/conf \
   apachepulsar/pulsar:2.4.1 \
   bin/pulsar standalone
   ```
   
   I think just a documentation change is needed to resolve this issue. @junlia can you confirm whether my revised command above works for you?
   
   I am happy to put in a PR with the documentation change once this is confirmed to work outside my environment.
