Posted to dev@beam.apache.org by Stephen Sisk <si...@google.com.INVALID> on 2016/12/08 02:18:06 UTC

Container Orchestration software for hosting data stores

Hi,

I wanted to give a quick update on my investigation of container management
systems/orchestration software. We want to use one of these to host
instances of data stores for IO transform testing, as discussed in [1].

We wanted to compare:

* kubernetes - the completely open source version that can run on-prem/on
any cloud

* mesos+marathon - I used DC/OS's distribution, which has prepackaged
support for the environment I was testing (let me know if there's a better
version to be using)

* docker swarm - I haven't had time to look into this yet. I'm open to
someone else doing an investigation or strongly speaking up in favor of it.

I wanted to see what it would be like for us to use these for our purposes,
so I looked at the following:

1. How easy is it to set up the primary node/agent node? This affects
community members' ability to run the tests locally, so it represents a
barrier to entry for doing IO work.

2. How well does it support hooking up a disk of some sort? Most data
stores will require persistent storage of some sort.

3. How easy is it to set up a basic single-node instance of a data store
(postgres in this case)? This will affect the cost of creating new IO
transforms.


Next, I want to look at:

4. How easy is it to set up/configure multi-node stores? For example, the
current proposal is that we would run 3-5 node instances of the various
stores for the performance tests, so we'll want to be creating these.

5. How quickly does it set up/tear down new instances of a data store? This
affects how often we can run the tests.
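
For #5, a rough way to measure this could look like the sketch below. It is
only an illustration, not something from my scripts: the function names are
made up, and the commands are placeholders that you would swap for the real
create/delete commands of whichever orchestrator is being tested. It also
only measures how long each command takes to return, which is not the same
as the data store being ready to accept connections.

```python
import subprocess
import sys
import time


def time_command(argv):
    """Run argv to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)  # raises if the command fails
    return time.perf_counter() - start


def time_cycle(setup_argv, teardown_argv):
    """Time one setup/teardown cycle; returns (setup_s, teardown_s)."""
    setup_s = time_command(setup_argv)
    teardown_s = time_command(teardown_argv)
    return setup_s, teardown_s


if __name__ == "__main__":
    # Placeholder commands so the sketch runs anywhere; substitute the real
    # create/delete commands for the orchestrator under test.
    noop = [sys.executable, "-c", "pass"]
    setup_s, teardown_s = time_cycle(noop, noop)
    print(f"setup: {setup_s:.2f}s, teardown: {teardown_s:.2f}s")
```

Running the cycle several times and looking at the spread would probably be
more useful than a single measurement.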

If other folks have important scenarios we should be looking at, let me
know. If someone has time in the next few days to do #4 and #5 for mesos,
let me know and we'll co-ordinate. Otherwise I will get to it soon.

Overall, once they are up and running, both seem pretty similar. Here are
the areas where I found noteworthy differences:

* I found kubernetes much simpler to set up. Mesos has a lot of
dependencies that must be managed, and debugging errors in mismatched
versions of various dependencies was very time consuming. The list of
commands I had to use to set up mesos is at [2], but the quick summary is
that it is ~30 items long, while for kubernetes I basically just typed one
command.


* mesos' persistent disk feature is in beta, which isn't great. However, it
seems to work fine for the limited purposes of my testing, and since we're
not planning to use it for long-running instances we could in theory just
use non-persistent disks anyway.


* It's worth noting that I used postgres, which both mesos+marathon and
kubernetes have good example configs for. However, I believe setting up
other data stores should be similarly straightforward.


* kubernetes' persistent disk setup was more complicated, but gives us a
"true" persistent disk that outlives the cluster, so the current comparison
is not apples to apples.

This is just a quick update so I'm deliberately not making any
recommendations - I'd like to wait to see how these behave when we use them
with multi-node stores. Other folks may have thoughts on the differences I
found since there are likely people much more experienced with mesos than I
am on this mailing list. :)

If folks are curious, the scripts and some basic comments for what I did
are up at [3]. The current versions are very much drafts, but I'd rather show
my work mid-way than wait longer on updating the group. (e.g. I don't like
the term "support" I used for the directory; I included usernames/passwords
in the scripts, which is not okay for the real version; and the scripts need
to be parameterized and are not immediately runnable as-is.)
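
One possible way to get the credentials out of the scripts and parameterize
them is to read the settings from the environment. The sketch below is just
an illustration of that idea, not what is in [3]: the function name and the
STORE_* variable names are hypothetical, and the only real-world detail
assumed is postgres' default port of 5432.

```python
import os


def load_store_config(env=None):
    """Build data store connection settings from environment variables.

    The STORE_* names are hypothetical; nothing here is hard-coded except
    the fallback port.
    """
    env = os.environ if env is None else env
    required = ["STORE_HOST", "STORE_USER", "STORE_PASSWORD"]
    missing = [name for name in required if not env.get(name)]
    if missing:
        # Fail early rather than falling back to a baked-in credential.
        raise ValueError("missing settings: " + ", ".join(missing))
    return {
        "host": env["STORE_HOST"],
        "port": int(env.get("STORE_PORT", "5432")),  # postgres default port
        "user": env["STORE_USER"],
        "password": env["STORE_PASSWORD"],
    }
```

A caller would export the variables before running a script, and a missing
value would stop the run with a clear error instead of silently using a
default password.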

Thanks,

Stephen


[1] "Hosting data stores for IO Transform testing" -
https://lists.apache.org/thread.html/367fd9669411f21c9ec1f2d27df60464f49d5ce81e6bd16de401d035@%3Cdev.beam.apache.org%3E

[2] Mesos setup script I followed:
https://github.com/ssisk/incubator-beam/blob/support/support/mesos/setup.md

[3] The scripts/links I used for setting up mesos & kubernetes:
https://github.com/ssisk/incubator-beam/tree/support/support