Posted to dev@beam.apache.org by Stephen Sisk <si...@google.com.INVALID> on 2016/10/20 21:08:27 UTC

Integration Testing Sources

Hi all!

(I've recently joined the Beam community and I wanted to take this
opportunity to introduce myself. I previously worked on the Dataflow team
and I'm transitioning to working on Beam. I'm looking forward to getting
started.)

The Sources and Runners discussion seemed to be headed in two directions,
so I thought I would split part of that conversation into a new thread to
address Aljoscha's question of "Should we maybe add integration tests that
verify that all runners can correctly read from and write to an external
system in a complete Pipeline"? [1]

Having embedded data services run alongside the runner seems like an
expedient way to get some test coverage of runners' interactions with
sources & sinks, and it'd be an important part of the pre-commits or
post-commits for the runners.

I'm also interested in a problem related to what Aljoscha raised and am
starting to investigate having a cluster of machines available to run other
data services (HDFS, MongoDB, Redis, ActiveMQ, etc.) so we can get good
integration test coverage on the connectors themselves. I'm excited to hear
about the work JB has done in this area [2] and I'd be building off of
that/learning from that.

My goal would be to have automated integration tests running against real
instances of the data services. JB has been working with Mesos and Marathon;
in addition to those, I'm taking a quick look at Kubernetes and Docker
Swarm to see what would be easiest to maintain, what the tradeoffs are,
etc. If folks have experience they'd like to share with those tools, or
other tools worth looking into, I'd be excited to hear about it.

As Dan previously mentioned [3], we need a cluster available for
performance testing anyway, so this just adds more data services to run
on the cluster we likely already need.

[1] -
https://lists.apache.org/thread.html/0b15378d4b85e55e1e76c3c7ae8933a4de02171442c63f3cec7d1c8b@%3Cdev.beam.apache.org%3E

[2] -
https://lists.apache.org/thread.html/7b5e9c4e21f5d3a0698db2edf1422681afd64b40e7eb1e588c49e59d@%3Cdev.beam.apache.org%3E

[3] -
https://lists.apache.org/thread.html/ff083837b839cb3b90944d0db998d36aeee76008cec8e42c98490174@%3Cdev.beam.apache.org%3E


Thanks!
Stephen