Posted to dev@bahir.apache.org by Arne Zachlod <ar...@nerdkeller.org> on 2019/07/16 20:24:57 UTC
how is the Spark/Bahir dataflow with MQTT?
Hello,
sorry if this is kind of a beginner question to ask, but I couldn't find
any documentation on this. I'm using PySpark 2.4.3 running with the
Bahir git master, and everything seems to work great, thank you for that.
I haven't done any real scaling tests yet, but I was wondering how the
flow of data works with Bahir. I have a single DStream created by
MQTTUtils.createStream(), and according to my mosquitto logs this seems
to create a single MQTT listener. So, my question is: is that correct,
or did I do something wrong?
My original plan was to use some DNS trickery to scale beyond what a
single machine can deliver over the network. Is that still possible?
Basically, I wanted an MQTT subscriber per Spark worker, if that is
supported.
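
To make the idea concrete, here is a minimal sketch of one way to get
several MQTT subscribers: create one input DStream per topic and union
them, which is the general Spark Streaming pattern for running multiple
receivers. The helper name union_mqtt_streams, the broker URL, and the
topic names are hypothetical, and the create_stream argument stands in
for MQTTUtils.createStream so the helper can be tried without a cluster.
It assumes the data is already split across topics, since plain MQTT
subscribers on the same topic would each receive every message.

```python
def union_mqtt_streams(ssc, create_stream, broker_url, topics):
    """Create one input DStream per topic and union them into one DStream.

    ssc           -- a StreamingContext (anything with a union(*streams) method)
    create_stream -- callable (ssc, broker_url, topic) -> DStream,
                     e.g. MQTTUtils.createStream from Bahir
    broker_url    -- MQTT broker URL, e.g. "tcp://broker.example:1883" (made up)
    topics        -- list of topic names, one receiver is created per topic
    """
    if not topics:
        raise ValueError("need at least one topic")

    # Each call creates a separate input DStream, and so a separate receiver.
    streams = [create_stream(ssc, broker_url, topic) for topic in topics]

    # A single stream needs no union.
    if len(streams) == 1:
        return streams[0]

    # StreamingContext.union merges several DStreams into one.
    return ssc.union(*streams)


# Hypothetical wiring with PySpark and Bahir (not executed here):
#
#   from pyspark.streaming import StreamingContext
#   from mqtt import MQTTUtils
#
#   ssc = StreamingContext(sc, 5)
#   lines = union_mqtt_streams(
#       ssc, MQTTUtils.createStream,
#       "tcp://broker.example:1883", ["sensors/0", "sensors/1"])
```

Each receiver occupies one core on an executor, so a layout like this
would need more cores than receivers for any processing to happen.
Whether Spark places the receivers on different workers is up to its
scheduler, so this is not a guaranteed subscriber-per-worker setup.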
Any pointers to documentation or examples would be greatly
appreciated.