Posted to user@storm.apache.org by "Martin, Nick" <Ni...@pssd.com> on 2014/03/26 01:47:00 UTC

Storm + RDBMS

Hi team,

I've been quietly watching Storm progress and I think I'm ready to dive in.

The use case is streaming from Oracle into HDFS. On the fly, I'd like to do some aggregation (e.g. sum up the lines on an invoice into an order amount) and translation/modification (e.g. every time order code '5' comes across, I update it to 'EDI'). You get the idea. We typically use Sqoop to do "dumb" pulls from RDBMS sources and then run various operations on the data once it is in HDFS/Hive, but I'd like to explore whether Storm would let us scale back our footprint of transformation jobs.
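
To make the two transformations concrete, here is a rough sketch of the per-record logic in plain Java. It deliberately uses no Storm API; the class name, the InvoiceLine fields, and the sample values are all made up for illustration and are not taken from our actual schema.

```java
import java.math.BigDecimal;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: plain Java only, no Storm classes. All names are illustrative.
class InvoiceTransform {

    // One invoice line as it might arrive from Oracle (assumed shape).
    static final class InvoiceLine {
        final String orderId;
        final String orderCode;
        final BigDecimal amount;

        InvoiceLine(String orderId, String orderCode, BigDecimal amount) {
            this.orderId = orderId;
            this.orderCode = orderCode;
            this.amount = amount;
        }
    }

    // Translation step: rewrite order code "5" to "EDI"; pass other codes through.
    static String translateOrderCode(String code) {
        return "5".equals(code) ? "EDI" : code;
    }

    // Aggregation step: sum the line amounts for each order id.
    static Map<String, BigDecimal> sumByOrder(List<InvoiceLine> lines) {
        Map<String, BigDecimal> totals = new LinkedHashMap<>();
        for (InvoiceLine line : lines) {
            totals.merge(line.orderId, line.amount, BigDecimal::add);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<InvoiceLine> lines = Arrays.asList(
            new InvoiceLine("A100", "5", new BigDecimal("10.50")),
            new InvoiceLine("A100", "5", new BigDecimal("4.50")),
            new InvoiceLine("B200", "7", new BigDecimal("3.00")));

        // Print the per-order totals and one translated code.
        System.out.println(sumByOrder(lines));
        System.out.println(translateOrderCode(lines.get(0).orderCode));
    }
}
```

My assumption is that in a Storm topology the translation would sit in one bolt and the running totals in another, with tuples grouped by order id so that all lines of an order reach the same task. This batch-style version only shows the intended result, not how it would be wired up.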

I've spent some time digging through the docs and the web, and found this:
https://github.com/nathanmarz/storm-contrib/tree/master/storm-rdbms

It seems to be a good starting place, but I'm wondering whether I'm missing some existing documentation for this kind of Storm application. If not, that's totally fine; I just wanted to ask the group before I start piecing it together from the existing documentation and examples.

Thanks in advance for any nudges in the right direction. I can't wait to get started.

Best,
Nick