Posted to user@flink.apache.org by Dylan Adams <dy...@gmail.com> on 2018/11/28 21:09:08 UTC

Updating multiple database tables

Hello,

I was hoping to get input from the Flink community about patterns for
handling multiple dependent RDBMS updates. A textbook example would be
order & order_line_item tables. I've encountered a few approaches to
this problem, and I'm curious to see if there are others, and the
benefits & drawbacks of those solutions.

Multiple JDBCOutputFormats. This is possible if you use an
application-generated primary key, such as a UUID. The drawback is that
the tables are only eventually consistent with each other.
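
To illustrate the idea, here is a minimal sketch, assuming hypothetical
OrderRow and LineItemRow types (none of these names come from Flink).
Because the key is assigned in the application, the line item rows can
carry the foreign key before anything is written, so each
JDBCOutputFormat can persist its table independently:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.UUID;

// Hypothetical row types; class, table, and column names are illustrative only.
class OrderKeys {
    static final class OrderRow {
        final String id;        // application-generated primary key
        final String customer;
        OrderRow(String id, String customer) {
            this.id = id;
            this.customer = customer;
        }
    }

    static final class LineItemRow {
        final String id;        // application-generated primary key
        final String orderId;   // foreign key, known before any insert happens
        final String sku;
        LineItemRow(String id, String orderId, String sku) {
            this.id = id;
            this.orderId = orderId;
            this.sku = sku;
        }
    }

    // Assigns the order's primary key in the application rather than in the database.
    static OrderRow newOrder(String customer) {
        return new OrderRow(UUID.randomUUID().toString(), customer);
    }

    // Builds one line item row per SKU, each stamped with the order's key.
    static List<LineItemRow> lineItemsFor(OrderRow order, List<String> skus) {
        List<LineItemRow> rows = new ArrayList<>();
        for (String sku : skus) {
            rows.add(new LineItemRow(UUID.randomUUID().toString(), order.id, sku));
        }
        return rows;
    }

    public static void main(String[] args) {
        OrderRow order = newOrder("alice");
        for (LineItemRow item : lineItemsFor(order, Arrays.asList("sku-1", "sku-2"))) {
            System.out.println(item.orderId + " " + item.sku);
        }
    }
}
```

The two row streams would then be routed to separate sinks. Nothing
here orders the writes, which is why a line item may briefly be visible
before its order.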

JDBC-emitting function and JDBCOutputFormat. When the primary key is
generated by the database, the program uses a JDBC-emitting
MapFunction to persist records for one table and retrieve the generated
PK. The other table is then persisted using JDBCOutputFormat. Also only
eventually consistent.
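
The insert-and-retrieve step could look roughly like the sketch below.
It uses only the standard JDBC generated-keys API; the class name, the
orders table, and its columns are assumptions for illustration, and the
caller is assumed to supply an open Connection (in Flink, this logic
would typically sit inside the map function).

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

// Hypothetical helper; table and column names are illustrative only.
class GeneratedKeyInsert {
    static final String INSERT_ORDER = "INSERT INTO orders (customer) VALUES (?)";

    // Inserts one order row and returns the primary key the database generated for it.
    static long insertOrder(Connection conn, String customer) throws SQLException {
        try (PreparedStatement ps =
                 conn.prepareStatement(INSERT_ORDER, Statement.RETURN_GENERATED_KEYS)) {
            ps.setString(1, customer);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                if (!keys.next()) {
                    throw new SQLException("database returned no generated key");
                }
                return keys.getLong(1);
            }
        }
    }
}
```

The returned key can then be attached to the line item records before
they are emitted downstream to the JDBCOutputFormat.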

JDBCOutputFormat and database-specific features. The most broadly
supported would be stored procedures, but other mechanisms, such as
CTEs, could also be used. Atomic, but relies on non-portable,
database-specific features.

Custom OutputFormat. Full control, allows for atomic updates at the
cost of maintaining custom OutputFormats for each combination of
updated tables.
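
As a rough sketch of the write logic such a format might contain, the
example below inserts an order and its line items in a single JDBC
transaction. The class name, table names, and columns are assumptions
for illustration, and the Flink wiring (opening the connection and
calling this once per record) is left out.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

// Hypothetical helper; table and column names are illustrative only.
class AtomicOrderWriter {
    static final String INSERT_ORDER =
        "INSERT INTO orders (id, customer) VALUES (?, ?)";
    static final String INSERT_ITEM =
        "INSERT INTO order_line_item (order_id, sku) VALUES (?, ?)";

    // Writes the order and all of its line items, or nothing at all.
    static void write(Connection conn, String orderId, String customer, List<String> skus)
            throws SQLException {
        boolean previousAutoCommit = conn.getAutoCommit();
        conn.setAutoCommit(false); // start an explicit transaction
        try (PreparedStatement order = conn.prepareStatement(INSERT_ORDER);
             PreparedStatement item = conn.prepareStatement(INSERT_ITEM)) {
            order.setString(1, orderId);
            order.setString(2, customer);
            order.executeUpdate();
            for (String sku : skus) {
                item.setString(1, orderId);
                item.setString(2, sku);
                item.addBatch();
            }
            item.executeBatch();
            conn.commit(); // both tables become visible together
        } catch (SQLException e) {
            conn.rollback(); // undo the partial write before rethrowing
            throw e;
        } finally {
            conn.setAutoCommit(previousAutoCommit);
        }
    }
}
```

This makes each order atomic at the database level. It does not by
itself address replays after a Flink failure, so the inserts would
still need to be idempotent or otherwise deduplicated.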

Has anyone seen any other approaches to this challenge?

Regards,
Dylan