Posted to dev@airflow.apache.org by Gerard Toonstra <gt...@gmail.com> on 2018/01/21 19:37:50 UTC

Data Vault example with airflow...

I've added a new example that uses the "Data Vault" methodology, available here:

https://gtoonstra.github.io/etl-with-airflow/datavault.html

What I find compelling about Data Vault is that it lets you store data in a
flexible way and regenerate a downstream star schema from scratch on the
fly (or even several versions of it). This allows you to defer design
decisions, which takes much of the compromise and debate out of settling
them and working out what they mean for analysis. It also lets you redefine
high-impact business metrics and compare the effect on analytics.
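
To make that concrete, here is a minimal Python sketch of the idea. It is
not taken from the linked example: the hub and satellite contents, the
column names and the "large customer" threshold are all made up for
illustration, and a real setup would keep these as database tables rather
than in-memory lists.

```python
from datetime import date

# Hypothetical hub: one row per business key.
hub_customer = [
    {"customer_key": 1, "business_id": "C-001"},
    {"customer_key": 2, "business_id": "C-002"},
]

# Hypothetical satellite: append-only history of descriptive attributes.
sat_customer = [
    {"customer_key": 1, "load_date": date(2018, 1, 1), "name": "Acme", "revenue": 900},
    {"customer_key": 1, "load_date": date(2018, 1, 15), "name": "Acme Corp", "revenue": 1500},
    {"customer_key": 2, "load_date": date(2018, 1, 10), "name": "Globex", "revenue": 400},
]


def latest_attributes(satellite, as_of):
    """Return the most recent satellite row per key loaded on or before as_of."""
    latest = {}
    for row in sorted(satellite, key=lambda r: r["load_date"]):
        if row["load_date"] <= as_of:
            latest[row["customer_key"]] = row  # later rows overwrite earlier ones
    return latest


def build_dim_customer(hub, satellite, as_of, large_threshold):
    """Rebuild the customer dimension from scratch for one metric definition."""
    attrs = latest_attributes(satellite, as_of)
    dim = []
    for h in hub:
        row = attrs.get(h["customer_key"])
        if row is None:
            continue  # no attributes loaded yet for this key at as_of
        dim.append({
            "business_id": h["business_id"],
            "name": row["name"],
            "segment": "large" if row["revenue"] >= large_threshold else "small",
        })
    return dim


# Two versions of the same dimension, derived from the same stored history.
dim_v1 = build_dim_customer(hub_customer, sat_customer, date(2018, 1, 31), 1000)
dim_v2 = build_dim_customer(hub_customer, sat_customer, date(2018, 1, 31), 300)
```

Because the vault keeps the full history, changing the definition of
"large" only means rebuilding the dimension; nothing upstream has to be
reloaded, and both versions can be compared side by side.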


Also, a great reading tip if you haven't seen it yet: "Functional Data
Engineering" on Medium.

https://medium.com/@maximebeauchemin/functional-data-engineering-a-modern-paradigm-for-batch-data-processing-2327ec32c42a

Rgds,

Gerard