Posted to commits@hudi.apache.org by "Pramod Biligiri (Jira)" <ji...@apache.org> on 2022/10/13 05:07:00 UTC
[jira] [Created] (HUDI-5024) Support storing database also as a Dataset in Datahub, not just a table
Pramod Biligiri created HUDI-5024:
-------------------------------------
Summary: Support storing database also as a Dataset in Datahub, not just a table
Key: HUDI-5024
URL: https://issues.apache.org/jira/browse/HUDI-5024
Project: Apache Hudi
Issue Type: Task
Components: meta-sync
Reporter: Pramod Biligiri
Note: Evaluate feasibility and desirability of this before implementing.
Hudi's DatahubSyncTool pushes only tables into Datahub as Datasets; it does not push the database itself as a Dataset. Datahub also appears, at first glance, to store only tables as Datasets and not the database itself, as its own demo page shows: [https://demo.datahubproject.io/browse/dataset/prod/postgres/calm-pagoda-323403/jaffle_shop]
However, some customers might want the database stored as a top-level entity as well. Consider enhancing DatahubSyncTool to support this, possibly by using some of Datahub's more advanced features.
There is an ongoing thread about this in the Datahub Slack: https://datahubspace.slack.com/archives/CUMUWQU66/p1665636994736379