Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/10/06 01:00:32 UTC

[GitHub] [incubator-superset] betodealmeida opened a new issue #11167: [SIP-54] Proposal to improve export/import of objects

betodealmeida opened a new issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167


   ## [SIP-54] Proposal to improve export/import of objects
   
   ### Motivation
   
   Superset provides functionality for exporting and importing objects. It can be used to export:
   
   - **Datasets**: metadata describing the dataset **definition** and **semantic layer** can be exported to YAML ([example](https://gist.github.com/betodealmeida/0f0fd576b7229a72863b02e0dc2d70f7));
   - **Databases**: metadata describing the database **connection**, as well as metadata describing contained **datasets**, can be exported to YAML ([example](https://gist.github.com/betodealmeida/8d4f06b2b002f06722f8c1330da778d1));
   - **Dashboards**: metadata describing the dashboard, contained charts and associated datasets can be exported to a single JSON file ([example](https://gist.github.com/betodealmeida/e5aace448357c2e404fe7c8769dc1bc1)).
   
   Currently, only dashboards can be imported. The import assumes that the databases are identical in both instances, with the same tables. In theory, databases [can have different names](https://github.com/apache/incubator-superset/pull/10118), but as @mistercrunch said, "the code in this area is not in a great shape", with bugs (https://github.com/apache/incubator-superset/issues/11028) and unexpected side effects (https://github.com/apache/incubator-superset/issues/10479). Additionally, since users must specify a single database when importing, dashboards with charts from multiple databases are currently not supported.
   
   The main motivation for this SIP is to **fix** and **improve** the import/export functionality, proposing a well-defined format for serializing and deserializing collections of Superset objects. This includes databases, datasources, charts and dashboards. The format specification will be versioned, providing backwards compatibility.
   
   Having a well-defined interchange format will not only prevent bugs but also allow us to build new functionality on top of it:
   
   - We can potentially include the data in the download (assuming it's smaller than a given threshold). This would allow users to, e.g., download a dashboard from a blog post or repository, load it into their instance, and explore it.
   - It would provide a foundation for file-based configuration of dashboards. Other tools like [Grafana](https://grafana.com/) allow users to define dashboards in files and store them under version control, and having a well-defined format will make it easier to implement a similar storage mechanism in Superset.
   - It would be easier to build tools that programmatically generate dashboards and charts.
   
   ### Proposed Change
   
   The implementation of this SIP will build upon work in progress that introduces UUIDs to the import/export mixins (https://github.com/apache/incubator-superset/pull/7829). Adding UUIDs to the Superset models will help prevent conflicts when importing and exporting, and will allow importing dashboards that are powered by different databases.
   
   This SIP introduces a specification for serializing Superset objects, focusing on **backwards compatibility** and **readability**. Objects (databases, datasources, charts and dashboards) will be serialized to YAML, one file per object. Files will be grouped into directories according to their type, and zipped together into a single file.
   
   For example, exporting the "Unicode Test" dashboard would result in a ZIP file with the following structure:
   
   ```
   # dashboard_unicode-test_20200923T173845.zip
   databases/examples.yaml
   datasources/unicode_test.yaml
   charts/unicode_cloud.yaml
   dashboards/unicode_test.yaml
   ```
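
As a rough illustration of the layout above, the following sketch packs already-serialized YAML documents into an in-memory ZIP file, one file per object, grouped by type. The names `DIRECTORIES`, `build_bundle` and `list_bundle` are hypothetical and not part of Superset; YAML serialization itself is left out, so the function takes the YAML text as input.

```python
import io
import zipfile

# Hypothetical mapping from object type to its directory inside the bundle,
# mirroring the layout shown above.
DIRECTORIES = {
    "database": "databases",
    "datasource": "datasources",
    "chart": "charts",
    "dashboard": "dashboards",
}


def build_bundle(objects):
    """Zip (object_type, slug, yaml_text) triples into one in-memory archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as bundle:
        for object_type, slug, yaml_text in objects:
            # One file per object, placed in the directory for its type.
            path = f"{DIRECTORIES[object_type]}/{slug}.yaml"
            bundle.writestr(path, yaml_text)
    return buffer.getvalue()


def list_bundle(data):
    """Return the file paths stored in a bundle, in archive order."""
    with zipfile.ZipFile(io.BytesIO(data)) as bundle:
        return bundle.namelist()
```

Building the archive in memory is only a convenience for the sketch; an actual export could just as well stream the ZIP file to the HTTP response.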
   
   Each object will be versioned, and use UUIDs for relationships:
   
   ```yaml
   # databases/examples.yaml
   version: 1.0.0
   id: e834a2be-439f-4cdf-bd55-0a1a32df7ceb
   title: Examples
   ...
   ```
   
   ```yaml
   # datasources/unicode_test.yaml
   version: 1.0.0
   id: 9e963850-e556-4c09-bc95-79d1c0c98724
   database_id: e834a2be-439f-4cdf-bd55-0a1a32df7ceb
   table_name: unicode_test
   ...
   ```
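
Because relationships are expressed as UUIDs, an importer could check that every reference in a bundle resolves before touching the metadata database. The sketch below assumes each file has already been parsed into a dictionary; `find_dangling_references` and its `reference_keys` parameter are hypothetical names, and `database_id` is the only reference field taken from the examples above.

```python
import uuid


def find_dangling_references(objects, reference_keys=("database_id",)):
    """Return (object id, key, target id) for references missing from the bundle."""
    for obj in objects:
        # uuid.UUID raises ValueError if an identifier is malformed.
        uuid.UUID(obj["id"])
    known_ids = {obj["id"] for obj in objects}
    dangling = []
    for obj in objects:
        for key in reference_keys:
            target = obj.get(key)
            if target is not None and target not in known_ids:
                dangling.append((obj["id"], key, target))
    return dangling
```

A reference that is missing from the bundle is not necessarily an error, since the target may already exist in the destination instance; the sketch only reports which references would need that lookup.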
   
   When importing, Superset will unzip the file and check if it can import the provided version. Exports in the current format (single YAML or JSON file) will continue being supported, even though they don't declare a version.
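
One possible shape for the version check is sketched below. It assumes that compatibility is decided by the major version only and that a file without a `version` key is a legacy export; both rules, as well as the names `SUPPORTED_MAJOR_VERSIONS` and `classify_export`, are assumptions made for illustration and are not defined by this SIP.

```python
# Assumption: the importer understands every 1.x.y file.
SUPPORTED_MAJOR_VERSIONS = {1}


def classify_export(config):
    """Classify a parsed export file as "legacy", "supported" or "unsupported"."""
    version = config.get("version")
    if version is None:
        # Exports in the current single-file format don't declare a version.
        return "legacy"
    major = int(str(version).split(".")[0])
    if major in SUPPORTED_MAJOR_VERSIONS:
        return "supported"
    return "unsupported"
```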
   
   On import, objects will be matched against existing objects based on UUID. In the case of a match, users will be presented with options to **upsert** (merge into existing object, with attributes on the file having precedence), **overwrite** or **ignore**. If there's no match, the object will be created if possible.
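
The three options could behave roughly as in the sketch below, which treats objects as plain dictionaries. `apply_import` is a hypothetical helper; in particular, reading **overwrite** as "replace the existing object entirely" is an interpretation, since the text above only spells out the semantics of **upsert**.

```python
def apply_import(existing, incoming, strategy):
    """Return the object to store after importing `incoming`.

    `existing` is the object with the same UUID, or None if there is no match.
    """
    if existing is None:
        # No match: the object is simply created.
        return dict(incoming)
    if strategy == "upsert":
        # Merge, with attributes from the file taking precedence.
        return {**existing, **incoming}
    if strategy == "overwrite":
        # Interpretation: drop the existing attributes and keep only the file's.
        return dict(incoming)
    if strategy == "ignore":
        # Keep the existing object untouched.
        return dict(existing)
    raise ValueError(f"unknown strategy: {strategy!r}")
```

For example, with an existing object that has a `description` and an incoming file that doesn't, upsert keeps the description while overwrite drops it.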
   
   For databases, sensitive information such as passwords and the `secure_extra` field would be omitted, requiring the administrator to manually provide them on import.
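
A minimal sketch of how secrets might be dropped on export, assuming the database is represented as a dictionary. Only `secure_extra` comes from the text above; the `password` and `sqlalchemy_uri` keys, the `MASK` placeholder and the `strip_secrets` helper are assumptions made for illustration.

```python
from urllib.parse import urlsplit, urlunsplit

# `secure_extra` is named in the proposal; `password` is an assumed key.
SENSITIVE_KEYS = {"password", "secure_extra"}
MASK = "XXXXXXXXXX"


def strip_secrets(database):
    """Return a copy of `database` without secrets, masking any URI password."""
    cleaned = {k: v for k, v in database.items() if k not in SENSITIVE_KEYS}
    uri = cleaned.get("sqlalchemy_uri")  # assumed field holding the connection URI
    if uri:
        parts = urlsplit(uri)
        if parts.password is not None:
            # netloc looks like "user:password@host:port"; keep everything
            # except the password.
            userinfo, _, hostinfo = parts.netloc.rpartition("@")
            username = userinfo.split(":", 1)[0]
            netloc = f"{username}:{MASK}@{hostinfo}"
            cleaned["sqlalchemy_uri"] = urlunsplit(parts._replace(netloc=netloc))
    return cleaned
```

On import, the administrator would then be prompted for whatever was removed or masked.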
   
   ### New or Changed Public Interfaces
   
   The implementation of this SIP would deprecate the `/import_dashboards` endpoint, replacing it with a more generic `/api/v1/import` endpoint for any object type. The initial scope of this SIP is to replace the existing "Import Dashboards" menu entry under "Settings" in the header, but in the future we could have additional navigation paths, e.g., an "Import Database" option close to the button to add a new database.
   
   The current import page is a CRUD UI generated by FAB, and it will be replaced with a React-based UI.
   
   ### New dependencies
   
   No new dependencies will be introduced.
   
   ### Migration Plan and Compatibility
   
   Implementing UUIDs on the `ImportMixin` would require adding a new column to the models to store the UUID, as well as populating existing objects with a new value. To simplify the import/export process, the migration script would also add UUIDs to the `position_json` field in dashboards, pointing to the object UUID instead of its primary key.
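
The two migration steps could look roughly like the sketch below, which works on plain dictionaries rather than database rows. It assumes that `position_json` is a JSON object whose chart nodes have `"type": "CHART"` and store the chart's primary key under `meta.chartId`; that layout, the `meta.uuid` key and both helper names are assumptions for illustration, not the actual migration.

```python
import json
import uuid


def assign_uuids(rows):
    """Give every row lacking a UUID a random one; return a primary key -> UUID map."""
    for row in rows:
        row.setdefault("uuid", str(uuid.uuid4()))
    return {row["id"]: row["uuid"] for row in rows}


def add_chart_uuids(position_json, chart_uuids):
    """Add the chart UUID next to each chart's primary key in a dashboard layout."""
    position = json.loads(position_json)
    for node in position.values():
        # Skip non-component entries such as a layout version marker.
        if isinstance(node, dict) and node.get("type") == "CHART":
            meta = node.setdefault("meta", {})
            chart_id = meta.get("chartId")
            if chart_id in chart_uuids:
                meta["uuid"] = chart_uuids[chart_id]
    return json.dumps(position)
```

Keeping the primary key alongside the UUID means the existing frontend code would continue to work while the import/export code switches to UUIDs.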
   
   ### Rejected Alternatives
   
   Other serialization formats were considered, but since we want files to be human-readable, [YAML stood out](https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats), especially because it doesn't introduce any new dependencies.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] issue-label-bot[bot] commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
issue-label-bot[bot] commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-703968402


   Issue-Label Bot is automatically applying the label `#enhancement` to this issue, with a confidence of 0.98. Please mark this comment with :thumbsup: or :thumbsdown: to give our bot feedback! 
   
    Links: [app homepage](https://github.com/marketplace/issue-label-bot), [dashboard](https://mlbot.net/data/apache/incubator-superset) and [code](https://github.com/hamelsmu/MLapp) for this bot.


[GitHub] [incubator-superset] bkyryliuk commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
bkyryliuk commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-705798018


   > > @betodealmeida would you see backing up / powering charts & dashboards from the github as a continuation of this effort ?
   > 
   > I think we can come up with nice abstractions when developing the import, so that they can be reused when developing a mechanism for filesystem-based configuration. But I'm not sure on the timeline for that, it's not a high priority AFAIK.
   
   it would be great to keep that in mind, thank you! 


[GitHub] [incubator-superset] scriminaci commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
scriminaci commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-728788588


   Hi! I think this would be a significant improvement. We currently export dashboards and we manipulate the output JSON to be able to recreate them for a different database (we provide the same dashboards to different customers). So basically we are using the exports as templates. 
   
   Another relevant thing that the import does not support today is using different databases in the same dashboard or the same export: you can specify only one database when importing.
   
   Thanks!


[GitHub] [incubator-superset] villebro commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
villebro commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-712397245


   I think this is a great initiative, really looking forward to all that this will enable, both stability and new functionality! 👍 


[GitHub] [incubator-superset] betodealmeida commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
betodealmeida commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-705721667


   > @betodealmeida would you see backing up / powering charts & dashboards from the github as a continuation of this effort ?
   
   I think we can come up with nice abstractions when developing the import, so that they can be reused when developing a mechanism for filesystem-based configuration. But I'm not sure on the timeline for that, it's not a high priority AFAIK.


[GitHub] [incubator-superset] CoryChaplin commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
CoryChaplin commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-706522541


   > > @betodealmeida would you see backing up / powering charts & dashboards from the github as a continuation of this effort ?
   > 
   > I think we can come up with nice abstractions when developing the import, so that they can be reused when developing a mechanism for filesystem-based configuration. But I'm not sure on the timeline for that, it's not a high priority AFAIK.
   
   I think there's a nice use case for "official dashboards" that can be versioned, with changes reviewed and deployed. And even more so for people running multiple instances of Superset, starting with dev/staging/prod or multi-team/multi-customer versions.


[GitHub] [incubator-superset] eugeniamz commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
eugeniamz commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-713835774


   What about including export and import queries?


[GitHub] [incubator-superset] bkyryliuk commented on issue #11167: [SIP-54] Proposal to improve export/import of objects

Posted by GitBox <gi...@apache.org>.
bkyryliuk commented on issue #11167:
URL: https://github.com/apache/incubator-superset/issues/11167#issuecomment-704429426


   @betodealmeida would you see backing up / powering charts & dashboards from the github as a continuation of this effort ?

