You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2021/09/06 09:58:14 UTC

[GitHub] [superset] EBoisseauSierra opened a new pull request #16606: [WIP] fix(import-dashboards): Match db via name not UUID

EBoisseauSierra opened a new pull request #16606:
URL: https://github.com/apache/superset/pull/16606


   ### SUMMARY
   
   NB: We're here discussing the `import-dashboards` CLI with the `VERSIONED_EXPORT` feature flag *turned ON* — cf. #11349
   
   Currently, `superset import-dashboards` looks up at the required databases (`export/databases/*.yaml`). If the db URI (`sqlalchemy_uri` key) contains a masked password, then the script looks up for already existing databases record.
   
   The match is currently done using db UUID. However, '<insert how UUIDs are generated>', the same db connection (same URI) might not have the same UUID on the export instance, and on the import one. (Cf. #16395)
   
   Thus, we propose to lookup for the db _name_ instead — this would also allow for updating the db URI between the export and the import instance.
   
   ### TESTING INSTRUCTIONS
   <!--- Required! What steps can be taken to manually verify the changes? -->
   
   WIP — help required
   
   ### ADDITIONAL INFORMATION
   <!--- Check any relevant boxes with "x" -->
   <!--- HINT: Include "Fixes #nnn" if you are fixing an existing issue -->
   - [x] Has associated issue: #16395 
   - [x] Required feature flags: `"VERSIONED_EXPORT": True`
   - [ ] Changes UI
   - [ ] Includes DB Migration (follow approval process in [SIP-59](https://github.com/apache/superset/issues/13351))
     - [ ] Migration is atomic, supports rollback & is backwards-compatible
     - [ ] Confirm DB migration upgrade and downgrade tested
     - [ ] Runtime estimates and downtime expectations provided
   - [ ] Introduces new feature or API
   - [ ] Removes existing feature or API
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] EBoisseauSierra commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
EBoisseauSierra commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-945786830


   @betodealmeida , I think we can close this PR. Indeed, I reckon I raised an XY problem: what I want to be able to do is use different SQLA URI in dev (instance generating the export) and in prod (instance importing the export), without having to modify the archive/export itself.
   
   I'm still unsure how UUIDs are generated. However, it doesn't really matter as — as hinted by @PowerPlop — changing the SQLA URI of a db connection using its name as lookup key… is exactly what is already happening with `set-database-uri -d <db_connection_name>`! So running this command _after_ the `import-dashboards` would tick the box.
   
   The reason why we did it beforehand is that importing an export/archive containing a masked password in the `sqlalchemy_uri` key of a `export/databases/*.yaml` file fails, due to [this `ValidationError`](https://github.com/apache/superset/blob/master/superset/databases/schemas.py#L601-L602) — happening on a blank instance, at least.
   
   ➥ long story short, we still have a problem, but matching db connections with names rather than UUID is probably not the best way to address it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] EBoisseauSierra commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
EBoisseauSierra commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938139484


   With hindsight, it might be an XY problem I've been trying to solve.
   
   Indeed, what we eventually want to achieve is to “clone” one dev instance into production — with the slight difference that our (assuming only) database on dev would be connected to `postgresql://devuser:Tr0ub4dor&3@dev.acme.org/mock_data`, while on  prod it would query `postgresql://produser:correcthorsebatterystaple@prod.acme.org/client_data`.
   
   Our process so far is:
   
   * on dev:
     1. `superset export-dashboards`
     2. `git push`
   * on prod:
     1. `superset set-database-uri -d foobar -u postgresql://produser:correcthorsebatterystaple@prod.acme.org/client_data`
     2. `git pull`
     3. `superset import-dashboards`
     
     … and we'd like the import to use the already existing `foobar` db connection (i.e. to the prod db), rather than using the URI found in the export (which is pointing to the dev db) — using only the db name (`foobar`) as the matching key (and without having to manually tinker with the URI in the export archive, so that all can be automated nicely).
     
    (I know this should rather be a new Issue, I'll format it as such in the upcoming days.) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] EBoisseauSierra commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
EBoisseauSierra commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938695411


   @PowerPlop that could be a good option indeed. However, for that, I'd need to have _successfully_ imported the dashboards first. And for this to be possible, because the password (of the dev db URI) is masked in the export archive, I'd need to establish — as far as I understand — a _successful_ connection to the dev db (because it's the one in the export), from the prod VM.
   
   This is obviously better than the other way around, but is still not perfect, as this would mean having to fiddle with the dev db using prod details (e.g. allowing connection from the prod VM in dev's `pg_hba.conf`) for it to work.
   
   But that opens an interesting alternative solution: adding a flag for `import-dashboards` to work without ensuring that the db connection is successful (so that dashboards are effectively imported, yet with the db connection broken), and _then_, fixing the db connection as you suggests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] EBoisseauSierra commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
EBoisseauSierra commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938139484






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] betodealmeida commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
betodealmeida commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-937904442


   Let me take a look at the original problem again. I'm worried about this PR because the assumption that we match on UUID is baked on a lot of the design decisions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] krsnik93 commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
krsnik93 commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938152821


   It's not just the above where the database URI's differ, it's that if you have the same database URI on two different machines, they will have different UUID's and the import will fail. So the PR is about making the import/export a bit more portable from one host to another. @EBoisseauSierra please correct if I am wrong.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] krsnik93 commented on a change in pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
krsnik93 commented on a change in pull request #16606:
URL: https://github.com/apache/superset/pull/16606#discussion_r724229485



##########
File path: superset/commands/importers/v1/__init__.py
##########
@@ -112,8 +112,11 @@ def validate(self) -> None:
                     # populate passwords from the request or from existing DBs
                     if file_name in self.passwords:
                         config["password"] = self.passwords[file_name]
-                    elif prefix == "databases" and config["uuid"] in db_passwords:
-                        config["password"] = db_passwords[config["uuid"]]
+                    elif (
+                        prefix == "databases"
+                        and config["database_name"] in db_passwords

Review comment:
       @EBoisseauSierra replacing `config["database_name"]` with `config.get("database_name")` should make the only failing test pass.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] EBoisseauSierra commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
EBoisseauSierra commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938159662


   @krsnik93 you're right: this PR aimed to fix an issue we had where using _the same_ URI on two different machine yield to two different UUID (I haven't investigated where they are generated) — so the matching at import is failing (the pw in the archive is masked, and even if the URI/db connection already exist in Superset thanks to the set-database-uri, it is not reconciled because of the different UUIDs).
   
   But this issue of non-deterministic/reproducible UUIDs (and thus, failing import) — that my PR tries to solve by using the db connection name as lookup key — is, IMHO, a subset of a more generic problem that is “how to import-dashboards, but update the db URI on the fly?” (or use the previously instantiated db connection, using db connection name as lookup key).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] PowerPlop commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
PowerPlop commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938671516


   @EBoisseauSierra  why not modify the URI after importing the dashboards?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] krsnik93 commented on a change in pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
krsnik93 commented on a change in pull request #16606:
URL: https://github.com/apache/superset/pull/16606#discussion_r724229485



##########
File path: superset/commands/importers/v1/__init__.py
##########
@@ -112,8 +112,11 @@ def validate(self) -> None:
                     # populate passwords from the request or from existing DBs
                     if file_name in self.passwords:
                         config["password"] = self.passwords[file_name]
-                    elif prefix == "databases" and config["uuid"] in db_passwords:
-                        config["password"] = db_passwords[config["uuid"]]
+                    elif (
+                        prefix == "databases"
+                        and config["database_name"] in db_passwords

Review comment:
       @EBoisseauSierra replacing `config["database_name"]` with `config.get("database_name")` should make the only failing test pass.  




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] betodealmeida commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
betodealmeida commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-937904442


   Let me take a look at the original problem again. I'm worried about this PR because the assumption that we match on UUID is baked on a lot of the design decisions.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] krsnik93 commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
krsnik93 commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-938152821


   It's not just the above where the database URI's differ, it's that if you have the same database URI on two different machines, they will have different UUID's and the import will fail. So the PR is about making the import/export a bit more portable from one host to another. @EBoisseauSierra please correct if I am wrong.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] PowerPlop commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
PowerPlop commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-942066794


   @EBoisseauSierra  when I import a dashboard via the UI (or API), it does not seem to validate if the database connection is valid?
   
   Our process is:
   
   1. Develop dashboard in dev/locally
   2. Export to zip
   3. Commit unzipped
   4. import dashboard in PRD (zipped) via API
   5. update database connection url (manually or scripted / first time only)
   
   When updating an existing dashboard we simply import them again with option 'Overwrite' for dashboard,chart and dataset. The database record is not changed (only at initial import).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] PowerPlop commented on pull request #16606: fix(import-dashboards): Match db via name not UUID

Posted by GitBox <gi...@apache.org>.
PowerPlop commented on pull request #16606:
URL: https://github.com/apache/superset/pull/16606#issuecomment-942066794


   @EBoisseauSierra  when I import a dashboard via the UI (or API), it does not seem to validate if the database connection is valid?
   
   Our process is:
   
   1. Develop dashboard in dev/locally
   2. Export to zip
   3. Commit unzipped
   4. import dashboard in PRD (zipped) via API
   5. update database connection url (manually or scripted / first time only)
   
   When updating an existing dashboard we simply import them again with option 'Overwrite' for dashboard,chart and dataset. The database record is not changed (only at initial import).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org