You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2019/05/31 20:51:08 UTC

[GitHub] [incubator-superset] john-bodley opened a new pull request #5449: [schema] Updating the tables schema

john-bodley opened a new pull request #5449: [schema] Updating the tables schema
URL: https://github.com/apache/incubator-superset/pull/5449
 
 
   We've noticed a number of anomalies in our database caused by ill-defined forms and/or table schema definitions. This PR resolves a number of issues related to the `tables` table including:
   - Modifies the uniqueness constraint defined by PR https://github.com/apache/incubator-superset/pull/3978 (which was never created in a migration) to include the schema name (previously it was only based on the database/table). Note this requires reducing the size of the `schema` and `table_name` columns to 127 characters which is valid as most engines have a 64 character limit.
   - Removes the pre-add for checking for uniqueness. This is no longer required as the uniqueness constraint defines the same logic. This also only worked when adding a table and thus it was possible to have duplicates names if someone edited the record.
   - Adds form validation to ensure that the schema name is not part of the table name. 
   - Ensures the `table_name` column is non-nullable.
   
   ![Screen Shot 2019-03-22 at 5 09 58 PM](https://user-images.githubusercontent.com/4567245/54858973-63508380-4cc5-11e9-9c37-0c25e2b94500.png)
   
   Note this migration will fail if the `table_name` column is NULL. One must manually fix these records as programmatically trying to remedy these invalid records is difficult as the intent is unclear and the tables may function (from a query standpoint) if SQL is provided. The following query determines which records are problematic:
   ```
   SELECT 
       *
   FROM 
      tables
   WHERE 
      table_name IS NULL
   ```
   
   Additionally this migration _will_ fail if the `tables` table is corrupt  in terms of the uniqueness constraint. One **must** manually consolidate duplicate or non-valid records given there's no way of programmatically removing invalid records. The following query determines whether there are duplicates:
   ```
   SELECT 
       database_id,
       `schema`, 
       table_name,
       COUNT(1)
   FROM 
       tables
   GROUP BY 
       1, 2, 3
   HAVING 
       COUNT(1) > 1
   ORDER BY 
       COUNT(1) DESC
   ```
   
   Note this PR is gated by https://github.com/apache/incubator-superset/pull/5445 and https://github.com/apache/incubator-superset/pull/7084 which ensure that empty strings associated with form-data wont persist in the database and is necessary for ensuring that the relevant entries are non-NULL.
   
   to: @fabianmenges @graceguo-supercat @michellethomas @mistercrunch @timifasubaa 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org