Posted to commits@pinot.apache.org by "eupraxis1981 (via GitHub)" <gi...@apache.org> on 2023/06/26 18:28:31 UTC

[GitHub] [pinot] eupraxis1981 opened a new issue, #10974: Clean up table management CLI commands

eupraxis1981 opened a new issue, #10974:
URL: https://github.com/apache/pinot/issues/10974

   # What's the problem?
   Schemas and tables cannot be added separately via the CLI. The current AddTable command assumes that a table and its associated schema are uploaded together.
   
   * This prevents using AddSchema first: once the schema exists, a later AddTable fails because it flags the schema as a duplicate upload.
   * Sometimes you want to evolve your tables and/or schemas and need extra flexibility in when you define schemas versus the tables that rely on them.
   
   # What's the proposal?
   Modify the AddTable command signature so that you can add many tables at once without having to specify the schema separately on the command line.
   
   Also update AddSchema to allow updates/overwrites of schemas, not just inserts.
   
   ```
   Usage: AddTable -tableConfigFiles tableConfigFile1[,tableConfigFile2][,...] [OPTIONS] 
   Add tables defined in comma-separated list of tableConfig files. 
   NOTE: Each tableConfig file must specify the table schema and type.
   
   Options:
       -schemaFiles  Comma-separated list of schemaConfig files to add or update (default null)              
       -controllerProtocol  Protocol to use to connect to controller (default 'http') 
       -controllerPort  Port of controller (default 9000) 
       -user  Username (default null)
       -password  Password associated with user (default null) 
   ```
   
   Since the user has to specify the schema in the tableConfig, it is redundant to also require it on the command line. However, I see the usefulness of letting a user add or update a schema at the same time as uploading tables, as a convenience.
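
As a rough illustration of the proposed signature, the option set in the usage text above can be modeled with Python's `argparse`. This is only a sketch of the argument shape, not the actual Pinot admin tool (which is not written in Python); the helper names `comma_separated` and `build_parser` are hypothetical, and the option names and defaults are copied from the usage text.

```python
import argparse
from typing import List


def comma_separated(value: str) -> List[str]:
    """Split 'a.json,b.json' into ['a.json', 'b.json'], dropping empty items."""
    return [item.strip() for item in value.split(",") if item.strip()]


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical model of the proposed AddTable options.
    parser = argparse.ArgumentParser(
        prog="AddTable",
        description="Add tables defined in a comma-separated list of tableConfig files.",
    )
    # Required: one or more table config files.
    parser.add_argument("-tableConfigFiles", required=True, type=comma_separated)
    # Optional: schema files to add or update alongside the tables.
    parser.add_argument("-schemaFiles", type=comma_separated, default=None)
    parser.add_argument("-controllerProtocol", default="http")
    parser.add_argument("-controllerPort", type=int, default=9000)
    parser.add_argument("-user", default=None)
    parser.add_argument("-password", default=None)
    return parser


if __name__ == "__main__":
    # The file names below are made up for the example.
    args = build_parser().parse_args(
        ["-tableConfigFiles", "orders.json,shipping.json"]
    )
    print(args.tableConfigFiles, args.schemaFiles, args.controllerPort)
```

With this shape, omitting `-schemaFiles` simply leaves it unset, which matches the idea that the schema upload is a convenience rather than a requirement.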
   
   We should also expand `AddSchema` to actually be an UPSERT so that schemas can be managed more fully from the CLI as well.
   
   # Risk
   None -- this should be backwards compatible with scripts using the earlier syntax. The only small risk is that a schema could be overwritten instead of triggering a warning. Maybe add an `--overwrite` flag to enable overwrites, but keep it false by default?
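
The upsert-with-opt-in-overwrite behavior suggested here can be sketched against an in-memory registry. This is a minimal model under stated assumptions, not Pinot code: the `registry` dict stands in for the controller's schema store, and `add_schema` and `SchemaExistsError` are hypothetical names.

```python
from typing import Any, Dict


class SchemaExistsError(Exception):
    """Raised when a schema already exists and overwrite was not requested."""


def add_schema(
    registry: Dict[str, Dict[str, Any]],
    schema: Dict[str, Any],
    overwrite: bool = False,
) -> str:
    """Insert a schema, or replace an existing one only if overwrite=True.

    Returns "created" or "updated" so callers can report what happened.
    """
    name = schema["schemaName"]
    if name not in registry:
        registry[name] = schema
        return "created"
    if not overwrite:
        # Default behavior stays safe: existing schemas are never replaced silently.
        raise SchemaExistsError(f"schema '{name}' already exists; pass overwrite=True")
    registry[name] = schema
    return "updated"
```

Keeping `overwrite` false by default preserves the current insert-only behavior for existing scripts, while the explicit flag covers the schema-evolution case.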


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #10974: Clean up table management CLI commands

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10974:
URL: https://github.com/apache/pinot/issues/10974#issuecomment-1612203080

   Great suggestion!
   FYI, `schemaName` is not mandatory in the table config (we actually recommend not setting it), and by default the schema name should be the same as the table name. E.g. for the `myTable` table, the schema should also be named `myTable`.



[GitHub] [pinot] eupraxis1981 commented on issue #10974: Clean up table management CLI commands

Posted by "eupraxis1981 (via GitHub)" <gi...@apache.org>.
eupraxis1981 commented on issue #10974:
URL: https://github.com/apache/pinot/issues/10974#issuecomment-1612219817

   @Jackie-Jiang -- so if tableName is 1:1 with schemaName, then I agree that the schema shouldn't need to be defined in the table config. I assume the order of operations would then need to be addSchema followed by addTable(s), with the only linkage between the two being their name.
   
   This also means that I can't use the same schema for different tables (e.g., OrdersTable and ShippingTable, if both use the same columns but in different contexts). Not a big deal, as most people are used to a table and its schema being "two sides of the same coin".
   
   In that case, I'll leave the option -schemaFile in case the user wants to upload the table and schema at the same time. I'll remove the '-schemaName' option as it is unnecessary.
   
   Agree?



[GitHub] [pinot] Jackie-Jiang commented on issue #10974: Clean up table management CLI commands

Posted by "Jackie-Jiang (via GitHub)" <gi...@apache.org>.
Jackie-Jiang commented on issue #10974:
URL: https://github.com/apache/pinot/issues/10974#issuecomment-1612272348

   Yes, we don't need the `-schemaName` option. `schemaName` is embedded in the table config and is optional. When it is configured, we look up the schema using `schemaName`; when it is not configured, we look up the schema using the table name.
   Using a schema name that differs from the table name is not recommended, but we don't enforce that as of now.
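
The lookup rule described above can be sketched as a small function. This is an illustration only, not Pinot's implementation: `resolve_schema_name` is a hypothetical name, the table config is modeled as a plain dict, and placing `schemaName` under a `segmentsConfig` key is an assumption about the config layout. Any table-type suffix handling is ignored.

```python
from typing import Any, Mapping


def resolve_schema_name(table_config: Mapping[str, Any]) -> str:
    """Return the schema name to look up for a table config.

    Assumed layout: an optional "schemaName" under "segmentsConfig",
    and a mandatory top-level "tableName".
    """
    segments_config = table_config.get("segmentsConfig") or {}
    schema_name = segments_config.get("schemaName")
    if schema_name:
        # Explicitly configured: look up the schema by schemaName.
        return schema_name
    # Not configured: fall back to the table name.
    return table_config["tableName"]
```

For example, a config containing only `{"tableName": "myTable"}` resolves to the `myTable` schema, while a config that sets `schemaName` resolves to that name regardless of the table name.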

