You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@shardingsphere.apache.org by GitBox <gi...@apache.org> on 2020/07/19 01:41:53 UTC

[GitHub] [shardingsphere] tristaZero commented on a change in pull request #6377: support mysql insert select statement route and rewrite

tristaZero commented on a change in pull request #6377:
URL: https://github.com/apache/shardingsphere/pull/6377#discussion_r456796446



##########
File path: shardingsphere-sql-parser/shardingsphere-sql-parser-binder/src/main/java/org/apache/shardingsphere/sql/parser/binder/statement/dml/InsertStatementContext.java
##########
@@ -78,6 +83,19 @@ public InsertStatementContext(final SchemaMetaData schemaMetaData, final List<Ob
         return result;
     }
     
+    private Optional<InsertSelectContext> getInsertSelectContext(final SchemaMetaData schemaMetaData, final String sql,
+                                                                 final List<Object> parameters, final AtomicInteger parametersOffset) {
+        if (!getSqlStatement().getInsertSelect().isPresent()) {
+            return Optional.empty();
+        }
+        SubquerySegment insertSelectSegment = getSqlStatement().getInsertSelect().get();
+        String selectSql = sql.substring(insertSelectSegment.getStartIndex(), insertSelectSegment.getStopIndex() + 1);

Review comment:
       Since `createProjection` needs `SQL`, here  `SQL` is passed and handled. Maybe we consider parsing `SQL` into `sqlStatement`.  It is another point, we could consider it later.
   If `sqlStatement ` contained the text of `SQL`, we would not need to pass it through the parameter list.

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-rewrite/src/main/java/org/apache/shardingsphere/sharding/rewrite/context/ShardingSQLRewriteContextDecorator.java
##########
@@ -38,8 +42,15 @@
     @Override
     public void decorate(final ShardingRule shardingRule, final ConfigurationProperties props, final SQLRewriteContext sqlRewriteContext, final RouteContext routeContext) {
         for (ParameterRewriter each : new ShardingParameterRewriterBuilder(shardingRule, routeContext).getParameterRewriters(sqlRewriteContext.getSchemaMetaData())) {
-            if (!sqlRewriteContext.getParameters().isEmpty() && each.isNeedRewrite(sqlRewriteContext.getSqlStatementContext())) {
-                each.rewrite(sqlRewriteContext.getParameterBuilder(), sqlRewriteContext.getSqlStatementContext(), sqlRewriteContext.getParameters());
+            SQLStatementContext sqlStatementContext = sqlRewriteContext.getSqlStatementContext();
+            if (!sqlRewriteContext.getParameters().isEmpty() && each.isNeedRewrite(sqlStatementContext)) {
+                each.rewrite(sqlRewriteContext.getParameterBuilder(), sqlStatementContext, sqlRewriteContext.getParameters());
+            }
+            if (sqlStatementContext instanceof InsertStatementContext && null != ((InsertStatementContext) sqlStatementContext).getInsertSelectContext()

Review comment:
       The content of this condition is a bit long, maybe it is better to exact a function for it.

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-route/src/main/java/org/apache/shardingsphere/sharding/route/engine/validator/impl/ShardingInsertStatementValidator.java
##########
@@ -35,9 +38,15 @@
     @Override
     public void validate(final ShardingRule shardingRule, final InsertStatement sqlStatement, final List<Object> parameters) {
         Optional<OnDuplicateKeyColumnsSegment> onDuplicateKeyColumnsSegment = sqlStatement.getOnDuplicateKeyColumns();
-        if (onDuplicateKeyColumnsSegment.isPresent() && isUpdateShardingKey(shardingRule, onDuplicateKeyColumnsSegment.get(), sqlStatement.getTable().getTableName().getIdentifier().getValue())) {
+        String tableName = sqlStatement.getTable().getTableName().getIdentifier().getValue();
+        if (onDuplicateKeyColumnsSegment.isPresent() && isUpdateShardingKey(shardingRule, onDuplicateKeyColumnsSegment.get(), tableName)) {
             throw new ShardingSphereException("INSERT INTO .... ON DUPLICATE KEY UPDATE can not support update for sharding column.");
         }
+        Optional<SubquerySegment> insertSelectSegment = sqlStatement.getInsertSelect();
+        if (insertSelectSegment.isPresent() && isContainsKeyGenerateStrategy(shardingRule, tableName)
+                && !isContainsKeyGenerateColumn(shardingRule, sqlStatement.getColumns(), tableName)) {
+            throw new ShardingSphereException("INSERT INTO .... SELECT can not support key generate column.");

Review comment:
       Do you think it is better to add estimation of `the table inserted and the table selected are the same or bind tables.`?

##########
File path: shardingsphere-infra/shardingsphere-infra-rewrite/shardingsphere-infra-rewrite-engine/src/main/java/org/apache/shardingsphere/infra/rewrite/context/SQLRewriteContext.java
##########
@@ -80,5 +80,8 @@ public void addSQLTokenGenerators(final Collection<SQLTokenGenerator> sqlTokenGe
      */
     public void generateSQLTokens() {
         sqlTokens.addAll(sqlTokenGenerators.generateSQLTokens(sqlStatementContext, parameters, schemaMetaData));
+        if (sqlStatementContext instanceof InsertStatementContext && null != ((InsertStatementContext) sqlStatementContext).getInsertSelectContext()) {

Review comment:
       It is indeed an easy way to handle the `select statement` in the `insert statement`. However,  two points get my concerns,
   
   1. IMO, `SQLRewriteContext` does not need to know which `sqlstatment` it is and whether it is required to generate SQL tokens. So maybe I know `isGenerateSQLToken(sqlStatementContext)` of each `tokenGenerator` is a more proper place to give some similar estimations. However, this way is a bit hard to do the same thing.
   2. If we call `sqlTokenGenerators.generateSQLTokens()` twice,  that means it runs `for each` twice internally. It is not efficient from my point.
   
   I'd like to listen to your idea about it. :-)

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-route/src/main/java/org/apache/shardingsphere/sharding/route/engine/type/ShardingRouteEngineFactory.java
##########
@@ -94,6 +96,15 @@ public static ShardingRouteEngine newInstance(final ShardingRule shardingRule, f
         return getShardingRoutingEngine(shardingRule, sqlStatementContext, shardingConditions, tableNames, props);
     }
     
+    private static Collection<String> getTableNames(final SQLStatementContext sqlStatementContext) {
+        Collection<String> tableNames = new LinkedList<>(sqlStatementContext.getTablesContext().getTableNames());
+        if (sqlStatementContext instanceof InsertStatementContext && null != ((InsertStatementContext) sqlStatementContext).getInsertSelectContext()) {

Review comment:
       `GetTableNames()` is supposed to be a function of `InsertStatementContext`. How abou moving these process to `InsertStatementContext` ?

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-rewrite/src/main/java/org/apache/shardingsphere/sharding/rewrite/context/ShardingSQLRewriteContextDecorator.java
##########
@@ -38,8 +42,15 @@
     @Override
     public void decorate(final ShardingRule shardingRule, final ConfigurationProperties props, final SQLRewriteContext sqlRewriteContext, final RouteContext routeContext) {
         for (ParameterRewriter each : new ShardingParameterRewriterBuilder(shardingRule, routeContext).getParameterRewriters(sqlRewriteContext.getSchemaMetaData())) {
-            if (!sqlRewriteContext.getParameters().isEmpty() && each.isNeedRewrite(sqlRewriteContext.getSqlStatementContext())) {
-                each.rewrite(sqlRewriteContext.getParameterBuilder(), sqlRewriteContext.getSqlStatementContext(), sqlRewriteContext.getParameters());
+            SQLStatementContext sqlStatementContext = sqlRewriteContext.getSqlStatementContext();
+            if (!sqlRewriteContext.getParameters().isEmpty() && each.isNeedRewrite(sqlStatementContext)) {
+                each.rewrite(sqlRewriteContext.getParameterBuilder(), sqlStatementContext, sqlRewriteContext.getParameters());
+            }
+            if (sqlStatementContext instanceof InsertStatementContext && null != ((InsertStatementContext) sqlStatementContext).getInsertSelectContext()
+                    && !sqlRewriteContext.getParameters().isEmpty() && each.isNeedRewrite(((InsertStatementContext) sqlStatementContext).getInsertSelectContext().getSelectStatementContext())) {
+                StandardParameterBuilder parameterBuilder = sqlRewriteContext.getParameterBuilder() instanceof GroupedParameterBuilder

Review comment:
       It is impossible to have many `values(), ()` in `INSERT SELECT`, isn't it?
   If it is that case, there is no chance for `GroupedParameterBuilder` here, right?
   Do you think no estimation for'GroupedParameterBuilder` is reasonable?

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-rewrite/src/test/resources/sharding/insert.xml
##########
@@ -132,6 +132,158 @@
         <output sql="INSERT INTO t_account_1 VALUES (101, 2000, 'OK')  ON DUPLICATE KEY UPDATE status = ?" parameters="OK_UPDATE" />
     </rewrite-assertion>
 
+    <rewrite-assertion id="insert_select_with_all_columns_with_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_0 WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_all_columns_without_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_0 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_1 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_1 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_all_columns_with_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_0 WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_all_columns_without_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_0 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_1 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_1 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_without_columns_with_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_0 WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_without_columns_without_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_0 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_1 SELECT account_id, amount, status FROM t_account_1 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_without_columns_with_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_0 WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_without_columns_without_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_0 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_1 SELECT account_id, amount, status FROM t_account_1 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_with_all_columns_with_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_0 WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_with_all_columns_without_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_0 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_1 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_1 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_with_all_columns_with_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = VALUES(amount)" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_0 WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = VALUES(amount)" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_with_all_columns_without_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = VALUES(amount)" />
+        <output sql="INSERT INTO t_account_0 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_0 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = VALUES(amount)" />
+        <output sql="INSERT INTO t_account_1 (account_id, amount, status) SELECT account_id, amount, status FROM t_account_detail_1 WHERE amount = 1000 ON DUPLICATE KEY UPDATE amount = VALUES(amount)" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_without_columns_with_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account_detail WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_detail_0 WHERE account_id = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="100, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_without_columns_without_sharding_column_for_parameters" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account_detail WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_detail_0 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+        <output sql="INSERT INTO t_account_1 SELECT account_id, amount, status FROM t_account_detail_1 WHERE amount = ? ON DUPLICATE KEY UPDATE amount = ?" parameters="1000, 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_without_columns_with_sharding_column_for_literals" db-type="MySQL">
+        <input sql="INSERT INTO t_account SELECT account_id, amount, status FROM t_account_detail WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+        <output sql="INSERT INTO t_account_0 SELECT account_id, amount, status FROM t_account_detail_0 WHERE account_id = 100 ON DUPLICATE KEY UPDATE amount = 20" />
+    </rewrite-assertion>
+
+    <rewrite-assertion id="insert_select_with_binding_table_without_columns_without_sharding_column_for_literals" db-type="MySQL">

Review comment:
       Oh, I like these test cases!

##########
File path: shardingsphere-features/shardingsphere-sharding/shardingsphere-sharding-route/src/main/java/org/apache/shardingsphere/sharding/route/engine/validator/impl/ShardingInsertStatementValidator.java
##########
@@ -35,9 +38,15 @@
     @Override
     public void validate(final ShardingRule shardingRule, final InsertStatement sqlStatement, final List<Object> parameters) {
         Optional<OnDuplicateKeyColumnsSegment> onDuplicateKeyColumnsSegment = sqlStatement.getOnDuplicateKeyColumns();
-        if (onDuplicateKeyColumnsSegment.isPresent() && isUpdateShardingKey(shardingRule, onDuplicateKeyColumnsSegment.get(), sqlStatement.getTable().getTableName().getIdentifier().getValue())) {
+        String tableName = sqlStatement.getTable().getTableName().getIdentifier().getValue();
+        if (onDuplicateKeyColumnsSegment.isPresent() && isUpdateShardingKey(shardingRule, onDuplicateKeyColumnsSegment.get(), tableName)) {
             throw new ShardingSphereException("INSERT INTO .... ON DUPLICATE KEY UPDATE can not support update for sharding column.");
         }
+        Optional<SubquerySegment> insertSelectSegment = sqlStatement.getInsertSelect();
+        if (insertSelectSegment.isPresent() && isContainsKeyGenerateStrategy(shardingRule, tableName)
+                && !isContainsKeyGenerateColumn(shardingRule, sqlStatement.getColumns(), tableName)) {
+            throw new ShardingSphereException("INSERT INTO .... SELECT can not support key generate column.");

Review comment:
       This exception looks not clear enough, how about `INSERT INTO .... SELECT can not support applying keyGenerator to absent generateKeyColumn.` ?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org