You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@shardingsphere.apache.org by GitBox <gi...@apache.org> on 2021/06/14 09:07:53 UTC

[GitHub] [shardingsphere] lyflyy opened a new issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

lyflyy opened a new issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806


   ## I also had this issue in 【4.1.1】
   
   ## Bug Report
   
   **For English only**, other languages will not accept.
   
   Before report a bug, make sure you have:
   
   - Searched open and closed [GitHub issues](https://github.com/apache/shardingsphere/issues).
   - Read documentation: [ShardingSphere Doc](https://shardingsphere.apache.org/document/current/en/overview).
   
   Please pay attention on issues you submitted, because we maybe need more details. 
   If no response anymore and we cannot reproduce it on current information, we will **close it**.
   
   Please answer these questions before submitting your issue. Thanks!
   
   ### Which version of ShardingSphere did you use?
   4.1.1
   
   ### Which project did you use? ShardingSphere-JDBC or ShardingSphere-Proxy?
   ShardingSphere-JDBC
   
   ### Expected behavior
   I want to return 10 pieces of data in the SQL result
   
   ### Actual behavior
   The rewritten SQL queries 2147483647 pieces of data
   
   ### Reason analyze (If you can)
   
   ### Steps to reproduce the behavior, such as: SQL to execute, sharding rule configuration, when exception occur etc.
   ShardingSphere-SQL LOG detail
   ```
   2021-06-14 16:40:46.205 DEBUG 12448 --- [nio-8080-exec-1] c.s.d.mapper.OrdersMapper.officialDemo   : ==>  Preparing: SELECT user_id, SUM(price) `sum` FROM orders where create_time > ? and create_time < ? GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 10 
   2021-06-14 16:40:46.231 DEBUG 12448 --- [nio-8080-exec-1] c.s.d.mapper.OrdersMapper.officialDemo   : ==> Parameters: 2020-07-09 13:54:57.0(Timestamp), 2022-07-19 13:54:57.0(Timestamp)
   2021-06-14 16:40:47.088  INFO 12448 --- [nio-8080-exec-1] ShardingSphere-SQL                       : Logic SQL: SELECT user_id, SUM(price) `sum` FROM orders
            
               where create_time > ? and create_time < ?
            
           GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 10
   2021-06-14 16:40:47.088  INFO 12448 --- [nio-8080-exec-1] ShardingSphere-SQL                       : SQLStatement: SelectStatementContext(super=CommonSQLStatementContext(sqlStatement=org.apache.shardingsphere.sql.parser.sql.statement.dml.SelectStatement@109c4794, tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@66e38b62), tablesContext=org.apache.shardingsphere.sql.parser.binder.segment.table.TablesContext@66e38b62, projectionsContext=ProjectionsContext(startIndex=7, stopIndex=31, distinctRow=false, projections=[ColumnProjection(owner=null, name=user_id, alias=Optional.empty), AggregationProjection(type=SUM, innerExpression=(price), alias=Optional[sum], derivedAggregationProjections=[], index=-1)]), groupByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.groupby.GroupByContext@1bba7f5d, orderByContext=org.apache.shardingsphere.sql.parser.binder.segment.select.orderby.OrderByContext@3e02c417, paginationContext=org.apache.shardi
 ngsphere.sql.parser.binder.segment.select.pagination.PaginationContext@4c5084a4, containsSubquery=false)
   2021-06-14 16:40:47.088  INFO 12448 --- [nio-8080-exec-1] ShardingSphere-SQL                       : Actual SQL: dbs0 ::: SELECT user_id, SUM(price) `sum` FROM orders_6
            
               where create_time > ? and create_time < ?
            
           GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 2147483647 ::: [2020-07-09 13:54:57.0, 2022-07-19 13:54:57.0]
   2021-06-14 16:40:47.088  INFO 12448 --- [nio-8080-exec-1] ShardingSphere-SQL                       : Actual SQL: dbs1 ::: SELECT user_id, SUM(price) `sum` FROM orders_6
            
               where create_time > ? and create_time < ?
            
           GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 2147483647 ::: [2020-07-09 13:54:57.0, 2022-07-19 13:54:57.0]
   2021-06-14 16:40:47.088  INFO 12448 --- [nio-8080-exec-1] ShardingSphere-SQL                       : Actual SQL: dbs2 ::: SELECT user_id, SUM(price) `sum` FROM orders_6
            
               where create_time > ? and create_time < ?
            
           GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 2147483647 ::: [2020-07-09 13:54:57.0, 2022-07-19 13:54:57.0]
   2021-06-14 16:40:48.780 DEBUG 12448 --- [nio-8080-exec-1] c.s.d.mapper.OrdersMapper.officialDemo   : <==      Total: 10
   ```
   
   ### Example codes for reproduce this issue (such as a github link).
   1. I use read/write separation and my configuration file is as follows
   ```
   # 数据源 db0,db1
   logging.level.com.sharding.demo.mapper=debug
   
   spring.shardingsphere.props.sql.show=true
   spring.shardingsphere.props.max.connections.size.per.query=1
   
   mybatis-plus.mapper-locations=classpath:/mapper/*.xml
   
   spring.shardingsphere.datasource.names = db0,db1,db2,dbs0,dbs1,dbs2
   
   spring.shardingsphere.datasource.db0.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.db0.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.db0.jdbc-url = jdbc:mysql://127.0.0.1:3306/db0?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.db0.username = root
   spring.shardingsphere.datasource.db0.password = root
   
   spring.shardingsphere.datasource.db1.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.db1.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.db1.jdbc-url = jdbc:mysql://127.0.0.1:3306/db1?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.db1.username = root
   spring.shardingsphere.datasource.db1.password = root
   
   spring.shardingsphere.datasource.db2.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.db2.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.db2.jdbc-url = jdbc:mysql://127.0.0.1:3306/db2?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.db2.username = root
   spring.shardingsphere.datasource.db2.password = root
   
   spring.shardingsphere.datasource.dbs0.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.dbs0.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.dbs0.jdbc-url = jdbc:mysql://127.0.0.1:3307/db0?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.dbs0.username = root
   spring.shardingsphere.datasource.dbs0.password = root
   
   spring.shardingsphere.datasource.dbs1.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.dbs1.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.dbs1.jdbc-url = jdbc:mysql://127.0.0.1:3307/db1?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.dbs1.username = root
   spring.shardingsphere.datasource.dbs1.password = root
   
   spring.shardingsphere.datasource.dbs2.type = com.zaxxer.hikari.HikariDataSource
   spring.shardingsphere.datasource.dbs2.driver-class-name = com.mysql.cj.jdbc.Driver
   spring.shardingsphere.datasource.dbs2.jdbc-url = jdbc:mysql://127.0.0.1:3307/db2?characterEncoding=utf8&useSSL=false
   spring.shardingsphere.datasource.dbs2.username = root
   spring.shardingsphere.datasource.dbs2.password = root
   
   #master-slave   基于master1和master2主从集群实现读写分离
   spring.shardingsphere.sharding.master-slave-rules.db0.master-data-source-name=db0
   spring.shardingsphere.sharding.master-slave-rules.db0.slave-data-source-names=dbs0
   
   spring.shardingsphere.sharding.master-slave-rules.db1.master-data-source-name=db1
   spring.shardingsphere.sharding.master-slave-rules.db1.slave-data-source-names=dbs1
   
   spring.shardingsphere.sharding.master-slave-rules.db2.master-data-source-name=db2
   spring.shardingsphere.sharding.master-slave-rules.db2.slave-data-source-names=dbs2
   
   #多个从库的时候使用负载均衡
   spring.shardingsphere.masterslave.load-balance-algorithm-type=ROUND_ROBIN
   
   # user 表策略 用户按照性别进行分库, 2个库, id取模进行分表 3张表, db_male ,db_female user_0, user_1, user_2
   # 分库策略 根据id取模确定数据进哪个数据库
   spring.shardingsphere.sharding.tables.user.database-strategy.inline.sharding-column = sex
   spring.shardingsphere.sharding.tables.user.database-strategy.inline.algorithm-expression = db$->{sex % 2}
   # 分表策略
   spring.shardingsphere.sharding.tables.user.actual-data-nodes = db$->{0..1}.user_$->{0..2}
   # 分表字段member_id
   spring.shardingsphere.sharding.tables.user.table-strategy.inline.sharding-column = id
   # 分表策略 根据member_id取模,确定数据最终落在那个表中
   spring.shardingsphere.sharding.tables.user.table-strategy.inline.algorithm-expression = user_$->{id % 3}
   # 使用SNOWFLAKE算法生成主键
   spring.shardingsphere.sharding.tables.user.key-generator.column = id
   spring.shardingsphere.sharding.tables.user.key-generator.type = SNOWFLAKE
   
   
   # 商品按照类型取模分库,3个库,id取模进行分表 db_type1, db_type2, db_type3, goods_0, goods_1, goods_2
   # 分库策略 根据id取模确定数据进哪个数据库
   spring.shardingsphere.sharding.tables.goods.database-strategy.inline.sharding-column = type
   spring.shardingsphere.sharding.tables.goods.database-strategy.inline.algorithm-expression = db$->{type % 3}
   # 分表策略
   spring.shardingsphere.sharding.tables.goods.actual-data-nodes = db$->{0..2}.goods_$->{0..2}
   # 分表字段member_id
   spring.shardingsphere.sharding.tables.goods.table-strategy.inline.sharding-column = id
   # 分表策略 根据member_id取模,确定数据最终落在那个表中
   spring.shardingsphere.sharding.tables.goods.table-strategy.inline.algorithm-expression = goods_$->{id % 3}
   # 使用SNOWFLAKE算法生成主键
   spring.shardingsphere.sharding.tables.goods.key-generator.column = id
   spring.shardingsphere.sharding.tables.goods.key-generator.type = SNOWFLAKE
   
   
   # 订单按照下单月份进行分表,年份进行分库, db_2020,db_2021,db_2022
   #声明虚拟表
   spring.shardingsphere.sharding.tables.orders.actual-data-nodes=db$->{0..2}.orders_$->{0..11}
   #声明表内的主键
   spring.shardingsphere.sharding.tables.orders.key-generator.column=id
   #声明主键生成策略
   spring.shardingsphere.sharding.tables.orders.key-generator.type=SNOWFLAKE
   ##声明根据哪个字段进行分片
   spring.shardingsphere.sharding.tables.orders.database-strategy.standard.sharding-column=create_time
   ##自定义分片规则类
   ## 分库规则orders_auto_increment
   spring.shardingsphere.sharding.tables.orders.database-strategy.standard.precise-algorithm-class-name=com.sharding.demo.algorithm.OrderDatabaseShardingAlgorithm
   spring.shardingsphere.sharding.tables.orders.database-strategy.standard.range-algorithm-class-name=com.sharding.demo.algorithm.OrderDatabaseShardingRangeAlgorithm
   #声明根据哪个字段进行分片
   spring.shardingsphere.sharding.tables.orders.table-strategy.standard.sharding-column=create_time
   #自定义分片规则类
   spring.shardingsphere.sharding.tables.orders.table-strategy.standard.precise-algorithm-class-name=com.sharding.demo.algorithm.OrderTableShardingAlgorithm
   spring.shardingsphere.sharding.tables.orders.table-strategy.standard.range-algorithm-class-name=com.sharding.demo.algorithm.OrderTableShardingRangeAlgorithm
   
   # 配置公共表
   # 一个实体类对应两张表,覆盖
   spring.main.allow-bean-definition-overriding=true
   spring.shardingsphere.sharding.broadcast-tables=common_dict
   # 配置数据库中 t_dict 表主键 dict_id 生成策略 SNOWFLAKE 雪花算法
   spring.shardingsphere.sharding.tables.common_dict.key-generator.column=dict_id
   spring.shardingsphere.sharding.tables.common_dict.key-generator.type=SNOWFLAKE
   ```
   2. i use mybatis
   ```
   <select id="officialDemo" resultType="com.sharding.demo.vo.OffocialDemoVo">
       SELECT user_id, SUM(price) `sum` FROM orders
       <if test="orders.createTimeStart != null">
           where create_time > #{orders.createTimeStart} and create_time &lt; #{orders.createTimeEnd}
       </if>
       GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 10
   </select>
   ```
   
   _Originally posted by @lyflyy in https://github.com/apache/shardingsphere/issues/2062#issuecomment-860525708_


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] lyflyy commented on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
lyflyy commented on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861128332


   @strongduangmu Thank you very much for your answer
   
   But I think if SQL were rewritten to look like this
   ```
   Actual SQL: dbs0 ::: SELECT user_id, SUM(price) `sum` FROM orders_6
   where create_time > ? and create_time < ?
   GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 10 ::: [2020-07-09 13:54:57.0, 2022-07-19 13:54:57.0]
   ```
   , and then merge all the shard data, it would also get the right result


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] lyflyy commented on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
lyflyy commented on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861140883


   
   @strongduanmu Thank you very much for your answer
   
   I understand that the data grouped by user ID may be distributed in any of the three tables, and the data sorted by grouping the three tables separately may not be accurate
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] lyflyy closed issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
lyflyy closed issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] lyflyy edited a comment on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
lyflyy edited a comment on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861128332


   @strongduanmu  Thank you very much for your answer
   
   But I think if SQL were rewritten to look like this
   ```
   Actual SQL: dbs0 ::: SELECT user_id, SUM(price) `sum` FROM orders_6
   where create_time > ? and create_time < ?
   GROUP BY user_id ORDER BY `sum` DESC LIMIT 0, 10 ::: [2020-07-09 13:54:57.0, 2022-07-19 13:54:57.0]
   ```
   , and then merge all the shard data, it would also get the right result


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] strongduanmu commented on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
strongduanmu commented on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861123359


   @lyflyy Thank you for your feedback. Since the `orders` table is a sharding table with multiple data nodes, when calculating `SUM(price)`, we need to obtain data on different nodes, and then perform aggregation calculations. Therefore, the limit clause needs to be rewritten to meet the data needed for the calculation. 
   For more information, you can refer to the [doc](https://shardingsphere.apache.org/document/current/cn/features/sharding/principle/merge/) of the merge engine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] strongduanmu commented on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
strongduanmu commented on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861134021


   @lyflyy Through this SQL, we can get the TOP 10 sum data corresponding to the `orders_6` table. But can this ensure that we get the global TOP 10 sum data after merging?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [shardingsphere] strongduanmu edited a comment on issue #10806: ## Wrong LIMIT rewrite while GROUP BY and ORDER BY

Posted by GitBox <gi...@apache.org>.
strongduanmu edited a comment on issue #10806:
URL: https://github.com/apache/shardingsphere/issues/10806#issuecomment-861123359


   @lyflyy Thank you for your feedback. Since the `orders` table is a sharding table with multiple data nodes, when calculating `SUM(price)`, we need to obtain data on different nodes, and then perform aggregation calculations. Therefore, the limit clause needs to be rewritten to meet the data needed for the calculation. 
   
   For more information, you can refer to the [doc](https://shardingsphere.apache.org/document/current/cn/features/sharding/principle/merge/) of the merge engine.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org