You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/12/16 06:43:47 UTC

[GitHub] [doris] nextdreamblue opened a new issue, #15129: [Bug] when enable_segcompaction = true, get wrong result after a big load

nextdreamblue opened a new issue, #15129:
URL: https://github.com/apache/doris/issues/15129

   ### Search before asking
   
   - [X] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.
   
   
   ### Version
   
   master
   
   ### What's Wrong?
   
   当是普通的unique 表(非merge-on-write),以及agg表的时候,一个大的数据导入,会产生若干个segment,会触发segment compaction,而segment compaction并没有对普通的unique表和agg表中,想同rowset下不同segment下有相同key(主键)的值进行合理的去重或者聚合,导致合并后的segment有相同主键,然后执行查询,就会查询到相同的主键行
   
   enable_segcompaction = false时,则没复现问题
   
   ### What You Expected?
   
   查询返回正确的结果
   
   ### How to Reproduce?
   
   使用这个测试数据
   https://doris-agg-table-bug-1301087413.cos.ap-beijing.myqcloud.com/agg_table_bug/segcompaction_lineitem
   数据中有若干行相同主键的数据
   
   创建表
   ```
   CREATE TABLE `lineitem_unique` (
     `l_orderkey` bigint(20) NULL,
     `l_partkey` int(11) NULL,
     `l_suppkey` int(11) NULL,
     `l_linenumber` int(11) NULL,
     `l_quantity` decimal(15, 2) NULL,
     `l_extendedprice` decimal(15, 2) NULL,
     `l_discount` decimal(15, 2) NULL,
     `l_tax` decimal(15, 2) NULL,
     `l_returnflag` char(1) NULL,
     `l_linestatus` char(1) NULL,
     `l_shipdate` date NULL,
     `l_commitdate` date NULL,
     `l_receiptdate` date NULL,
     `l_shipinstruct` char(25) NULL,
     `l_shipmode` char(10) NULL,
     `l_comment` varchar(44) NULL
   ) ENGINE=OLAP
   UNIQUE KEY(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`) BUCKETS 1
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   
   
   CREATE TABLE `lineitem_agg` (
     `l_orderkey` bigint(20) NULL,
     `l_partkey` int(11) NULL,
     `l_suppkey` int(11) NULL,
     `l_linenumber` int(11) NULL,
     `l_quantity` decimal(15, 2) REPLACE NULL,
     `l_extendedprice` decimal(15, 2) REPLACE NULL,
     `l_discount` decimal(15, 2) REPLACE NULL,
     `l_tax` decimal(15, 2) REPLACE NULL,
     `l_returnflag` char(1) REPLACE NULL,
     `l_linestatus` char(1) REPLACE NULL,
     `l_shipdate` date REPLACE NULL,
     `l_commitdate` date REPLACE NULL,
     `l_receiptdate` date REPLACE NULL,
     `l_shipinstruct` char(25) REPLACE NULL,
     `l_shipmode` char(10) REPLACE NULL,
     `l_comment` varchar(44) REPLACE NULL
   ) ENGINE=OLAP
   AGGREGATE KEY(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`)
   COMMENT 'OLAP'
   DISTRIBUTED BY HASH(`l_orderkey`, `l_partkey`, `l_suppkey`, `l_linenumber`) BUCKETS 1
   PROPERTIES (
   "replication_allocation" = "tag.location.default: 1",
   "in_memory" = "false",
   "storage_format" = "V2",
   "disable_auto_compaction" = "false"
   );
   ```
   
   使用stream_load导入上边的数据文件
   
   查询
   ```
   MySQL [test]> select l_orderkey,l_partkey,l_suppkey,l_linenumber,l_comment from lineitem_unique where l_orderkey=5970000001 and l_partkey=75814607 and l_suppkey=5814608 and l_linenumber=1;
   +------------+-----------+-----------+--------------+-----------+
   | l_orderkey | l_partkey | l_suppkey | l_linenumber | l_comment |
   +------------+-----------+-----------+--------------+-----------+
   | 5970000001 |  75814607 |   5814608 |            1 |  00_1124  |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_184   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_245   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_307   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_368   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_430   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_491   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_552   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_61    |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_614   |
   +------------+-----------+-----------+--------------+-----------+
   10 rows in set (0.01 sec)
   ```
   
   获得了10行主键完全相同的key
   
   查询另外一个表
   ```
   MySQL [test]> select l_orderkey,l_partkey,l_suppkey,l_linenumber,l_comment from lineitem_agg where l_orderkey=5970000001 and l_partkey=75814607 and l_suppkey=5814608 and l_linenumber=1;
   +------------+-----------+-----------+--------------+-----------+
   | l_orderkey | l_partkey | l_suppkey | l_linenumber | l_comment |
   +------------+-----------+-----------+--------------+-----------+
   | 5970000001 |  75814607 |   5814608 |            1 |  00_1124  |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_191   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_255   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_319   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_383   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_447   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_511   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_575   |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_63    |
   | 5970000001 |  75814607 |   5814608 |            1 |  00_638   |
   +------------+-----------+-----------+--------------+-----------+
   10 rows in set (0.05 sec)
   ```
   
   
   ### Anything Else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [I] [Bug] when enable_segcompaction = true, get wrong result after a big load [doris]

Posted by "nextdreamblue (via GitHub)" <gi...@apache.org>.
nextdreamblue closed issue #15129: [Bug] when enable_segcompaction = true, get wrong result after a big load
URL: https://github.com/apache/doris/issues/15129


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] freemandealer commented on issue #15129: [Bug] when enable_segcompaction = true, get wrong result after a big load

Posted by GitBox <gi...@apache.org>.
freemandealer commented on issue #15129:
URL: https://github.com/apache/doris/issues/15129#issuecomment-1359283772

   Thanks for your testing segcompaction on agg & unique tables loading and clarifying the problem. I will fix it ASAP.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org