Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/18 11:14:53 UTC

[GitHub] [incubator-doris] kangkaisen opened a new issue #2932: TupleIsNullPredicate lead BE core

kangkaisen opened a new issue #2932: TupleIsNullPredicate lead BE core
URL: https://github.com/apache/incubator-doris/issues/2932
 
 
   **Describe the bug**
   ```
   *** Aborted at 1581922536 (unix time) try "date -d @1581922536" if you are using GNU date ***
   PC: @           0xf522c9 doris::RowDescriptor::tuple_is_nullable()
   *** SIGSEGV (@0x0) received by PID 82036 (TID 0x7f1b83496700) from PID 0; stack trace: ***
       @     0x7f1c39a4d5d0 (unknown)
       @           0xf522c9 doris::RowDescriptor::tuple_is_nullable()
       @           0xd9e4fe doris::TupleIsNullPredicate::prepare()
       @           0xd50c22 doris::Expr::prepare()
       @           0xd50c22 doris::Expr::prepare()
       @           0xd721b4 doris::ScalarFnCall::prepare()
       @           0xd50c22 doris::Expr::prepare()
       @           0xd721b4 doris::ScalarFnCall::prepare()
       @           0xd5a37c doris::ExprContext::prepare()
       @           0xd5141d doris::Expr::prepare()
       @          0x158919c doris::DataStreamSender::prepare()
       @          0x1019316 doris::PlanFragmentExecutor::prepare()
       @           0xfa4ba4 doris::FragmentExecState::prepare()
       @           0xfa7f85 doris::FragmentMgr::exec_plan_fragment()
       @           0xfa8c72 doris::FragmentMgr::exec_plan_fragment()
       @          0x1058e86 doris::PInternalServiceImpl<>::_exec_plan_fragment()
       @          0x1058fa2 doris::PInternalServiceImpl<>::exec_plan_fragment()
       @          0x12eb22c doris::PBackendService::CallMethod()
   ```
   
   ```
   #3  doris::RowDescriptor::tuple_is_nullable (this=this@entry=0xa9ae77678, tuple_idx=-1)
       at /home/kangkaisen/palo/be/src/runtime/descriptors.cpp:371
   ```
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Create the tables and insert the data:
   
   ```
   CREATE TABLE `afo_app_gpu_resource` (
     `collect_day` date NOT NULL COMMENT "",
     `collect_min` char(20) NOT NULL COMMENT "",
     `app_id` varchar(50) NOT NULL COMMENT "",
     `group_name` varchar(100) NOT NULL COMMENT "",
     `queue_name` varchar(200) NOT NULL COMMENT "",
     `app_name` varchar(300) NULL COMMENT "",
     `app_type` varchar(50) NULL COMMENT "",
     `app_mode` varchar(20) NULL COMMENT "",
     `res_type` varchar(20) NULL ,
     `res_schedule` double SUM NULL COMMENT "",
     `res_used` double SUM NULL COMMENT "",
     `res_util` double SUM NULL COMMENT ""
   ) ENGINE=OLAP
   AGGREGATE KEY(`collect_day`, `collect_min`, `app_id`, `group_name`, `queue_name`, `app_name`, `app_type`, `app_mode`, `res_type`)
   COMMENT "OLAP"
   PARTITION BY RANGE(`collect_day`)
   (PARTITION p20200215 VALUES  LESS THAN  ('2020-02-15'),
   PARTITION p20200216 VALUES  LESS THAN  ('2020-02-16'),
   PARTITION p20200217 VALUES  LESS THAN  ('2020-02-17'))
   DISTRIBUTED BY HASH(`group_name`) BUCKETS 30
   PROPERTIES (
    "replication_num" = "1"
   ); 
   
   
   insert into afo_app_gpu_resource(`collect_day`, `collect_min`, `app_id`, `group_name`, `queue_name`, `app_name`, `app_type`, `app_mode`, `res_type`, res_schedule, res_used, res_util) values ('2020-02-16', '2020-02-16 00:00', 'application_1547876776012_46509900', 'root.serving.hadoop-peisong', 'root.serving.hadoop-peisong.p40prod', 'TFS','AFO-Serving', 'NULL', 'unknown', 2, 2,0);
   
   
   CREATE TABLE `yarn_groups_gpu_resource` (
     `dt` date NULL COMMENT "日期",
     `scenes` varchar(20) NULL COMMENT "",
     `yarn_cluster` varchar(30) NULL COMMENT "",
     `tenant` varchar(50) NULL COMMENT "",
     `group_id` varchar(10) NULL COMMENT "",
     `group_code` varchar(200) NULL COMMENT "",
     `res_type` varchar(20) NULL COMMENT "",
     `res_min` int(11) SUM NULL COMMENT ""
   ) ENGINE=OLAP
   AGGREGATE KEY(`dt`, `scenes`, `yarn_cluster`, `tenant`, `group_id`, `group_code`, `res_type`)
   PARTITION BY RANGE(`dt`)
   (PARTITION p20200215 VALUES  LESS THAN  ('2020-02-15'),
   PARTITION p20200216 VALUES  LESS THAN  ('2020-02-16'),
   PARTITION p20200217 VALUES  LESS THAN  ('2020-02-17'))
   DISTRIBUTED BY HASH(`tenant`) BUCKETS 10
   PROPERTIES (
    "replication_num" = "1"
   );
   
   insert into yarn_groups_gpu_resource(`dt`, `scenes`, `yarn_cluster`, `tenant`, `group_id`, `group_code`, `res_type`, res_min) values ('2020-02-16', 'Serving', 'gh_serving', 'hadoop-poistar', '3143', 'root.gh_serving.hadoop-poistar','gcores', 13);
   
   ```
   
   2. Run the following queries.
   
   **This query with date_format fails:**
   ```
          select date_format(dt, '%Y%m%d') as dt
            from afo_app_gpu_resource as app
            left join (
                  select  dt from yarn_groups_gpu_resource
                 ) g
           on app.collect_day = g.dt
           group by dt;
   ```
   
   **This query without date_format succeeds:**
   
   ```
          select dt
            from afo_app_gpu_resource as app
            left join (
                  select  dt from yarn_groups_gpu_resource
                 ) g
           on app.collect_day = g.dt
           group by dt;
   ```
   **Additional context**
   
   **The root cause is that the `tupleIds` in TupleIsNullPredicate are wrong.**
   
   **The tupleId in TupleIsNullPredicate is 1, but the tupleId for the AggregationNode is 3.**
   
   ```
               17: tuple_is_null_pred (struct) = TTupleIsNullPredicate {
                 01: tuple_ids (list) = list<i32>[1] {
                   [0] = 1,
                 },
   ```
   
   ```
   F0218 13:24:48.494895 31692 descriptors.cpp:370] Check failed: tuple_idx < _tuple_idx_nullable_map.size() (-1 vs. 1) RowDescriptor: tuple_desc_map: [Tuple(id=3 size=24 slots=[Slot(id=3 type=VARCHAR col=-1 offset=8 null=(offset=0 mask=80))] has_varlen_slots=1)] tuple_id_map: [-1, -1, -1, 0] tuple_is_nullable: [0]
   ```
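
The failed check above can be illustrated with a small sketch. This is a hypothetical Java model, not the actual C++ code in `descriptors.cpp`; the class and method names (`TupleIdLookupSketch`, `buildTupleIdMap`, `tupleIdx`) are assumptions made for illustration. It shows how a row that only contains tuple 3 yields the `tuple_id_map` `[-1, -1, -1, 0]` from the log, so looking up tuple id 1 returns -1.

```java
import java.util.Arrays;
import java.util.List;

class TupleIdLookupSketch {
    // Marker for "this tuple id is not part of the row" (assumed name).
    static final int INVALID_IDX = -1;

    // Build a map from tuple id to the tuple's position in the row.
    static int[] buildTupleIdMap(List<Integer> rowTupleIds) {
        int maxId = -1;
        for (int id : rowTupleIds) {
            maxId = Math.max(maxId, id);
        }
        int[] map = new int[maxId + 1];
        Arrays.fill(map, INVALID_IDX);
        for (int i = 0; i < rowTupleIds.size(); i++) {
            map[rowTupleIds.get(i)] = i;
        }
        return map;
    }

    // Look up the row position of a tuple id; ids outside the map are invalid too.
    static int tupleIdx(int[] tupleIdMap, int tupleId) {
        if (tupleId < 0 || tupleId >= tupleIdMap.length) {
            return INVALID_IDX;
        }
        return tupleIdMap[tupleId];
    }

    public static void main(String[] args) {
        // The row descriptor in the log contains only Tuple(id=3).
        int[] map = buildTupleIdMap(Arrays.asList(3));
        System.out.println(Arrays.toString(map)); // [-1, -1, -1, 0]
        System.out.println(tupleIdx(map, 3));     // 0: the AggregationNode's tuple
        System.out.println(tupleIdx(map, 1));     // -1: the id carried by TupleIsNullPredicate
    }
}
```

Using that -1 as an index into the nullable map is what the `Check failed: tuple_idx < _tuple_idx_nullable_map.size() (-1 vs. 1)` message reports.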
   
   The following are all the TupleDescriptors:
   ```
   TupleDescriptor{id=0, tbl=afo_app_gpu_resource, byte_size=0, is_materialized=true, slots=[SlotDescriptor{id=2, parent=0, col=collect_day, type=DATE, materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}
   
   TupleDescriptor{id=1, tbl=yarn_groups_gpu_resource, byte_size=0, is_materialized=true, slots=[SlotDescriptor{id=0, parent=1, col=dt, type=DATE, materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}
   
   TupleDescriptor{id=2, tbl=g, byte_size=0, is_materialized=false, slots=[SlotDescriptor{id=1, parent=2, col=dt, type=DATE, materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}
   
   TupleDescriptor{id=3, tbl=null, byte_size=0, is_materialized=true, slots=[SlotDescriptor{id=3, parent=3, col=null, type=VARCHAR(*), materialized=false, byteSize=0, byteOffset=-1, nullIndicatorByte=0, nullIndicatorBit=0, slotIdx=0}]}
   ```
   
   A partial fix for this issue is to improve `TupleIsNullPredicate::requiresNullWrapping`:
   
   ```
       private static boolean requiresNullWrapping(Expr expr, Analyzer analyzer) {
           if (expr.isConstant()) {
               return false;
           }
   
           if (expr instanceof SlotRef) {
               SlotRef slotRef = (SlotRef) expr;
               Column column = slotRef.getDesc().getColumn();
               if (column == null) {
                   return true;
               } else {
                   return !column.isAllowNull();
               }
           }
   
           if (expr instanceof ArithmeticExpr || expr instanceof FunctionCallExpr
                   || expr instanceof TimestampArithmeticExpr) {
               String functionName = expr.getFn().getFunctionName().getFunction();
               return nonNullResultWithNullParamFunctions.contains(functionName);
           }
   
           return true;
       }
   ```
   
   However, this does not fix the issue completely. We should make sure the tupleIds in TupleIsNullPredicate are correct.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] kangkaisen commented on issue #2932: TupleIsNullPredicate lead BE core

Posted by GitBox <gi...@apache.org>.
kangkaisen commented on issue #2932: TupleIsNullPredicate lead BE core
URL: https://github.com/apache/incubator-doris/issues/2932#issuecomment-588043385
 
 
   **So we could fix this issue by making the `if` fn support the date type or by making the `date_format` fn support the date type.**



[GitHub] [incubator-doris] kangkaisen commented on issue #2932: TupleIsNullPredicate lead BE core

Posted by GitBox <gi...@apache.org>.
kangkaisen commented on issue #2932: TupleIsNullPredicate lead BE core
URL: https://github.com/apache/incubator-doris/issues/2932#issuecomment-588043016
 
 
   The logic chain is as follows:
   1. `date_format(if(, NULL, `dt`), '%Y%m%d')` is used as the HASH_PARTITIONED expr, which is not right; we should use the Agg intermediate materialized slot.
   2. We don't use the Agg intermediate materialized slot as the HASH_PARTITIONED expr because, in the following code,
   ```
               // the parent fragment is partitioned on the grouping exprs;
               // substitute grouping exprs to reference the *output* of the agg, not the input
               partitionExprs = Expr.substituteList(partitionExprs,
                       node.getAggInfo().getIntermediateSmap(), ctx_.getRootAnalyzer(), false);
               parentPartition = DataPartition.hashPartitioned(partitionExprs);
   ```
   the partitionExprs substitution fails.
   3. The partitionExprs substitution fails because partitionExprs has a `casttodate` child, but the agg info's getIntermediateSmap has a `casttodatetime` child.
   4. The cast-to-date or cast-to-datetime child exists because `TupleIsNullPredicate` inserts an `if` Expr. We don't have an `if date` fn, so Doris uses the `if int` Expr.
   5. The `date` in `casttodate` depends on the date type of slot dt. The `datetime` in `casttodatetime` depends on the datetime arg type of the `date_format` function.
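
Steps 2 and 3 can be sketched as a substitution map keyed by structurally compared expression trees. This is a simplified, hypothetical model and not the Doris `Expr.substituteList` implementation; `SubstituteSketch`, `Node`, the node names such as `cast_to_date`, and the placeholder `agg_slot` are all assumptions made for illustration. It shows that a key containing a cast to datetime does not match an expr that differs only in having a cast to date, so the expr comes back unchanged.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

class SubstituteSketch {
    // Hypothetical minimal expression node: a name plus children, compared structurally.
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, Node... children) {
            this.name = name;
            this.children = Arrays.asList(children);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Node)) {
                return false;
            }
            Node other = (Node) o;
            return name.equals(other.name) && children.equals(other.children);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, children);
        }
    }

    // Replace expr with its mapped value when the whole tree equals a key;
    // otherwise rebuild the node from its substituted children.
    static Node substitute(Node expr, Map<Node, Node> smap) {
        Node hit = smap.get(expr);
        if (hit != null) {
            return hit;
        }
        Node[] kids = new Node[expr.children.size()];
        for (int i = 0; i < kids.length; i++) {
            kids[i] = substitute(expr.children.get(i), smap);
        }
        return new Node(expr.name, kids);
    }

    public static void main(String[] args) {
        Node ifExpr = new Node("if", new Node("tuple_is_null"), new Node("NULL"), new Node("dt"));
        Node fmt = new Node("'%Y%m%d'");
        // Key in the (assumed) intermediate smap: the cast targets datetime.
        Node grouping = new Node("date_format", new Node("cast_to_datetime", ifExpr), fmt);
        // Partition expr: same shape, but the cast targets date.
        Node partition = new Node("date_format", new Node("cast_to_date", ifExpr), fmt);
        Node aggSlot = new Node("agg_slot");

        Map<Node, Node> smap = new HashMap<>();
        smap.put(grouping, aggSlot);

        System.out.println(substitute(grouping, smap).equals(aggSlot));     // true
        System.out.println(substitute(partition, smap).equals(partition)); // true: nothing was replaced
    }
}
```

In this model the unsubstituted partition expr still refers to the input of the aggregation, which matches the stale tupleId described in the issue.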
   



[GitHub] [incubator-doris] kangkaisen closed issue #2932: TupleIsNullPredicate lead BE core

Posted by GitBox <gi...@apache.org>.
kangkaisen closed issue #2932: TupleIsNullPredicate lead BE core
URL: https://github.com/apache/incubator-doris/issues/2932
 
 
   
