You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/06/25 08:56:21 UTC

[GitHub] [incubator-doris] morningman opened a new pull request #3947: [Enhance] Add prepare phase for some timestamp functions

morningman opened a new pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947


   Fix: #3946 
   
   CL:
   1. Add prepare phase for `from_unixtime()` and `date_format()` functions, to handle the format string once for all.
   2. Find the cctz timezone when init `runtime state`, so that don't need to find timezone for each rows.
   3. Add constant rewrite rule for `utc_timestamp()`
   4. Add doc for `to_date()`
   
   The performance shows bellow:
   
   11,000,000 rows
   
   SQL1: `select count(from_unixtime(k1)) from tbl1;`
   Before: 8.85s
   After: 2.85s
   
   SQL2: `select count(from_unixtime(k1, '%Y-%m-%d %H:%i:%s')) from tbl1 limit 1;`
   Before: 10.73s
   After: 4.85s
   
   The date string format seems still slow, we may need a further enhancement about it.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] wutiangan commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
wutiangan commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446109605



##########
File path: be/src/runtime/datetime_value.cpp
##########
@@ -1535,7 +1521,10 @@ bool DateTimeValue::unix_timestamp(int64_t* timestamp, const std::string& timezo
     if (!find_cctz_time_zone(timezone, ctz)) {

Review comment:
       can ctz put it into cache like 'data foramt' function?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] wutiangan commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
wutiangan commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446108091



##########
File path: be/src/exprs/timestamp_functions.cpp
##########
@@ -599,7 +673,7 @@ IntVal TimestampFunctions::to_unix(
 
 DateTimeVal TimestampFunctions::utc_timestamp(FunctionContext* context) {
     DateTimeValue dtv;
-    if (!dtv.from_unixtime(context->impl()->state()->timestamp_ms() / 1000, "+00:00")) {
+    if (!dtv.from_unixtime(context->impl()->state()->timestamp_ms() / 1000, "+0:00")) {

Review comment:
       why "+00:00" modify to "+0:00"




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446279706



##########
File path: be/src/runtime/datetime_value.cpp
##########
@@ -1535,7 +1521,10 @@ bool DateTimeValue::unix_timestamp(int64_t* timestamp, const std::string& timezo
     if (!find_cctz_time_zone(timezone, ctz)) {

Review comment:
       It needs a thread safe cache.
   Since there is not much calls of `find_cctz_time_zone()`, I think we can improve it later.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446195099



##########
File path: be/src/exprs/timestamp_functions.cpp
##########
@@ -599,7 +673,7 @@ IntVal TimestampFunctions::to_unix(
 
 DateTimeVal TimestampFunctions::utc_timestamp(FunctionContext* context) {
     DateTimeValue dtv;
-    if (!dtv.from_unixtime(context->impl()->state()->timestamp_ms() / 1000, "+00:00")) {
+    if (!dtv.from_unixtime(context->impl()->state()->timestamp_ms() / 1000, "+0:00")) {

Review comment:
       My bad, I will change it back.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] wutiangan commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
wutiangan commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446108794



##########
File path: be/src/exprs/timestamp_functions.cpp
##########
@@ -451,20 +451,79 @@ BigIntVal TimestampFunctions::timestamp_diff(FunctionContext* ctx, const DateTim
     }
 }
 
+void TimestampFunctions::format_prepare(
+        doris_udf::FunctionContext* context,
+        doris_udf::FunctionContext::FunctionStateScope scope) {
+
+    if (scope != FunctionContext::FRAGMENT_LOCAL
+            || context->get_num_args() < 2
+            || context->get_arg_type(1)->type != doris_udf::FunctionContext::Type::TYPE_VARCHAR
+            || !context->is_arg_constant(1)) {
+        VLOG(10) << "format_prepare returned";
+        return;
+    }
+
+    FormatCtx* fc = new FormatCtx();
+    context->set_function_state(scope, fc);
+
+    StringVal* format = reinterpret_cast<StringVal*>(context->get_constant_arg(1));
+    if (UNLIKELY(format->is_null)) {
+        fc->is_valid = false;
+        return;
+    }
+
+    fc->fmt = convert_format(context, *format);
+    int format_len = DateTimeValue::compute_format_len((const char*) fc->fmt.ptr, fc->fmt.len);
+    if (UNLIKELY(format_len >= 128)) {
+        fc->is_valid = false;
+        return;
+    }
+
+    fc->is_valid = true;
+    return;
+}
+
+void TimestampFunctions::format_close(
+        doris_udf::FunctionContext* context,
+        doris_udf::FunctionContext::FunctionStateScope scope) {
+    if (scope != FunctionContext::FRAGMENT_LOCAL) {
+        return;
+    }
+
+    FormatCtx* fc = reinterpret_cast<FormatCtx*>(context->get_function_state(FunctionContext::FRAGMENT_LOCAL));
+    if (fc != nullptr) {
+        delete fc;
+    }
+}
+
 StringVal TimestampFunctions::date_format(
         FunctionContext* ctx, const DateTimeVal& ts_val, const StringVal& format) {
     if (ts_val.is_null || format.is_null) {
         return StringVal::null();
     }
+
     DateTimeValue ts_value = DateTimeValue::from_datetime_val(ts_val);
-    if (ts_value.compute_format_len((const char*)format.ptr, format.len) >= 128) {
-        return StringVal::null();
+    FormatCtx* fc = reinterpret_cast<FormatCtx*>(ctx->get_function_state(FunctionContext::FRAGMENT_LOCAL));
+    if (UNLIKELY(fc == nullptr)) {
+        // prepare phase failed, calculate at runtime
+        StringVal new_fmt = convert_format(ctx, format);
+        if (DateTimeValue::compute_format_len((const char*) new_fmt.ptr, new_fmt.len) >= 128) {

Review comment:
       in old code, first invoke function "compute_format_len" , then invoke function "convert_format", but your call sequence is the opposite , is it matter?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a change in pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
morningman commented on a change in pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947#discussion_r446194550



##########
File path: be/src/exprs/timestamp_functions.cpp
##########
@@ -451,20 +451,79 @@ BigIntVal TimestampFunctions::timestamp_diff(FunctionContext* ctx, const DateTim
     }
 }
 
+void TimestampFunctions::format_prepare(
+        doris_udf::FunctionContext* context,
+        doris_udf::FunctionContext::FunctionStateScope scope) {
+
+    if (scope != FunctionContext::FRAGMENT_LOCAL
+            || context->get_num_args() < 2
+            || context->get_arg_type(1)->type != doris_udf::FunctionContext::Type::TYPE_VARCHAR
+            || !context->is_arg_constant(1)) {
+        VLOG(10) << "format_prepare returned";
+        return;
+    }
+
+    FormatCtx* fc = new FormatCtx();
+    context->set_function_state(scope, fc);
+
+    StringVal* format = reinterpret_cast<StringVal*>(context->get_constant_arg(1));
+    if (UNLIKELY(format->is_null)) {
+        fc->is_valid = false;
+        return;
+    }
+
+    fc->fmt = convert_format(context, *format);
+    int format_len = DateTimeValue::compute_format_len((const char*) fc->fmt.ptr, fc->fmt.len);
+    if (UNLIKELY(format_len >= 128)) {
+        fc->is_valid = false;
+        return;
+    }
+
+    fc->is_valid = true;
+    return;
+}
+
+void TimestampFunctions::format_close(
+        doris_udf::FunctionContext* context,
+        doris_udf::FunctionContext::FunctionStateScope scope) {
+    if (scope != FunctionContext::FRAGMENT_LOCAL) {
+        return;
+    }
+
+    FormatCtx* fc = reinterpret_cast<FormatCtx*>(context->get_function_state(FunctionContext::FRAGMENT_LOCAL));
+    if (fc != nullptr) {
+        delete fc;
+    }
+}
+
 StringVal TimestampFunctions::date_format(
         FunctionContext* ctx, const DateTimeVal& ts_val, const StringVal& format) {
     if (ts_val.is_null || format.is_null) {
         return StringVal::null();
     }
+
     DateTimeValue ts_value = DateTimeValue::from_datetime_val(ts_val);
-    if (ts_value.compute_format_len((const char*)format.ptr, format.len) >= 128) {
-        return StringVal::null();
+    FormatCtx* fc = reinterpret_cast<FormatCtx*>(ctx->get_function_state(FunctionContext::FRAGMENT_LOCAL));
+    if (UNLIKELY(fc == nullptr)) {
+        // prepare phase failed, calculate at runtime
+        StringVal new_fmt = convert_format(ctx, format);
+        if (DateTimeValue::compute_format_len((const char*) new_fmt.ptr, new_fmt.len) >= 128) {

Review comment:
       Doesn't matter, the old code's order is incorrect, but it doesn't matter, because `compute_format_len` will return < 128 in most cases.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman merged pull request #3947: [Enhance] Add prepare phase for some timestamp functions

Posted by GitBox <gi...@apache.org>.
morningman merged pull request #3947:
URL: https://github.com/apache/incubator-doris/pull/3947


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org