Posted to commits@doris.apache.org by "zhangstar333 (via GitHub)" <gi...@apache.org> on 2024/04/19 10:54:37 UTC

[PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

zhangstar333 opened a new pull request, #33904:
URL: https://github.com/apache/doris/pull/33904

   Previously, each get_value call inserted a single row into the column.
   Now many results can be inserted into the column in one call, which reduces the number of virtual function calls.
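
The effect described above can be illustrated with a minimal sketch. It does not use Doris's real column classes: `ToyColumn`, `ToyInt64Column`, `fill_row_by_row` and `fill_batched` are hypothetical names invented for this example, and the `virtual_calls` counter exists only to make the difference visible.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a column interface; this is NOT Doris's IColumn.
struct ToyColumn {
    virtual ~ToyColumn() = default;
    virtual void insert_one(int64_t value) = 0;
    virtual void insert_many(const int64_t* values, size_t count) = 0;
};

struct ToyInt64Column final : ToyColumn {
    std::vector<int64_t> data;
    size_t virtual_calls = 0; // counts calls through the interface, for illustration only

    void insert_one(int64_t value) override {
        ++virtual_calls;
        data.push_back(value);
    }
    void insert_many(const int64_t* values, size_t count) override {
        ++virtual_calls;
        data.insert(data.end(), values, values + count);
    }
};

// Row-at-a-time shape: one virtual call per output row.
inline void fill_row_by_row(ToyColumn& column, const std::vector<int64_t>& parsed) {
    for (int64_t v : parsed) {
        column.insert_one(v);
    }
}

// Batched shape: one virtual call for up to max_step rows starting at cur_offset.
// Returns the number of rows actually inserted.
inline size_t fill_batched(ToyColumn& column, const std::vector<int64_t>& parsed,
                           size_t cur_offset, size_t max_step) {
    if (cur_offset >= parsed.size()) {
        return 0;
    }
    const size_t count = std::min(max_step, parsed.size() - cur_offset);
    column.insert_many(parsed.data() + cur_offset, count);
    return count;
}
```

For five parsed values, the row-at-a-time helper goes through the interface five times, while the batched helper does so once per batch.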
   
   ## Proposed changes
   
   Issue Number: close #xxx
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1574032704


##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -118,66 +121,115 @@
                     wbytes = snprintf(tmp_buf, sizeof(tmp_buf), "%f", v.GetDouble());
                 }
                 _backup_string.emplace_back(tmp_buf, wbytes);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 // do not set _data_string here.
                 // Because the address of the string stored in `_backup_string` may
                 // change each time `emplace_back()` is called.
                 break;
+            }
             case rapidjson::Type::kFalseType:
                 _backup_string.emplace_back(true_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kTrueType:
                 _backup_string.emplace_back(false_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kNullType:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             default:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str.data(), str.length());
         }
         break;
     }
     case ExplodeJsonArrayType::JSON: {
-        _data_string.clear();
+        _data_string_ref.clear();
         _backup_string.clear();
-        _string_nulls.clear();
+        _values_null_flag.clear();
         for (auto& v : document.GetArray()) {
             if (v.IsObject()) {
                 rapidjson::StringBuffer buffer;
                 rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
                 v.Accept(writer);
                 _backup_string.emplace_back(buffer.GetString(), buffer.GetSize());
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
             } else {
-                _data_string.push_back({});
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back();
+                _values_null_flag.emplace_back(true);
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str);
         }
         break;
     }
     default:
-        CHECK(false) << type;
+        CHECK(false) << _data_type;
         break;
     }
     return size;
 }
 
+Status ParsedData::insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,

Review Comment:
   warning: method 'insert_result_from_parsed_data' can be made const [readability-make-member-function-const]
   
   be/src/vec/exprs/table_function/vexplode_json_array.cpp:185:
   ```diff
   -                                                   int64_t cur_offset) {
   +                                                   int64_t cur_offset) const {
   ```
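
As a rough illustration of what this clang-tidy check asks for: a member function that only reads the object's state and writes its result through a parameter can be declared `const`. The type `ToyParsedData` and the function `copy_range` below are hypothetical and only mirror the shape of the flagged signature. Whether the real method qualifies depends on its body, which is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical type used only to illustrate a const member function.
struct ToyParsedData {
    std::vector<int64_t> values;

    // Reads `values` but never modifies this object, so it can be const.
    // The destination is an output parameter; writing to it does not
    // conflict with the const qualifier on the member function.
    void copy_range(std::vector<int64_t>& out, size_t cur_offset, size_t max_step) const {
        for (size_t i = cur_offset; i < values.size() && i - cur_offset < max_step; ++i) {
            out.push_back(values[i]);
        }
    }
};
```

A `const` qualifier also lets the function be called on a `const ToyParsedData&`, which is the main practical benefit.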
   



##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -40,25 +44,23 @@ std::string ParsedData::false_value = "false";
 auto max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807
 auto min_value = std::numeric_limits<int64_t>::min(); //-9223372036854775808
 
-int ParsedData::set_output(ExplodeJsonArrayType type, rapidjson::Document& document) {
+int ParsedData::set_output(rapidjson::Document& document) {

Review Comment:
   warning: function 'set_output' exceeds recommended size/complexity thresholds [readability-function-size]
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   <details>
   <summary>Additional context</summary>
   
   **be/src/vec/exprs/table_function/vexplode_json_array.cpp:46:** 136 lines including whitespace and comments (threshold 80)
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   
   </details>
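
The comments in the `set_output` hunk quoted earlier in this message explain why string references are built only after all strings have been stored: growing a `std::vector<std::string>` may reallocate and move its elements. A minimal sketch of that two-pass pattern follows, using `std::string_view` in place of Doris's `StringRef`. `ToyStringStore` and its members are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical illustration of the "store first, reference later" pattern.
struct ToyStringStore {
    std::vector<std::string> backup;    // owns the character data
    std::vector<std::string_view> refs; // non-owning views into `backup`

    void set(const std::vector<std::string>& inputs) {
        backup.clear();
        refs.clear();
        // Pass 1: store every string. emplace_back may reallocate `backup`,
        // which moves the std::string objects (and, for short strings kept
        // inline, their bytes), so no views are taken during this loop.
        for (const auto& s : inputs) {
            backup.emplace_back(s);
        }
        // Pass 2: `backup` no longer grows, so element addresses are stable.
        refs.reserve(backup.size());
        for (const auto& s : backup) {
            refs.emplace_back(s.data(), s.length());
        }
    }
};
```

Taking the views inside the first loop could leave earlier views dangling after a reallocation, which is the hazard the source comments warn about.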
   





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2066327049

   Thank you for your contribution to Apache Doris.
   Don't know what should be done next? See [How to process your PR](https://cwiki.apache.org/confluence/display/DORIS/How+to+process+your+PR)
   
   Since 2024-03-18, the Document has been moved to [doris-website](https://github.com/apache/doris-website).
   See [Doris Document](https://cwiki.apache.org/confluence/display/DORIS/Doris+Document).




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2067534907

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2071542012

   TeamCity be ut coverage result:
    Function Coverage: 35.22% (8916/25317) 
    Line Coverage: 26.99% (73324/271711)
    Region Coverage: 26.17% (37877/144750)
    Branch Coverage: 22.99% (19288/83908)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/42bab03ac0fc6a76f44f48c254fb2f29918f5764_42bab03ac0fc6a76f44f48c254fb2f29918f5764/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072581866

   TeamCity be ut coverage result:
    Function Coverage: 35.22% (8916/25315) 
    Line Coverage: 26.98% (73327/271799)
    Region Coverage: 26.14% (37864/144824)
    Branch Coverage: 22.97% (19282/83944)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/6c0bac360caf427997725e492a4711e19938fa73_6c0bac360caf427997725e492a4711e19938fa73/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2069498829

   
   <details>
   <summary>ClickBench: <b>Total hot run time: 29.88 s</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 42bab03ac0fc6a76f44f48c254fb2f29918f5764, data reload: false
   
   query1	0.04	0.03	0.03
   query2	0.09	0.04	0.04
   query3	0.22	0.05	0.06
   query4	1.67	0.07	0.07
   query5	0.50	0.49	0.50
   query6	1.47	0.72	0.71
   query7	0.02	0.02	0.01
   query8	0.06	0.04	0.05
   query9	0.56	0.50	0.49
   query10	0.54	0.54	0.54
   query11	0.15	0.12	0.11
   query12	0.14	0.11	0.11
   query13	0.61	0.58	0.59
   query14	0.76	0.76	0.76
   query15	0.85	0.80	0.80
   query16	0.35	0.36	0.35
   query17	0.94	1.03	1.02
   query18	0.20	0.22	0.27
   query19	1.74	1.74	1.68
   query20	0.01	0.01	0.02
   query21	15.41	0.65	0.65
   query22	3.98	8.24	1.47
   query23	18.30	1.38	1.24
   query24	1.94	0.22	0.22
   query25	0.13	0.09	0.07
   query26	0.27	0.16	0.17
   query27	0.08	0.09	0.08
   query28	13.26	1.01	0.97
   query29	12.58	3.29	3.26
   query30	0.25	0.08	0.06
   query31	2.84	0.38	0.37
   query32	3.28	0.48	0.46
   query33	2.84	2.84	2.84
   query34	17.09	4.36	4.43
   query35	4.48	4.50	4.57
   query36	0.65	0.45	0.46
   query37	0.18	0.17	0.16
   query38	0.15	0.15	0.14
   query39	0.05	0.04	0.04
   query40	0.16	0.13	0.14
   query41	0.10	0.05	0.05
   query42	0.05	0.05	0.05
   query43	0.04	0.04	0.04
   Total cold run time: 109.03 s
   Total hot run time: 29.88 s
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2068495959

   TeamCity be ut coverage result:
    Function Coverage: 35.35% (8913/25215) 
    Line Coverage: 27.08% (73304/270702)
    Region Coverage: 26.22% (37861/144389)
    Branch Coverage: 23.04% (19286/83704)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/3b12636705fe2ce43371468974d7b2417fb9dfab_3b12636705fe2ce43371468974d7b2417fb9dfab/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "HappenLee (via GitHub)" <gi...@apache.org>.
HappenLee commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1576167062


##########
be/src/vec/exprs/table_function/vexplode_json_array.h:
##########
@@ -19,85 +19,231 @@
 
 #include <glog/logging.h>
 #include <rapidjson/document.h>
-#include <stddef.h>
-#include <stdint.h>
 
 #include <ostream>
 #include <string>
 #include <vector>
 
 #include "common/status.h"
 #include "gutil/integral_types.h"
+#include "rapidjson/stringbuffer.h"
+#include "rapidjson/writer.h"
 #include "vec/common/string_ref.h"
+#include "vec/core/types.h"
 #include "vec/data_types/data_type.h"
 #include "vec/exprs/table_function/table_function.h"
 
-namespace doris {
-namespace vectorized {
-class Block;
-} // namespace vectorized
-} // namespace doris
-
 namespace doris::vectorized {
 
-enum ExplodeJsonArrayType { INT = 0, DOUBLE, STRING, JSON };
-
+template <typename T>
 struct ParsedData {
-    static std::string true_value;
-    static std::string false_value;
+    ParsedData() = default;
+    virtual ~ParsedData() = default;
+    virtual void reset() = 0;
+    virtual int set_output(rapidjson::Document& document, int value_size) = 0;
+    virtual void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                                int64_t cur_offset) = 0;
+    const char* get_null_flag_address(int cur_offset) {
+        return reinterpret_cast<const char*>(_values_null_flag.data() + cur_offset);
+    }
+    std::vector<UInt8> _values_null_flag;
+};
 
-    // The number parsed from json array
-    // the `_backup` saved the real number entity.
-    std::vector<void*> _data;
-    std::vector<StringRef> _data_string;
+struct ParsedDataInt : public ParsedData<int64_t> {
+    static auto constexpr max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807

Review Comment:
   MAX_VALUE constexpr
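
One possible reading of this comment is that the `constexpr` constant should use an upper-case name. Under that assumption, a sketch could look like the following. `ToyParsedDataInt` and `clamp_uint64` are hypothetical and are not taken from the PR.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical struct showing upper-case names for constexpr constants.
struct ToyParsedDataInt {
    static constexpr auto MAX_VALUE = std::numeric_limits<int64_t>::max(); // 9223372036854775807
    static constexpr auto MIN_VALUE = std::numeric_limits<int64_t>::min(); // -9223372036854775808

    // Illustrative helper: clamp an unsigned 64-bit value into the int64_t range.
    static constexpr int64_t clamp_uint64(uint64_t v) {
        return v > static_cast<uint64_t>(MAX_VALUE) ? MAX_VALUE : static_cast<int64_t>(v);
    }
};
```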





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072514413

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072220963

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2067550048

   
   <details>
   <summary>ClickBench: <b>Total hot run time: 29.94 s</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 9c22a3cc51eccc210886e778d62a74b18ca22f10, data reload: false
   
   query1	0.04	0.03	0.03
   query2	0.07	0.04	0.04
   query3	0.23	0.05	0.05
   query4	1.67	0.08	0.09
   query5	0.51	0.50	0.51
   query6	1.48	0.73	0.72
   query7	0.02	0.02	0.01
   query8	0.06	0.04	0.05
   query9	0.54	0.49	0.49
   query10	0.55	0.55	0.55
   query11	0.16	0.11	0.11
   query12	0.14	0.12	0.11
   query13	0.61	0.59	0.58
   query14	0.76	0.78	0.77
   query15	0.83	0.81	0.79
   query16	0.37	0.36	0.35
   query17	0.93	1.02	1.01
   query18	0.22	0.23	0.22
   query19	1.82	1.69	1.61
   query20	0.01	0.02	0.02
   query21	15.40	0.65	0.64
   query22	4.29	7.78	1.65
   query23	18.30	1.34	1.26
   query24	1.55	0.29	0.21
   query25	0.15	0.08	0.08
   query26	0.25	0.16	0.16
   query27	0.08	0.08	0.08
   query28	13.44	1.01	0.98
   query29	12.58	3.24	3.25
   query30	0.25	0.06	0.05
   query31	2.88	0.38	0.37
   query32	3.28	0.46	0.45
   query33	2.82	2.79	2.83
   query34	17.10	4.38	4.44
   query35	4.45	4.46	4.49
   query36	0.65	0.46	0.46
   query37	0.18	0.15	0.15
   query38	0.15	0.15	0.16
   query39	0.05	0.03	0.04
   query40	0.18	0.14	0.14
   query41	0.09	0.05	0.05
   query42	0.06	0.05	0.04
   query43	0.04	0.04	0.04
   Total cold run time: 109.24 s
   Total hot run time: 29.94 s
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 merged PR #33904:
URL: https://github.com/apache/doris/pull/33904




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072290803

   TeamCity be ut coverage result:
    Function Coverage: 35.22% (8916/25315) 
    Line Coverage: 26.99% (73345/271792)
    Region Coverage: 26.16% (37880/144812)
    Branch Coverage: 22.98% (19291/83940)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/c7c85ab58ac3d08f98a79e8c4e635ef3f266ec0d_c7c85ab58ac3d08f98a79e8c4e635ef3f266ec0d/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1572211348


##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -118,66 +112,93 @@
                     wbytes = snprintf(tmp_buf, sizeof(tmp_buf), "%f", v.GetDouble());
                 }
                 _backup_string.emplace_back(tmp_buf, wbytes);
-                _string_nulls.push_back(false);
-                // do not set _data_string here.
-                // Because the address of the string stored in `_backup_string` may
-                // change each time `emplace_back()` is called.
                 break;
             case rapidjson::Type::kFalseType:
                 _backup_string.emplace_back(true_value);
-                _string_nulls.push_back(false);
                 break;
             case rapidjson::Type::kTrueType:
                 _backup_string.emplace_back(false_value);
-                _string_nulls.push_back(false);
                 break;
             case rapidjson::Type::kNullType:
-                _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back("", 0);
+                _values_null_flag[i] = 1;
                 break;
             default:
-                _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back("", 0);
+                _values_null_flag[i] = 1;
                 break;
             }
-        }
-        // Must set _data_string at the end, so that we can
-        // save the real addr of string in `_backup_string` to `_data_string`.
-        for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            ++i;
         }
         break;
     }
     case ExplodeJsonArrayType::JSON: {
-        _data_string.clear();
-        _backup_string.clear();
-        _string_nulls.clear();
+        int i = 0;
         for (auto& v : document.GetArray()) {
             if (v.IsObject()) {
                 rapidjson::StringBuffer buffer;
                 rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
                 v.Accept(writer);
                 _backup_string.emplace_back(buffer.GetString(), buffer.GetSize());
-                _string_nulls.push_back(false);
             } else {
-                _data_string.push_back({});
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back("", 0);
+                _values_null_flag[i] = 1;
             }
-        }
-        // Must set _data_string at the end, so that we can
-        // save the real addr of string in `_backup_string` to `_data_string`.
-        for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            ++i;
         }
         break;
     }
     default:
-        CHECK(false) << type;
+        CHECK(false) << _data_type;
         break;
     }
     return size;
 }
 
+Status ParsedData::insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,

Review Comment:
   warning: method 'insert_result_from_parsed_data' can be made const [readability-make-member-function-const]
   
   be/src/vec/exprs/table_function/vexplode_json_array.cpp:158:
   ```diff
   -                                                   int64_t cur_offset) {
   +                                                   int64_t cur_offset) const {
   ```
   



##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -40,25 +43,23 @@ std::string ParsedData::false_value = "false";
 auto max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807
 auto min_value = std::numeric_limits<int64_t>::min(); //-9223372036854775808
 
-int ParsedData::set_output(ExplodeJsonArrayType type, rapidjson::Document& document) {
+int ParsedData::set_output(rapidjson::Document& document) {

Review Comment:
   warning: function 'set_output' exceeds recommended size/complexity thresholds [readability-function-size]
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   <details>
   <summary>Additional context</summary>
   
   **be/src/vec/exprs/table_function/vexplode_json_array.cpp:45:** 110 lines including whitespace and comments (threshold 80)
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   
   </details>
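
A common way to address a function-size warning like this is to move each branch of a large `switch` into a small helper. The sketch below shows the idea on a simplified string conversion. It uses `std::variant` instead of rapidjson so that it stays self-contained, and every name in it (`ToyValue`, `ToyOutput`, `append_text`, `append_null`, `set_output_strings`) is hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <utility>
#include <variant>
#include <vector>

// Hypothetical element type: null, integer, double, or string.
using ToyValue = std::variant<std::monostate, int64_t, double, std::string>;

struct ToyOutput {
    std::vector<std::string> strings;
    std::vector<uint8_t> null_flags; // 1 marks a null entry
};

// Each helper handles one outcome, keeping the driver loop short.
inline void append_text(ToyOutput& out, std::string text) {
    out.strings.emplace_back(std::move(text));
    out.null_flags.push_back(0);
}

inline void append_null(ToyOutput& out) {
    out.strings.emplace_back();
    out.null_flags.push_back(1);
}

// Driver: dispatches on the element type and returns the element count.
inline int set_output_strings(const std::vector<ToyValue>& array, ToyOutput& out) {
    for (const auto& v : array) {
        if (const auto* i = std::get_if<int64_t>(&v)) {
            append_text(out, std::to_string(*i));
        } else if (const auto* d = std::get_if<double>(&v)) {
            append_text(out, std::to_string(*d)); // formats like "%f"
        } else if (const auto* s = std::get_if<std::string>(&v)) {
            append_text(out, *s);
        } else {
            append_null(out);
        }
    }
    return static_cast<int>(array.size());
}
```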
   





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2067552820

   TeamCity be ut coverage result:
    Function Coverage: 35.39% (8921/25208) 
    Line Coverage: 27.10% (73328/270559)
    Region Coverage: 26.25% (37891/144325)
    Branch Coverage: 23.06% (19290/83654)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/9c22a3cc51eccc210886e778d62a74b18ca22f10_9c22a3cc51eccc210886e778d62a74b18ca22f10/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2066861215

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1573134705


##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -40,25 +43,23 @@ std::string ParsedData::false_value = "false";
 auto max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807
 auto min_value = std::numeric_limits<int64_t>::min(); //-9223372036854775808
 
-int ParsedData::set_output(ExplodeJsonArrayType type, rapidjson::Document& document) {
+int ParsedData::set_output(rapidjson::Document& document) {

Review Comment:
   warning: function 'set_output' exceeds recommended size/complexity thresholds [readability-function-size]
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   <details>
   <summary>Additional context</summary>
   
   **be/src/vec/exprs/table_function/vexplode_json_array.cpp:45:** 136 lines including whitespace and comments (threshold 80)
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   
   </details>
   



##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -118,66 +120,115 @@
                     wbytes = snprintf(tmp_buf, sizeof(tmp_buf), "%f", v.GetDouble());
                 }
                 _backup_string.emplace_back(tmp_buf, wbytes);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 // do not set _data_string here.
                 // Because the address of the string stored in `_backup_string` may
                 // change each time `emplace_back()` is called.
                 break;
+            }
             case rapidjson::Type::kFalseType:
                 _backup_string.emplace_back(true_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kTrueType:
                 _backup_string.emplace_back(false_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kNullType:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             default:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str.data(), str.length());
         }
         break;
     }
     case ExplodeJsonArrayType::JSON: {
-        _data_string.clear();
+        _data_string_ref.clear();
         _backup_string.clear();
-        _string_nulls.clear();
+        _values_null_flag.clear();
         for (auto& v : document.GetArray()) {
             if (v.IsObject()) {
                 rapidjson::StringBuffer buffer;
                 rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
                 v.Accept(writer);
                 _backup_string.emplace_back(buffer.GetString(), buffer.GetSize());
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
             } else {
-                _data_string.push_back({});
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back();
+                _values_null_flag.emplace_back(true);
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str);
         }
         break;
     }
     default:
-        CHECK(false) << type;
+        CHECK(false) << _data_type;
         break;
     }
     return size;
 }
 
+Status ParsedData::insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,

Review Comment:
   warning: method 'insert_result_from_parsed_data' can be made const [readability-make-member-function-const]
   
   be/src/vec/exprs/table_function/vexplode_json_array.cpp:184:
   ```diff
   -                                                   int64_t cur_offset) {
   +                                                   int64_t cur_offset) const {
   ```
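
The comments in the hunk above explain why the string references are built only after every value has been appended to `_backup_string`: appending may reallocate the vector and move the strings it owns. The following is a minimal sketch of that two-phase fill, not the code in this PR. `ParsedStrings`, `backup`, `refs`, and `fill` are hypothetical names, and `std::string_view` (C++17) stands in for the project's own string reference type.

```cpp
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the two-phase fill shown in the diff:
// owned strings are appended first, non-owning views are built afterwards.
struct ParsedStrings {
    std::vector<std::string> backup;     // owns the bytes
    std::vector<std::string_view> refs;  // points into `backup`

    void fill(const std::vector<std::string>& values) {
        backup.clear();
        refs.clear();
        // Phase 1: only append owned strings. Each emplace_back may
        // reallocate the vector and move the strings it already holds,
        // so no address is recorded yet.
        for (const auto& v : values) {
            backup.emplace_back(v);
        }
        // Phase 2: `backup` no longer grows, so the addresses are stable
        // and can be recorded safely.
        for (const auto& s : backup) {
            refs.emplace_back(s.data(), s.length());
        }
    }
};
```

Under these assumptions, calling `fill({"1.5", "true", ""})` leaves three views in `refs`, each pointing at the corresponding element of `backup`. The views stay valid only until `backup` is modified again.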
   





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2066329931

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2066358416

   TeamCity be ut coverage result:
    Function Coverage: 35.41% (8922/25193) 
    Line Coverage: 27.12% (73292/270282)
    Region Coverage: 26.26% (37864/144203)
    Branch Coverage: 23.07% (19290/83604)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/e805e93b13068273183e1399820f04d28703e7cd_e805e93b13068273183e1399820f04d28703e7cd/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2067545892

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38344 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 9c22a3cc51eccc210886e778d62a74b18ca22f10, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17592	4276	4258	4258
   q2	2014	191	187	187
   q3	10440	1160	1189	1160
   q4	10192	757	737	737
   q5	7516	2730	2647	2647
   q6	219	133	134	133
   q7	1028	609	579	579
   q8	9219	2056	2051	2051
   q9	7400	6618	6525	6525
   q10	8611	3578	3534	3534
   q11	440	236	234	234
   q12	503	228	215	215
   q13	17768	2960	2967	2960
   q14	261	221	236	221
   q15	533	479	489	479
   q16	528	372	374	372
   q17	955	742	621	621
   q18	7262	6723	6644	6644
   q19	7465	1497	1512	1497
   q20	631	330	312	312
   q21	3458	2685	2769	2685
   q22	359	293	297	293
   Total cold run time: 114394 ms
   Total hot run time: 38344 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4353	4219	4194	4194
   q2	363	274	271	271
   q3	2991	2679	2721	2679
   q4	1878	1609	1575	1575
   q5	5306	5339	5305	5305
   q6	205	123	123	123
   q7	2237	1813	1897	1813
   q8	3211	3334	3300	3300
   q9	8657	8544	8673	8544
   q10	4041	3923	4033	3923
   q11	607	497	491	491
   q12	792	644	615	615
   q13	17140	3269	3148	3148
   q14	322	289	299	289
   q15	544	485	482	482
   q16	505	444	459	444
   q17	1799	1517	1502	1502
   q18	7969	7843	7804	7804
   q19	1664	1574	1585	1574
   q20	2044	1870	1806	1806
   q21	8907	4922	4965	4922
   q22	553	456	475	456
   Total cold run time: 76088 ms
   Total hot run time: 55260 ms
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2069395644

   TeamCity be ut coverage result:
    Function Coverage: 35.35% (8916/25222) 
    Line Coverage: 27.09% (73320/270679)
    Region Coverage: 26.24% (37876/144367)
    Branch Coverage: 23.05% (19286/83674)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/42bab03ac0fc6a76f44f48c254fb2f29918f5764_42bab03ac0fc6a76f44f48c254fb2f29918f5764/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1574701752


##########
be/src/vec/exprs/table_function/vexplode_json_array.h:
##########
@@ -19,85 +19,231 @@
 
 #include <glog/logging.h>

Review Comment:
   warning: 'glog/logging.h' file not found [clang-diagnostic-error]
   ```cpp
   #include <glog/logging.h>
            ^
   ```
   





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2069477976

   
   <details>
   <summary>TPC-DS: <b>Total hot run time: 185238 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 42bab03ac0fc6a76f44f48c254fb2f29918f5764, data reload: false
   
   query1	880	372	358	358
   query2	6614	2720	2416	2416
   query3	6659	208	206	206
   query4	26601	21271	21285	21271
   query5	4162	411	404	404
   query6	279	180	180	180
   query7	4582	291	300	291
   query8	236	187	180	180
   query9	8389	2272	2247	2247
   query10	403	253	239	239
   query11	14682	14096	14175	14096
   query12	136	87	87	87
   query13	1635	365	355	355
   query14	9794	7493	7506	7493
   query15	236	192	184	184
   query16	8072	264	264	264
   query17	1935	594	557	557
   query18	2066	286	278	278
   query19	207	150	162	150
   query20	91	86	87	86
   query21	197	126	126	126
   query22	5060	4797	4830	4797
   query23	33841	33209	33373	33209
   query24	11106	3071	3044	3044
   query25	628	387	388	387
   query26	1135	152	155	152
   query27	2506	365	370	365
   query28	6970	2094	1989	1989
   query29	887	627	608	608
   query30	234	179	179	179
   query31	1013	732	759	732
   query32	95	56	52	52
   query33	733	260	244	244
   query34	1239	504	497	497
   query35	848	700	687	687
   query36	1051	913	934	913
   query37	138	69	87	69
   query38	3492	3410	3388	3388
   query39	1657	1585	1602	1585
   query40	185	128	128	128
   query41	45	43	50	43
   query42	105	97	97	97
   query43	582	555	533	533
   query44	1247	746	739	739
   query45	267	279	270	270
   query46	1118	728	721	721
   query47	2114	1963	1943	1943
   query48	379	299	328	299
   query49	903	396	399	396
   query50	781	399	400	399
   query51	6813	6838	6662	6662
   query52	102	90	93	90
   query53	359	276	270	270
   query54	294	232	218	218
   query55	76	73	73	73
   query56	242	218	220	218
   query57	1215	1135	1112	1112
   query58	219	189	191	189
   query59	3387	3051	3019	3019
   query60	251	229	225	225
   query61	87	84	84	84
   query62	608	447	435	435
   query63	303	274	269	269
   query64	5120	3926	3859	3859
   query65	3065	3008	3015	3008
   query66	766	323	324	323
   query67	15464	15138	15260	15138
   query68	6305	553	540	540
   query69	540	303	300	300
   query70	1237	1159	1173	1159
   query71	1461	1263	1262	1262
   query72	6516	2591	2469	2469
   query73	725	321	319	319
   query74	6960	6399	6351	6351
   query75	3711	2622	2584	2584
   query76	4183	982	960	960
   query77	591	261	257	257
   query78	11139	10423	10298	10298
   query79	5895	518	519	518
   query80	1965	427	422	422
   query81	527	238	248	238
   query82	1731	92	91	91
   query83	337	167	165	165
   query84	263	83	86	83
   query85	1462	264	257	257
   query86	477	290	288	288
   query87	3452	3256	3271	3256
   query88	4805	2407	2463	2407
   query89	470	375	379	375
   query90	1917	179	175	175
   query91	121	93	92	92
   query92	58	46	44	44
   query93	5180	494	502	494
   query94	1074	177	177	177
   query95	373	288	288	288
   query96	598	272	264	264
   query97	3094	2917	2929	2917
   query98	221	219	212	212
   query99	1265	851	862	851
   Total cold run time: 295239 ms
   Total hot run time: 185238 ms
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2068464134

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38703 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 3b12636705fe2ce43371468974d7b2417fb9dfab, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17621	4538	4548	4538
   q2	2008	185	176	176
   q3	10477	1115	1171	1115
   q4	10188	826	732	732
   q5	7507	2646	2631	2631
   q6	215	131	130	130
   q7	1010	621	586	586
   q8	9244	2073	2050	2050
   q9	7462	6556	6518	6518
   q10	8597	3490	3536	3490
   q11	436	230	227	227
   q12	454	231	218	218
   q13	17774	2972	2996	2972
   q14	279	245	230	230
   q15	510	491	471	471
   q16	520	391	375	375
   q17	968	673	739	673
   q18	7260	6718	6745	6718
   q19	6021	1534	1485	1485
   q20	628	309	303	303
   q21	3411	2758	2776	2758
   q22	358	307	307	307
   Total cold run time: 112948 ms
   Total hot run time: 38703 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4366	4273	4246	4246
   q2	359	266	277	266
   q3	2952	2759	2765	2759
   q4	1908	1622	1593	1593
   q5	5368	5357	5287	5287
   q6	214	123	124	123
   q7	2241	1855	1859	1855
   q8	3225	3358	3343	3343
   q9	8596	8533	8608	8533
   q10	3907	3647	3747	3647
   q11	583	470	482	470
   q12	761	592	591	591
   q13	16399	2984	3021	2984
   q14	296	299	277	277
   q15	510	465	471	465
   q16	472	417	422	417
   q17	1760	1494	1478	1478
   q18	7957	7758	7794	7758
   q19	1721	1674	1649	1649
   q20	2035	1849	1878	1849
   q21	8963	4934	5052	4934
   q22	558	519	469	469
   Total cold run time: 75151 ms
   Total hot run time: 54993 ms
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072314585

   
   <details>
   <summary>ClickBench: <b>Total hot run time: 31.56 s</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit c7c85ab58ac3d08f98a79e8c4e635ef3f266ec0d, data reload: false
   
   query1	0.04	0.04	0.04
   query2	0.08	0.04	0.04
   query3	0.23	0.05	0.04
   query4	1.68	0.08	0.07
   query5	0.50	0.52	0.52
   query6	1.22	0.86	0.83
   query7	0.02	0.02	0.01
   query8	0.05	0.04	0.04
   query9	0.51	0.44	0.45
   query10	0.51	0.51	0.51
   query11	0.15	0.11	0.11
   query12	0.14	0.12	0.12
   query13	0.65	0.63	0.65
   query14	0.94	0.87	1.02
   query15	0.85	0.84	0.84
   query16	0.38	0.38	0.36
   query17	1.06	1.01	1.00
   query18	0.22	0.23	0.24
   query19	1.85	1.78	1.80
   query20	0.02	0.01	0.01
   query21	15.43	0.67	0.66
   query22	4.30	7.58	2.01
   query23	18.34	1.39	1.34
   query24	1.99	0.25	0.26
   query25	0.15	0.09	0.10
   query26	0.28	0.17	0.17
   query27	0.08	0.08	0.09
   query28	13.25	1.01	1.02
   query29	12.71	3.33	3.38
   query30	0.26	0.07	0.06
   query31	2.84	0.40	0.40
   query32	3.25	0.49	0.50
   query33	2.76	2.98	2.83
   query34	17.32	4.58	4.57
   query35	4.55	4.66	4.60
   query36	0.65	0.47	0.46
   query37	0.21	0.18	0.17
   query38	0.19	0.19	0.18
   query39	0.06	0.05	0.05
   query40	0.17	0.16	0.17
   query41	0.12	0.07	0.06
   query42	0.07	0.06	0.06
   query43	0.04	0.05	0.04
   Total cold run time: 110.12 s
   Total hot run time: 31.56 s
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1573133228


##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -40,25 +43,23 @@ std::string ParsedData::false_value = "false";
 auto max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807
 auto min_value = std::numeric_limits<int64_t>::min(); //-9223372036854775808
 
-int ParsedData::set_output(ExplodeJsonArrayType type, rapidjson::Document& document) {
+int ParsedData::set_output(rapidjson::Document& document) {

Review Comment:
   warning: function 'set_output' exceeds recommended size/complexity thresholds [readability-function-size]
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   <details>
   <summary>Additional context</summary>
   
   **be/src/vec/exprs/table_function/vexplode_json_array.cpp:45:** 135 lines including whitespace and comments (threshold 80)
   ```cpp
   int ParsedData::set_output(rapidjson::Document& document) {
                   ^
   ```
   
   </details>
   



##########
be/src/vec/exprs/table_function/vexplode_json_array.cpp:
##########
@@ -118,66 +120,115 @@ int ParsedData::set_output(ExplodeJsonArrayType type, rapidjson::Document& docum
                     wbytes = snprintf(tmp_buf, sizeof(tmp_buf), "%f", v.GetDouble());
                 }
                 _backup_string.emplace_back(tmp_buf, wbytes);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 // do not set _data_string here.
                 // Because the address of the string stored in `_backup_string` may
                 // change each time `emplace_back()` is called.
                 break;
+            }
             case rapidjson::Type::kFalseType:
                 _backup_string.emplace_back(true_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kTrueType:
                 _backup_string.emplace_back(false_value);
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
                 break;
             case rapidjson::Type::kNullType:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             default:
                 _backup_string.emplace_back();
-                _string_nulls.push_back(true);
+                _values_null_flag.emplace_back(true);
                 break;
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str.data(), str.length());
         }
         break;
     }
     case ExplodeJsonArrayType::JSON: {
-        _data_string.clear();
+        _data_string_ref.clear();
         _backup_string.clear();
-        _string_nulls.clear();
+        _values_null_flag.clear();
         for (auto& v : document.GetArray()) {
             if (v.IsObject()) {
                 rapidjson::StringBuffer buffer;
                 rapidjson::Writer<rapidjson::StringBuffer> writer(buffer);
                 v.Accept(writer);
                 _backup_string.emplace_back(buffer.GetString(), buffer.GetSize());
-                _string_nulls.push_back(false);
+                _values_null_flag.emplace_back(false);
             } else {
-                _data_string.push_back({});
-                _string_nulls.push_back(true);
+                _backup_string.emplace_back();
+                _values_null_flag.emplace_back(true);
             }
         }
         // Must set _data_string at the end, so that we can
         // save the real addr of string in `_backup_string` to `_data_string`.
         for (auto& str : _backup_string) {
-            _data_string.emplace_back(str);
+            _data_string_ref.emplace_back(str);
         }
         break;
     }
     default:
-        CHECK(false) << type;
+        CHECK(false) << _data_type;
         break;
     }
     return size;
 }
 

Review Comment:
   warning: method 'insert_result_from_parsed_data' can be made const [readability-make-member-function-const]
   
   be/src/vec/exprs/table_function/vexplode_json_array.cpp:183:
   ```diff
   -                                                   int64_t cur_offset) {
   +                                                   int64_t cur_offset) const {
   ```
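
As an illustration of the `readability-make-member-function-const` suggestion above, here is a minimal sketch of a member function that only reads object state and can therefore be declared `const`. `ParsedInts` and `copy_range` are hypothetical names chosen for this example. This is not the `ParsedData` interface, and the parameter list only loosely mirrors the `max_step` and `cur_offset` arguments visible in the diff.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical reader class, used only to show the effect of `const`.
class ParsedInts {
public:
    explicit ParsedInts(std::vector<int64_t> values) : _values(std::move(values)) {}

    // Appends up to `max_step` values, starting at `cur_offset`, to `out`
    // and returns how many were appended. The function never modifies
    // `_values`, so the trailing `const` is allowed and lets callers use it
    // through a const object or a const reference.
    int copy_range(std::vector<int64_t>& out, int max_step, int64_t cur_offset) const {
        const auto size = static_cast<int64_t>(_values.size());
        if (max_step <= 0 || cur_offset < 0 || cur_offset >= size) {
            return 0;
        }
        const int64_t n = std::min<int64_t>(max_step, size - cur_offset);
        out.insert(out.end(), _values.begin() + cur_offset,
                   _values.begin() + cur_offset + n);
        return static_cast<int>(n);
    }

private:
    std::vector<int64_t> _values;
};
```

If a member function did write to a data member, adding `const` would fail to compile, which is why the check only fires for functions that do not mutate the object.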
   





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2066943783

   TeamCity be ut coverage result:
    Function Coverage: 35.39% (8920/25207) 
    Line Coverage: 27.09% (73273/270488)
    Region Coverage: 26.24% (37860/144301)
    Branch Coverage: 23.05% (19283/83648)
    Coverage Report: http://coverage.selectdb-in.cc/coverage/006e9bab330ce708ce933f29fe750d68a403a5f9_006e9bab330ce708ce933f29fe750d68a403a5f9/report/index.html




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2069434426

   
   <details>
   <summary>TPC-H: <b>Total hot run time: 38449 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
   Tpch sf100 test result on commit 42bab03ac0fc6a76f44f48c254fb2f29918f5764, data reload: false
   
   ------ Round 1 ----------------------------------
   q1	17625	4313	4242	4242
   q2	2011	189	191	189
   q3	10433	1234	1169	1169
   q4	10188	791	743	743
   q5	7496	2726	2666	2666
   q6	216	134	137	134
   q7	1024	596	583	583
   q8	9227	2049	2034	2034
   q9	7368	6612	6552	6552
   q10	8585	3571	3518	3518
   q11	456	234	226	226
   q12	464	212	209	209
   q13	17765	2919	2953	2919
   q14	262	234	243	234
   q15	521	488	486	486
   q16	540	379	376	376
   q17	953	665	679	665
   q18	7374	6674	6679	6674
   q19	6086	1535	1524	1524
   q20	657	308	293	293
   q21	3478	2722	2889	2722
   q22	369	291	306	291
   Total cold run time: 113098 ms
   Total hot run time: 38449 ms
   
   ----- Round 2, with runtime_filter_mode=off -----
   q1	4377	4219	4209	4209
   q2	362	264	266	264
   q3	3000	2740	2684	2684
   q4	1893	1584	1627	1584
   q5	5326	5386	5319	5319
   q6	211	122	123	122
   q7	2223	1865	1885	1865
   q8	3182	3359	3311	3311
   q9	8570	8570	8593	8570
   q10	4133	3827	3978	3827
   q11	608	491	492	491
   q12	802	619	636	619
   q13	16392	3315	3129	3129
   q14	326	275	307	275
   q15	515	484	478	478
   q16	518	446	440	440
   q17	1839	1531	1465	1465
   q18	8192	7909	7891	7891
   q19	1676	1518	1588	1518
   q20	2090	1896	1861	1861
   q21	9755	4983	5022	4983
   q22	560	484	512	484
   Total cold run time: 76550 ms
   Total hot run time: 55389 ms
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2072704660

   
   <details>
   <summary>ClickBench: <b>Total hot run time: 31.15 s</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
   ClickBench test result on commit 6c0bac360caf427997725e492a4711e19938fa73, data reload: false
   
   query1	0.04	0.03	0.03
   query2	0.09	0.04	0.04
   query3	0.23	0.05	0.05
   query4	1.68	0.06	0.06
   query5	0.50	0.50	0.50
   query6	1.30	0.87	0.82
   query7	0.02	0.02	0.01
   query8	0.06	0.04	0.04
   query9	0.50	0.45	0.45
   query10	0.49	0.51	0.49
   query11	0.15	0.11	0.10
   query12	0.14	0.10	0.11
   query13	0.63	0.64	0.64
   query14	0.99	1.00	0.89
   query15	0.91	0.85	0.87
   query16	0.37	0.37	0.38
   query17	1.03	1.07	0.98
   query18	0.22	0.23	0.23
   query19	1.90	1.88	1.93
   query20	0.02	0.01	0.01
   query21	15.40	0.66	0.65
   query22	4.34	7.77	1.55
   query23	18.35	1.37	1.26
   query24	1.30	0.35	0.34
   query25	0.14	0.09	0.09
   query26	0.27	0.18	0.17
   query27	0.09	0.09	0.09
   query28	13.40	1.03	1.01
   query29	12.65	3.45	3.38
   query30	0.26	0.08	0.06
   query31	2.83	0.40	0.40
   query32	3.24	0.48	0.50
   query33	2.76	2.92	2.92
   query34	17.07	4.70	4.43
   query35	4.44	4.59	4.65
   query36	0.64	0.46	0.46
   query37	0.21	0.18	0.17
   query38	0.20	0.19	0.19
   query39	0.05	0.06	0.05
   query40	0.19	0.16	0.15
   query41	0.10	0.06	0.07
   query42	0.07	0.06	0.06
   query43	0.04	0.05	0.05
   Total cold run time: 109.31 s
   Total hot run time: 31.15 s
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2073878617

   PR approved by at least one committer and no changes requested.




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2071505370

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "doris-robot (via GitHub)" <gi...@apache.org>.
doris-robot commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2067548798

   
   <details>
   <summary>TPC-DS: <b>Total hot run time: 185599 ms</b></summary>
   
   ```
   machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
   scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
   TPC-DS sf100 test result on commit 9c22a3cc51eccc210886e778d62a74b18ca22f10, data reload: false
   
   query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
   query1	894	369	358	358
   query2	6192	2705	2459	2459
   query3	6657	203	199	199
   query4	23474	21218	21213	21213
   query5	4097	408	405	405
   query6	262	182	168	168
   query7	4585	287	292	287
   query8	244	196	190	190
   query9	8592	2354	2319	2319
   query10	412	243	253	243
   query11	14786	14148	14077	14077
   query12	137	95	85	85
   query13	1660	359	364	359
   query14	9595	7593	8046	7593
   query15	281	185	185	185
   query16	8252	267	260	260
   query17	1995	581	556	556
   query18	2129	284	278	278
   query19	332	152	154	152
   query20	91	89	84	84
   query21	195	124	125	124
   query22	5002	4872	4826	4826
   query23	33844	33411	33505	33411
   query24	11233	2985	3087	2985
   query25	582	396	379	379
   query26	708	164	163	163
   query27	2300	349	368	349
   query28	5896	2088	2079	2079
   query29	861	655	624	624
   query30	281	174	173	173
   query31	986	755	754	754
   query32	97	53	54	53
   query33	655	245	244	244
   query34	895	492	490	490
   query35	861	695	682	682
   query36	1075	933	924	924
   query37	110	72	68	68
   query38	3461	3343	3327	3327
   query39	1644	1595	1589	1589
   query40	173	128	127	127
   query41	45	44	45	44
   query42	104	100	95	95
   query43	594	566	555	555
   query44	1109	757	743	743
   query45	300	260	270	260
   query46	1100	737	754	737
   query47	2030	1934	1930	1930
   query48	376	306	303	303
   query49	821	385	429	385
   query50	790	398	396	396
   query51	6857	6858	6764	6764
   query52	104	89	86	86
   query53	338	276	274	274
   query54	306	234	244	234
   query55	77	73	72	72
   query56	234	221	232	221
   query57	1196	1128	1154	1128
   query58	209	217	192	192
   query59	3361	3330	3175	3175
   query60	251	221	222	221
   query61	86	85	88	85
   query62	599	438	441	438
   query63	305	281	274	274
   query64	4852	3921	3899	3899
   query65	3056	3023	2990	2990
   query66	741	324	321	321
   query67	15434	15090	14820	14820
   query68	5623	518	535	518
   query69	524	297	301	297
   query70	1186	1180	1169	1169
   query71	1450	1260	1261	1260
   query72	6565	2608	2435	2435
   query73	730	318	315	315
   query74	6961	6535	6460	6460
   query75	3640	2654	2671	2654
   query76	4109	1001	966	966
   query77	617	253	259	253
   query78	10883	10287	10293	10287
   query79	7410	521	502	502
   query80	1538	437	431	431
   query81	510	247	239	239
   query82	491	96	93	93
   query83	192	160	160	160
   query84	269	82	79	79
   query85	841	265	261	261
   query86	335	305	292	292
   query87	3451	3256	3251	3251
   query88	4515	2311	2326	2311
   query89	483	372	370	370
   query90	2016	178	182	178
   query91	123	95	93	93
   query92	53	47	46	46
   query93	5831	493	496	493
   query94	1117	178	173	173
   query95	392	287	289	287
   query96	588	265	259	259
   query97	3107	2904	2947	2904
   query98	234	217	212	212
   query99	1169	868	896	868
   Total cold run time: 287754 ms
   Total hot run time: 185599 ms
   ```
   </details>
   




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2069314341

   run buildall




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "HappenLee (via GitHub)" <gi...@apache.org>.
HappenLee commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1576181650


##########
be/src/vec/exprs/table_function/vexplode_json_array.h:
##########
@@ -19,85 +19,231 @@
 
 #include <glog/logging.h>
 #include <rapidjson/document.h>
-#include <stddef.h>
-#include <stdint.h>
 
 #include <ostream>
 #include <string>
 #include <vector>
 
 #include "common/status.h"
 #include "gutil/integral_types.h"
+#include "rapidjson/stringbuffer.h"
+#include "rapidjson/writer.h"
 #include "vec/common/string_ref.h"
+#include "vec/core/types.h"
 #include "vec/data_types/data_type.h"
 #include "vec/exprs/table_function/table_function.h"
 
-namespace doris {
-namespace vectorized {
-class Block;
-} // namespace vectorized
-} // namespace doris
-
 namespace doris::vectorized {
 
-enum ExplodeJsonArrayType { INT = 0, DOUBLE, STRING, JSON };
-
+template <typename T>
 struct ParsedData {
-    static std::string true_value;
-    static std::string false_value;
+    ParsedData() = default;
+    virtual ~ParsedData() = default;
+    virtual void reset() = 0;
+    virtual int set_output(rapidjson::Document& document, int value_size) = 0;
+    virtual void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                                int64_t cur_offset) = 0;
+    const char* get_null_flag_address(int cur_offset) {
+        return reinterpret_cast<const char*>(_values_null_flag.data() + cur_offset);
+    }
+    std::vector<UInt8> _values_null_flag;
+};
 
-    // The number parsed from json array
-    // the `_backup` saved the real number entity.
-    std::vector<void*> _data;
-    std::vector<StringRef> _data_string;
+struct ParsedDataInt : public ParsedData<int64_t> {
+    static auto constexpr max_value = std::numeric_limits<int64_t>::max(); //9223372036854775807
+    static auto constexpr min_value = std::numeric_limits<int64_t>::min(); //-9223372036854775808
+
+    int set_output(rapidjson::Document& document, int value_size) override {
+        _values_null_flag.resize(value_size, 0);
+        _backup_int.resize(value_size);
+        int i = 0;
+        for (auto& v : document.GetArray()) {
+            if (v.IsInt64()) {
+                _backup_int[i] = v.GetInt64();
+            } else if (v.IsUint64()) {
+                auto value = v.GetUint64();
+                if (value > max_value) {
+                    _backup_int[i] = max_value;
+                } else {
+                    _backup_int[i] = value;
+                }
+            } else if (v.IsDouble()) {
+                auto value = v.GetDouble();
+                if (value > max_value) {
+                    _backup_int[i] = max_value;
+                } else if (value < min_value) {
+                    _backup_int[i] = min_value;
+                } else {
+                    _backup_int[i] = long(value);
+                }
+            } else {
+                _values_null_flag[i] = 1;
+                _backup_int[i] = 0;
+            }
+            ++i;
+        }
+        return value_size;
+    }
+    void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                        int64_t cur_offset) override {
+        assert_cast<ColumnInt64*>(column.get())
+                ->insert_many_raw_data(
+                        reinterpret_cast<const char*>(_backup_int.data() + cur_offset), max_step);
+    }
+    void reset() override { _backup_int.clear(); }
     std::vector<int64_t> _backup_int;
+};
+
+struct ParsedDataDouble : public ParsedData<double> {
+    int set_output(rapidjson::Document& document, int value_size) override {
+        _values_null_flag.resize(value_size, 0);
+        _backup_double.resize(value_size);
+        int i = 0;
+        for (auto& v : document.GetArray()) {
+            if (v.IsDouble()) {
+                _backup_double[i] = v.GetDouble();
+            } else {
+                _backup_double[i] = 0;
+                _values_null_flag[i] = 1;
+            }
+            ++i;
+        }
+        return value_size;
+    }
+    void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                        int64_t cur_offset) override {
+        assert_cast<ColumnFloat64*>(column.get())
+                ->insert_many_raw_data(
+                        reinterpret_cast<const char*>(_backup_double.data() + cur_offset),
+                        max_step);
+    }
+    void reset() override { _backup_double.clear(); }
     std::vector<double> _backup_double;
+};
+
+struct ParsedDataStringBase : public ParsedData<std::string> {
+    void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                        int64_t cur_offset) override {
+        assert_cast<ColumnString*>(column.get())
+                ->insert_many_strings(_data_string_ref.data() + cur_offset, max_step);
+    }
+    void reset() override {
+        _data_string_ref.clear();
+        _backup_string.clear();
+    }
+
+    static std::string true_value;
+    static std::string false_value;

Review Comment:
   Make these `constexpr const char*` constants with UPPER_CASE names instead of mutable `static std::string` members.
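
A minimal sketch of what this suggestion might look like, assuming C++17 and assuming the two strings hold the literal text `"true"` and `"false"` (the diff does not show their definitions). The names `ParsedDataStringBaseSketch`, `TRUE_VALUE`, `FALSE_VALUE`, and `bool_to_text` are hypothetical and not part of the PR.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-in for ParsedDataStringBase. The PR declares
// `true_value` / `false_value` as static std::string members; these
// compile-time constants only illustrate the reviewer's suggestion.
struct ParsedDataStringBaseSketch {
    // Assumed contents: the diff does not show the actual string values.
    static constexpr const char* TRUE_VALUE = "true";
    static constexpr const char* FALSE_VALUE = "false";
};

// Maps a JSON boolean to its textual form without any mutable static state.
constexpr std::string_view bool_to_text(bool b) {
    return b ? ParsedDataStringBaseSketch::TRUE_VALUE
             : ParsedDataStringBaseSketch::FALSE_VALUE;
}
```

Because the constants are `constexpr`, they need no out-of-class definition in a `.cpp` file and cannot be modified at runtime, which is presumably the motivation for the comment.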





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "HappenLee (via GitHub)" <gi...@apache.org>.
HappenLee commented on code in PR #33904:
URL: https://github.com/apache/doris/pull/33904#discussion_r1576182132


##########
be/src/vec/exprs/table_function/vexplode_json_array.h:
##########
@@ -19,85 +19,231 @@
 
 #include <glog/logging.h>
 #include <rapidjson/document.h>
-#include <stddef.h>
-#include <stdint.h>
 
 #include <ostream>
 #include <string>
 #include <vector>
 
 #include "common/status.h"
 #include "gutil/integral_types.h"
+#include "rapidjson/stringbuffer.h"
+#include "rapidjson/writer.h"
 #include "vec/common/string_ref.h"
+#include "vec/core/types.h"
 #include "vec/data_types/data_type.h"
 #include "vec/exprs/table_function/table_function.h"
 
-namespace doris {
-namespace vectorized {
-class Block;
-} // namespace vectorized
-} // namespace doris
-
 namespace doris::vectorized {
 
-enum ExplodeJsonArrayType { INT = 0, DOUBLE, STRING, JSON };
-
+template <typename T>
 struct ParsedData {
-    static std::string true_value;
-    static std::string false_value;
+    ParsedData() = default;
+    virtual ~ParsedData() = default;
+    virtual void reset() = 0;
+    virtual int set_output(rapidjson::Document& document, int value_size) = 0;
+    virtual void insert_result_from_parsed_data(MutableColumnPtr& column, int max_step,
+                                                int64_t cur_offset) = 0;
+    const char* get_null_flag_address(int cur_offset) {
+        return reinterpret_cast<const char*>(_values_null_flag.data() + cur_offset);
+    }
+    std::vector<UInt8> _values_null_flag;

Review Comment:
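   Since `ParsedData` is already templated on `T`, keep the storage in the base class as `std::vector<T> _backup` instead of a separate `_backup_int` / `_backup_double` in each subclass.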
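
One way to read this comment is sketched below: the templated base owns a typed `_backup` vector, so each subclass no longer declares its own buffer. This is a simplified illustration, not the PR's code. The rapidjson and Doris column types are omitted so it compiles on its own, and `ParsedDataSketch`, `ParsedDataIntSketch`, `get_value_address`, and `set_values` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Simplified stand-in for the ParsedData<T> base shown in the diff.
template <typename T>
struct ParsedDataSketch {
    virtual ~ParsedDataSketch() = default;

    // Shared reset. Clearing the null flags here as well is a choice made
    // for this sketch; the diff's reset() only clears the value buffer.
    virtual void reset() {
        _backup.clear();
        _values_null_flag.clear();
    }

    // Address of the value at cur_offset, suitable for a bulk insert.
    const T* get_value_address(int64_t cur_offset) const {
        return _backup.data() + cur_offset;
    }

    std::vector<T> _backup;                 // typed storage, one per instantiation
    std::vector<uint8_t> _values_null_flag; // 1 marks a NULL element
};

struct ParsedDataIntSketch : ParsedDataSketch<int64_t> {
    // Hypothetical helper standing in for set_output(): stores values
    // that have already been parsed and marks all of them non-NULL.
    void set_values(const std::vector<int64_t>& values) {
        _values_null_flag.assign(values.size(), 0);
        _backup = values;
    }
};
```

With the buffer in the base, `reset()` and the address helper are written once, and a subclass only has to supply its type-specific parsing.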





Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2071510301

   PR approved by at least one committer and no changes requested.




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2071510340

   PR approved by anyone and no changes requested.




Re: [PR] [improve](table function) opt explode/explode_map/explode_json table function [doris]

Posted by "zhangstar333 (via GitHub)" <gi...@apache.org>.
zhangstar333 commented on PR #33904:
URL: https://github.com/apache/doris/pull/33904#issuecomment-2068435489

   run buildall

