Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/10/29 16:47:52 UTC

[GitHub] [doris] xiaokang opened a new pull request, #13778: [feature](jsonb type)refactor JSONB type using column and add testcase

xiaokang opened a new pull request, #13778:
URL: https://github.com/apache/doris/pull/13778

   # Proposed changes
   
   Issue Number: close [DSIP-016: Support JSON type](https://cwiki.apache.org/confluence/display/DORIS/DSIP-016%3A+Support+JSON+type?src=contextnavpagetreemode)
   
   ## Problem summary
   
   1. Refactor the JSONB type to use ColumnString instead of making a copy.
   2. Add regression test cases for JSONB load and functions.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior:
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Has unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has document been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] hello-stephen commented on pull request #13778: [feature](jsonb type)refactor JSONB type using column and add testcase

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13778:
URL: https://github.com/apache/doris/pull/13778#issuecomment-1296318372

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 38.42 seconds
    load time: 560 seconds
    storage size: 17154644069 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221030181308_clickbench_pr_35879.html




[GitHub] [doris] morningman commented on a diff in pull request #13778: [feature](jsonb type)refactor JSONB type using column and add testcase

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #13778:
URL: https://github.com/apache/doris/pull/13778#discussion_r1014761169


##########
be/src/vec/functions/function_jsonb.cpp:
##########
@@ -405,13 +404,13 @@ struct JsonbExtractStringImpl {
             if constexpr (std::is_same_v<DataTypeJsonb, ReturnType>) {
                 writer->reset();
                 writer->writeValue(value);
-                // StringOP::push_value_string(
-                //     std::string_view(writer->getOutput()->getBuffer(), writer->getOutput()->getSize()),
-                //     i, res_data, res_offsets);
-                res_data.insert(writer->getOutput()->getBuffer(),
-                                writer->getOutput()->getBuffer() + writer->getOutput()->getSize());
-                res_data.push_back('\0');
-                res_offsets[i] = res_data.size();
+                StringOP::push_value_string(std::string_view(writer->getOutput()->getBuffer(),
+                                                             writer->getOutput()->getSize()),
+                                            i, res_data, res_offsets);
+                // res_data.insert(writer->getOutput()->getBuffer(),

Review Comment:
   Can the commented-out code be removed?
   If you plan to use it later, please add a comment explaining why it is kept.
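
For readers following the hunk above, the removed lines append the value's bytes, then a terminating zero, and then record the end offset for row `i`. A minimal sketch of that layout follows. `ToyStringColumn` and its members are hypothetical names for illustration, not Doris classes, and whether `StringOP::push_value_string` does exactly the same is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical toy column: all row bytes live in one buffer, and
// offsets[i] is the position just past row i (including its '\0').
struct ToyStringColumn {
    std::vector<char> chars;
    std::vector<std::size_t> offsets;

    // Append one row: bytes, terminating zero, then the end offset.
    void push(std::string_view value) {
        chars.insert(chars.end(), value.begin(), value.end());
        chars.push_back('\0');
        offsets.push_back(chars.size());
    }

    // View row i without copying; the terminator is excluded.
    std::string_view at(std::size_t i) const {
        assert(i < offsets.size());
        const std::size_t begin = (i == 0) ? 0 : offsets[i - 1];
        const std::size_t len = offsets[i] - begin - 1;
        return std::string_view(chars.data() + begin, len);
    }
};
```

Keeping the append logic in one helper, as the added lines do, avoids repeating the terminator and offset bookkeeping at every call site.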



##########
be/src/vec/sink/vmysql_result_writer.cpp:
##########
@@ -110,14 +110,8 @@ Status VMysqlResultWriter::_add_one_column(const ColumnPtr& column_ptr,
             }
             if constexpr (type == TYPE_JSONB) {
                 const auto json_val = column->get_data_at(i);
-                if (json_val.data == nullptr) {
-                    if (json_val.size == 0) {
-                        // 0x01 is a magic num, not useful actually, just for present ""
-                        char* tmp_val = reinterpret_cast<char*>(0x01);
-                        buf_ret = _buffer.push_string(tmp_val, json_val.size);
-                    } else {
-                        buf_ret = _buffer.push_null();
-                    }
+                if (json_val.data == nullptr || json_val.size == 0) {

Review Comment:
   Do we need to distinguish between "null" and "empty string" here? The new condition treats both the same way.
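
One way to keep the two cases apart is to carry nullness in an explicit flag rather than inferring it from the data pointer. The sketch below only illustrates that idea. `Cell`, `FakeRowBuffer`, and `write_cell` are hypothetical names, and the bracketed output format is invented for the example; none of this is the actual `VMysqlResultWriter` code.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical cell: nullness is carried by a flag, not by the data pointer.
struct Cell {
    bool is_null;
    std::string_view bytes;
};

// Hypothetical stand-in for a result row buffer; it only records what was pushed.
struct FakeRowBuffer {
    std::string out;
    void push_null() { out += "<NULL>"; }
    void push_string(std::string_view s) {
        out += '[';
        out.append(s);
        out += ']';
    }
};

// Write one cell, keeping NULL and the empty string distinct.
void write_cell(FakeRowBuffer& buf, const Cell& cell) {
    // Invariant of this sketch: a NULL cell carries no bytes.
    assert(!cell.is_null || cell.bytes.empty());
    if (cell.is_null) {
        buf.push_null();
    } else {
        buf.push_string(cell.bytes);
    }
}
```

With this shape, an empty value never needs a placeholder pointer such as the `0x01` used in the removed lines, because the empty view is already a valid argument.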
   



##########
be/src/vec/data_types/data_type_jsonb.cpp:
##########
@@ -30,51 +29,36 @@
 
 namespace doris::vectorized {
 
-template <typename Reader>
-static inline void read(IColumn& column, Reader&& reader) {
-    ColumnJsonb& column_json = assert_cast<ColumnJsonb&>(column);
-    ColumnJsonb::Chars& data = column_json.get_chars();
-    ColumnJsonb::Offsets& offsets = column_json.get_offsets();
-    size_t old_chars_size = data.size();
-    size_t old_offsets_size = offsets.size();
-    try {
-        reader(data);
-        data.push_back(0);
-        offsets.push_back(data.size());
-    } catch (...) {
-        offsets.resize_assume_reserved(old_offsets_size);
-        data.resize_assume_reserved(old_chars_size);
-        throw;
-    }
-}
-
 std::string DataTypeJsonb::to_string(const IColumn& column, size_t row_num) const {
     const StringRef& s =
-            reinterpret_cast<const ColumnJsonb&>(*column.convert_to_full_column_if_const().get())
+            reinterpret_cast<const ColumnString&>(*column.convert_to_full_column_if_const().get())
                     .get_data_at(row_num);
-    return JsonbToJson::jsonb_to_json_string(s.data, s.size);
+    return s.size > 0 ? JsonbToJson::jsonb_to_json_string(s.data, s.size) : "";

Review Comment:
   Is `s.size == 0` a special value that indicates something?
   If not, I suggest moving this check inside `JsonbToJson::jsonb_to_json_string()`; leaving it to each caller is very error-prone.
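
A rough sketch of what that suggestion could look like: the empty-input rule lives in one wrapper, so callers cannot forget it. `decode_nonempty` is a hypothetical stand-in that simply copies its input. It is not the real JSONB decoder, and `to_json_string` is likewise an illustrative name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the real decoder; it requires non-empty input.
std::string decode_nonempty(const char* data, std::size_t size) {
    assert(data != nullptr && size > 0);
    return std::string(data, size);
}

// Single entry point that owns the "empty input yields empty string" rule.
std::string to_json_string(const char* data, std::size_t size) {
    if (data == nullptr || size == 0) {
        return "";
    }
    return decode_nonempty(data, size);
}
```

Callers such as `to_string()` would then call the wrapper unconditionally instead of repeating the `size > 0` test.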



##########
fe/fe-core/src/main/jflex/sql_scanner.flex:
##########
@@ -270,7 +270,7 @@ import org.apache.doris.qe.SqlModeHelper;
         keywordMap.put("isolation", new Integer(SqlParserSymbols.KW_ISOLATION));
         keywordMap.put("job", new Integer(SqlParserSymbols.KW_JOB));
         keywordMap.put("join", new Integer(SqlParserSymbols.KW_JOIN));
-        keywordMap.put("json", new Integer(SqlParserSymbols.KW_JSON));
+        keywordMap.put("jsonb", new Integer(SqlParserSymbols.KW_JSONB));

Review Comment:
   Why change `json` to `jsonb`?
   I think `jsonb` is an internal implementation detail, so for the user interface it would be better to keep "json".





[GitHub] [doris] morningman merged pull request #13778: [feature](jsonb type)refactor JSONB type using column and add testcase

Posted by GitBox <gi...@apache.org>.
morningman merged PR #13778:
URL: https://github.com/apache/doris/pull/13778

