Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/10/26 10:18:39 UTC

[GitHub] [doris] mrhhsg opened a new pull request, #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

mrhhsg opened a new pull request, #13694:
URL: https://github.com/apache/doris/pull/13694

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem summary
   
   Implements the data masking functions `mask`, `mask_first_n`, and `mask_last_n`, modeled on those in [Hive](https://cwiki.apache.org/confluence/display/hive/languagemanual+udf#LanguageManualUDF-DataMaskingFunctions).
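
As a rough illustration of the masking behavior described above, here is a minimal standalone sketch. It is not the code from this PR: the helper names (`mask_char`, `mask_all`, `mask_first_n_sketch`, `mask_last_n_sketch`) are hypothetical, it works byte by byte on ASCII only, and it takes `n` as a required argument. The default mask characters `X`, `x`, and `n` match the constants declared in `FunctionMask`.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Hypothetical helper (not from the PR): maps one ASCII byte to its mask character.
// Uppercase -> upper, lowercase -> lower, digit -> number, anything else unchanged.
inline char mask_char(char c, char upper = 'X', char lower = 'x', char number = 'n') {
    if (c >= 'A' && c <= 'Z') return upper;
    if (c >= 'a' && c <= 'z') return lower;
    if (c >= '0' && c <= '9') return number;
    return c;
}

// Masks every byte of the input.
inline std::string mask_all(const std::string& s, char upper = 'X', char lower = 'x',
                            char number = 'n') {
    std::string out = s;
    for (char& c : out) c = mask_char(c, upper, lower, number);
    return out;
}

// Masks only the first n bytes; n larger than the string masks everything.
inline std::string mask_first_n_sketch(const std::string& s, std::size_t n) {
    std::string out = s;
    const std::size_t end = std::min(n, out.size());
    for (std::size_t i = 0; i < end; ++i) out[i] = mask_char(out[i]);
    return out;
}

// Masks only the last n bytes; n larger than the string masks everything.
inline std::string mask_last_n_sketch(const std::string& s, std::size_t n) {
    std::string out = s;
    const std::size_t begin = out.size() > n ? out.size() - n : 0;
    for (std::size_t i = begin; i < out.size(); ++i) out[i] = mask_char(out[i]);
    return out;
}
```

Under these assumptions, `mask_all("Abc-123")` yields `Xxx-nnn`, `mask_first_n_sketch("Abc-123", 2)` yields `Xxc-123`, and `mask_last_n_sketch("Abc-123", 2)` yields `Abc-1nn`.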
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: 
       - [ ] Yes
       - [ ] No
       - [ ] I don't know
   2. Have unit tests been added:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   3. Has documentation been added or modified:
       - [ ] Yes
       - [ ] No
       - [ ] No Need
   4. Does it need to update dependencies:
       - [ ] Yes
       - [ ] No
   5. Are there any changes that cannot be rolled back:
       - [ ] Yes (If Yes, please explain WHY)
       - [ ] No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13694:
URL: https://github.com/apache/doris/pull/13694#issuecomment-1293723550

   PR approved by at least one committer and no changes requested.


[GitHub] [doris] hello-stephen commented on pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13694:
URL: https://github.com/apache/doris/pull/13694#issuecomment-1292815016

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 39.08 seconds
    load time: 601 seconds
    storage size: 17154875712 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221027003932_clickbench_pr_34486.html


[GitHub] [doris] github-actions[bot] commented on pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13694:
URL: https://github.com/apache/doris/pull/13694#issuecomment-1293723617

   PR approved by anyone and no changes requested.


[GitHub] [doris] hello-stephen commented on pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13694:
URL: https://github.com/apache/doris/pull/13694#issuecomment-1294286538

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 38.83 seconds
    load time: 563 seconds
    storage size: 17154644722 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221028010814_clickbench_pr_34993.html


[GitHub] [doris] yiguolei merged pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
yiguolei merged PR #13694:
URL: https://github.com/apache/doris/pull/13694


[GitHub] [doris] HappenLee commented on a diff in pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
HappenLee commented on code in PR #13694:
URL: https://github.com/apache/doris/pull/13694#discussion_r1005549013


##########
be/src/vec/functions/function_string.h:
##########
@@ -302,6 +302,211 @@ struct Substr2Impl {
     }
 };
 
+template <bool Reverse>
+class FunctionMaskPartial;
+class FunctionMask : public IFunction {
+public:
+    static constexpr auto name = "mask";
+    static constexpr auto DEFAULT_UPPER_MASK = 'X';
+    static constexpr auto DEFAULT_LOWER_MASK = 'x';
+    static constexpr auto DEFAULT_NUMBER_MASK = 'n';
+    String get_name() const override { return name; }
+    static FunctionPtr create() { return std::make_shared<FunctionMask>(); }
+
+    DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+        return make_nullable(std::make_shared<DataTypeString>());
+    }
+
+    size_t get_number_of_arguments() const override { return 0; }
+
+    bool is_variadic() const override { return true; }
+
+    bool use_default_implementation_for_nulls() const override { return true; }
+    bool use_default_implementation_for_constants() const override { return true; }
+
+    Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                        size_t result, size_t input_rows_count) override {
+        DCHECK_GE(arguments.size(), 1);
+        DCHECK_LE(arguments.size(), 4);
+
+        auto null_map = ColumnUInt8::create(input_rows_count, 0);
+
+        char upper = DEFAULT_UPPER_MASK, lower = DEFAULT_LOWER_MASK, number = DEFAULT_NUMBER_MASK;
+
+        ColumnPtr source_column;
+        auto res = ColumnString::create();
+        source_column = block.get_by_position(arguments[0]).column;
+        if (source_column->is_nullable()) {
+            auto* nullable = assert_cast<const ColumnNullable*>(source_column.get());
+            VectorizedUtils::update_null_map(null_map->get_data(), nullable->get_null_map_data());
+            source_column = nullable->get_nested_column_ptr();
+        }
+

Review Comment:
   Maybe return an error if the other columns are not `const`.
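
One way to read this suggestion is sketched below as a standalone toy. `ToyColumn` and `read_const_mask` are hypothetical stand-ins, not Doris types or functions. The idea is only that a mask-character argument is accepted when it is a constant, and anything else is rejected so the caller can return an error status.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a column; Doris's real column classes are not used here.
struct ToyColumn {
    std::vector<std::string> values;
    bool is_const = false;  // true when the column holds one value repeated for all rows
};

// Returns the mask character when the argument is a constant column holding a
// non-empty string. Returns std::nullopt otherwise, so the caller can report an
// error instead of silently using a per-row value.
inline std::optional<char> read_const_mask(const ToyColumn& col) {
    if (!col.is_const || col.values.empty() || col.values.front().empty()) {
        return std::nullopt;
    }
    return col.values.front().front();
}
```

In a real implementation the empty branch would presumably map to an error `Status` returned from `execute_impl`; that mapping is an assumption here, not something shown in the diff.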



##########
gensrc/script/doris_builtins_functions.py:
##########
@@ -2054,6 +2054,9 @@
     [['substr', 'substring'], 'VARCHAR', ['VARCHAR', 'INT', 'INT'],
         '_ZN5doris15StringFunctions9substringEPN'
         '9doris_udf15FunctionContextERKNS1_9StringValERKNS1_6IntValES9_', '', '', 'vec', 'ALWAYS_NULLABLE'],
+    [['mask'], 'STRING', ['STRING', '...'], '', '', '', 'vec', 'CUSTOM'],

Review Comment:
   It seems the function should be `default`, not `custom`?



[GitHub] [doris] hello-stephen commented on pull request #13694: [Feature](string-function) Add function mask/mask_first_n/mask_last_n

Posted by GitBox <gi...@apache.org>.
hello-stephen commented on PR #13694:
URL: https://github.com/apache/doris/pull/13694#issuecomment-1292316523

   TeamCity pipeline, clickbench performance test result:
    the sum of best hot time: 39.85 seconds
    load time: 585 seconds
    storage size: 17154852160 Bytes
    https://doris-community-test-1308700295.cos.ap-hongkong.myqcloud.com/tmp/20221026163919_clickbench_pr_34482.html

