Posted to commits@doris.apache.org by "niebayes (via GitHub)" <gi...@apache.org> on 2023/06/29 08:07:27 UTC

[GitHub] [doris] niebayes opened a new pull request, #21330: support array_contains_all function

niebayes opened a new pull request, #21330:
URL: https://github.com/apache/doris/pull/21330

   ## Proposed changes
   
   Issue Number: close #17310 
   
   <!--Describe your changes.-->
   
   ## Further comments
   
   Currently, only one-dimensional arrays can be parsed and processed.
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1622165150

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1257270542


##########
regression-test/data/query_p0/sql_functions/array_functions/test_array_functions.out:
##########
@@ -704,15 +704,15 @@
 9	\N
 
 -- !select_array_shuffle1 --
-1	[1, 2, 3]	6	6	[3, 2, 1]	[3, 2, 1]
+1	[1, 2, 3]	6	6	[1, 3, 2]	[1, 3, 2]

Review Comment:
   Is this result deterministic? If not, do not change the other results; otherwise it can break our pipeline tests.





[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1627559583

   run buildall




[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1631805375

   run p0




[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1616746205

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1251401845


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -48,136 +49,116 @@ class FunctionArrayContainsAll : public IFunction {
     size_t get_number_of_arguments() const override { return 2; }
 
     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
-        return std::make_shared<DataTypeUInt8>();
+        return make_nullable(std::make_shared<DataTypeUInt8>());
     }
 
+    // the semantics of this function is to check if the left array contains all of the right elements.
+    // it's important to note that the arrays are interpreted as sets, and hence the order of the elements
+    // and the number of occurrences of each element are not taken into account.
     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
                         size_t result, size_t input_rows_count) override {
-        // perpare the result column.
         auto result_col = ColumnUInt8::create(input_rows_count, 0);
+        auto result_null_map = ColumnUInt8::create(input_rows_count, 0);
 
-        // fetch the input columns.
         const auto& left_input_col = block.get_by_position(arguments[0]).column;
         const auto& right_input_col = block.get_by_position(arguments[1]).column;
 
-        // remove the constness of the input columns if necessary.
-        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
-        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+        // since the input maybe literal, we have to remove constness accordingly.
+        // since the input maybe null, we make it nullable to unify the processing.
+        const auto left_col = make_nullable(unpack_if_const(left_input_col).first);
+        const auto right_col = make_nullable(unpack_if_const(right_input_col).first);
 
-        // casts the columns in advance to avoid the repeated casting in the for loop.
-        // we won't access the cells until we're sure that they're not null,
-        // so it's safe to cast the columns in advance.
-        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
-        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+        const ColumnNullable* left_col_nullable = check_and_get_column<ColumnNullable>(left_col);
+        const ColumnNullable* right_col_nullable = check_and_get_column<ColumnNullable>(right_col);
 
-        for (size_t i = 0; i < input_rows_count; ++i) {
-            // FIXME(niebayes): the null checking seems already done in the frontend.
-            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
-                continue;
-            }
+        const ColumnArray* left_col_array =
+                check_and_get_column<ColumnArray>(left_col_nullable->get_nested_column());
+        const ColumnArray* right_col_array =
+                check_and_get_column<ColumnArray>(right_col_nullable->get_nested_column());
 
-            // each array is a cell in a column array.
-            // however, arrays in a column are flattened to reduce storage overhead.
-            // therefore, we need to use offsets to delimit among arrays and
-            // to locate the elements in an array.
-            const ColumnNullable* left_nested_col_nullable =
-                    check_and_get_column<ColumnNullable>(left_col_array->get_data());
-            const ColumnNullable* right_nested_col_nullable =
-                    check_and_get_column<ColumnNullable>(right_col_array->get_data());
-
-            const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
-            const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
-
-            // construct arrays.
-            const Array left_array = make_array(left_nested_col_nullable, left_offsets, i);
-            const Array right_array = make_array(right_nested_col_nullable, right_offsets, i);
-
-            // check if the left array contains all of the right elements.
-            auto result_data = &result_col->get_data()[i];
-            _check_left_contains_all_right(left_array, right_array, result_data);
-        }
-
-        // store the result column in the specified `result` column of the block.
-        block.replace_by_position(result, std::move(result_col));
-
-        return Status::OK();
-    }
-
-private:
-    // the internal array type.
-    using Offset = ColumnArray::Offset64;
-    struct Array {
-        const ColumnPtr& data;     // data[i] is the i-th element of the array.
-        const NullMap& null_map;   // null_map[i] = true if data[i] is null.
-        const Offset start_offset; // the offset of the first element in the array.
-        const Offset end_offset;   // the offset of the last element in the array.
-
-        Array(const ColumnPtr& data_, const NullMap& null_map_, const Offset start_offset_,
-              const Offset end_offset_)
-                : data {data_},
-                  null_map {null_map_},
-                  start_offset {start_offset_},
-                  end_offset {end_offset_} {}
-    };
-
-    // construct an `Array` instance from a ColumnArray.
-    static Array make_array(const ColumnNullable* col_nullable,
-                            const ColumnArray::Offsets64& offsets, const size_t cur_row) {
-        const ColumnPtr& data = col_nullable->get_nested_column_ptr();
-        const NullMap& null_map = col_nullable->get_null_map_data();
-        const Offset start_offset = cur_row == 0 ? 0 : offsets[cur_row - 1] + 1;
-        const Offset end_offset = offsets[cur_row];
-
-        return Array(data, null_map, start_offset, end_offset);
-    }
-
-    // the semantics of this function is to check if the left array contains all of the right elements.
-    // it's important to note that the arrays are interpreted as sets, and hence the order of the elements
-    // and the number of occurrences of each element are not taken into account.
-    void _check_left_contains_all_right(const Array& left_array, const Array& right_array,
-                                        UInt8* result) {
-        static constexpr UInt8 CONTAINS_ALL = 1;
-        static constexpr UInt8 NOT_CONTAINS_ALL = 0;
+        // data columns are single-dimension columns which stores elements of all arrays.
+        const ColumnNullable* left_data_col_nullable =
+                check_and_get_column<ColumnNullable>(left_col_array->get_data());
+        const ColumnNullable* right_data_col_nullable =
+                check_and_get_column<ColumnNullable>(right_col_array->get_data());
 
-        // set the default result to NOT_CONTAINS_ALL.
-        *result = NOT_CONTAINS_ALL;
+        for (size_t row = 0; row < input_rows_count; ++row) {
+            if (left_col_nullable->is_null_at(row) || right_col_nullable->is_null_at(row)) {
+                result_null_map->get_data()[row] = 1;
+                continue;
+            }
 
-        const bool left_has_nulls = !left_array.null_map.empty();
-        const bool right_has_nulls = !right_array.null_map.empty();
+            const auto& left_offsets = _get_offsets_of_row(left_col_array->get_offsets(), row);
+            const auto& right_offsets = _get_offsets_of_row(right_col_array->get_offsets(), row);
 
-        // the left array cannot contain all of the right elements if the right array has nulls while the left does not.
-        if (right_has_nulls && !left_has_nulls) {
-            return;
-        }
+            const bool left_has_nulls = _has_nulls(left_data_col_nullable, left_offsets);
+            const bool right_has_nulls = _has_nulls(right_data_col_nullable, right_offsets);
 
-        // for each element in the right array, check if it is contained in the left array.
-        // if any element is not contained, then the left array does not contain all of the right elements.
-        for (size_t j = right_array.start_offset; j <= right_array.end_offset; ++j) {
-            // skip null elements in the right array.
-            if (right_has_nulls && right_array.null_map[j]) {
+            if (right_has_nulls && !left_has_nulls) {
                 continue;
             }
 
-            // true if the current element is contained in the left array.
-            bool contained = false;
-            for (size_t i = left_array.start_offset; i <= left_array.end_offset; ++i) {
-                // skip null elements in the left array.
-                if (left_has_nulls && left_array.null_map[i]) {
+            // for each element in the right array, check if it is contained in the left array.
+            // if any element is not contained, then the left array does not contain all of the right elements.
+            bool contains_all = true;
+
+            for (size_t ri = right_offsets.first; ri <= right_offsets.second; ++ri) {
+                // skip null elements in the right array.
+                if (right_data_col_nullable->is_null_at(ri)) {
                     continue;
                 }
 
-                if (left_array.data->compare_at(i, j, *right_array.data, -1) == 0) {
-                    contained = true;
+                // true if the left array contains this element.
+                bool contained = false;
+
+                for (size_t li = left_offsets.first; li <= left_offsets.second; ++li) {
+                    // skip null elements in the left array.
+                    if (left_data_col_nullable->is_null_at(li)) {
+                        continue;
+                    }
+
+                    // ColumnNullable::compare_at will invoke the `compare_at` of the nested column.
+                    if (left_data_col_nullable->compare_at(li, ri, *right_data_col_nullable, -1) ==
+                        0) {
+                        contained = true;
+                        break;
+                    }
+                }
+
+                if (!contained) {
+                    contains_all = false;
                     break;
                 }
             }
-            if (!contained) {
-                return;
+
+            if (contains_all) {
+                result_col->get_data()[row] = 1;
             }
         }
 
-        // all elements in the right array are contained in the left array.
-        *result = CONTAINS_ALL;
+        auto result_col_nullable =
+                ColumnNullable::create(std::move(result_col), std::move(result_null_map));
+        block.replace_by_position(result, std::move(result_col_nullable));
+
+        return Status::OK();
+    }
+
+private:
+    // get the start and end offsets of the array at the given row.
+    std::pair<size_t, size_t> _get_offsets_of_row(const ColumnArray::Offsets64& offsets,
+                                                  const size_t row) {
+        const size_t start_offset = row == 0 ? 0 : offsets[row - 1] + 1;

Review Comment:
   This is not needed.
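
For context on the offset arithmetic under review, here is a minimal sketch of how a flattened array column can be sliced per row. It assumes a layout in which `offsets[r]` is the cumulative, exclusive end position of row `r`, so that no `+ 1` is needed on the start offset. That layout is an assumption made for illustration and is not confirmed by this thread; `Offsets` and `row_range` are hypothetical names, not Doris APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the offsets of a flattened array column.
// Assumption: offsets[r] is one past the last element of row r.
using Offsets = std::vector<std::uint64_t>;

// Returns the half-open range [begin, end) of row `row` in the flattened data.
std::pair<std::size_t, std::size_t> row_range(const Offsets& offsets, std::size_t row) {
    assert(row < offsets.size());
    // Row 0 starts at 0; every other row starts where the previous one ended.
    const std::size_t begin =
            row == 0 ? 0 : static_cast<std::size_t>(offsets[row - 1]);
    const std::size_t end = static_cast<std::size_t>(offsets[row]);
    return {begin, end};
}
```

With offsets `{2, 2, 5}`, the three rows cover `[0, 2)`, `[2, 2)` (an empty array), and `[2, 5)`. Iterating with `i < end` rather than `i <= end` keeps empty arrays from reading a neighbouring row's elements.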





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1250186886


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,189 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "common/status.h"
+#include "vec/columns/column.h"
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_const.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/common/assert_cast.h"
+#include "vec/core/block.h"
+#include "vec/core/column_numbers.h"
+#include "vec/core/types.h"
+#include "vec/data_types/data_type.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/function.h"
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+class FunctionArrayContainsAll : public IFunction {
+public:
+    static constexpr auto name {"array_contains_all"};
+
+    static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+    String get_name() const override { return name; }
+
+    bool is_variadic() const override { return false; }
+
+    bool use_default_implementation_for_nulls() const override { return false; }
+
+    bool use_default_implementation_for_constants() const override { return false; }
+
+    size_t get_number_of_arguments() const override { return 2; }
+
+    DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+        return std::make_shared<DataTypeUInt8>();
+    }
+
+    Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                        size_t result, size_t input_rows_count) override {
+        // perpare the result column.
+        auto result_col = ColumnUInt8::create(input_rows_count, 0);
+
+        // fetch the input columns.
+        const auto& left_input_col = block.get_by_position(arguments[0]).column;
+        const auto& right_input_col = block.get_by_position(arguments[1]).column;
+
+        // remove the constness of the input columns if necessary.
+        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
+        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+
+        // casts the columns in advance to avoid the repeated casting in the for loop.
+        // we won't access the cells until we're sure that they're not null,
+        // so it's safe to cast the columns in advance.
+        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
+        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+
+        for (size_t i = 0; i < input_rows_count; ++i) {
+            // FIXME(niebayes): the null checking seems already done in the frontend.
+            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
+                continue;
+            }
+
+            // each array is a cell in a column array.
+            // however, arrays in a column are flattened to reduce storage overhead.
+            // therefore, we need to use offsets to delimit among arrays and
+            // to locate the elements in an array.
+            const ColumnNullable* left_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(left_col_array->get_data());
+            const ColumnNullable* right_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(right_col_array->get_data());
+
+            const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
+            const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
+
+            // construct arrays.
+            const Array left_array = make_array(left_nested_col_nullable, left_offsets, i);
+            const Array right_array = make_array(right_nested_col_nullable, right_offsets, i);
+
+            // check if the left array contains all of the right elements.
+            auto result_data = &result_col->get_data()[i];
+            _check_left_contains_all_right(left_array, right_array, result_data);
+        }
+
+        // store the result column in the specified `result` column of the block.
+        block.replace_by_position(result, std::move(result_col));
+
+        return Status::OK();
+    }
+
+private:
+    // the internal array type.
+    using Offset = ColumnArray::Offset64;
+    struct Array {
+        const ColumnPtr& data;     // data[i] is the i-th element of the array.
+        const NullMap& null_map;   // null_map[i] = true if data[i] is null.
+        const Offset start_offset; // the offset of the first element in the array.
+        const Offset end_offset;   // the offset of the last element in the array.
+
+        Array(const ColumnPtr& data_, const NullMap& null_map_, const Offset start_offset_,
+              const Offset end_offset_)
+                : data {data_},
+                  null_map {null_map_},
+                  start_offset {start_offset_},
+                  end_offset {end_offset_} {}
+    };
+
+    // construct an `Array` instance from a ColumnArray.
+    static Array make_array(const ColumnNullable* col_nullable,
+                            const ColumnArray::Offsets64& offsets, const size_t cur_row) {
+        const ColumnPtr& data = col_nullable->get_nested_column_ptr();
+        const NullMap& null_map = col_nullable->get_null_map_data();
+        const Offset start_offset = cur_row == 0 ? 0 : offsets[cur_row - 1] + 1;
+        const Offset end_offset = offsets[cur_row];
+
+        return Array(data, null_map, start_offset, end_offset);
+    }
+
+    // the semantics of this function is to check if the left array contains all of the right elements.
+    // it's important to note that the arrays are interpreted as sets, and hence the order of the elements
+    // and the number of occurrences of each element are not taken into account.
+    void _check_left_contains_all_right(const Array& left_array, const Array& right_array,
+                                        UInt8* result) {
+        static constexpr UInt8 CONTAINS_ALL = 1;
+        static constexpr UInt8 NOT_CONTAINS_ALL = 0;
+
+        // set the default result to NOT_CONTAINS_ALL.
+        *result = NOT_CONTAINS_ALL;
+
+        const bool left_has_nulls = !left_array.null_map.empty();
+        const bool right_has_nulls = !right_array.null_map.empty();
+
+        // the left array cannot contain all of the right elements if the right array has nulls while the left does not.
+        if (right_has_nulls && !left_has_nulls) {
+            return;
+        }
+
+        // for each element in the right array, check if it is contained in the left array.
+        // if any element is not contained, then the left array does not contain all of the right elements.
+        for (size_t j = right_array.start_offset; j <= right_array.end_offset; ++j) {
+            // skip null elements in the right array.
+            if (right_has_nulls && right_array.null_map[j]) {
+                continue;
+            }
+
+            // true if the current element is contained in the left array.
+            bool contained = false;
+            for (size_t i = left_array.start_offset; i <= left_array.end_offset; ++i) {

Review Comment:
   The time complexity here is O(i * j), which is much more expensive. Can you bring it down to O(i + j)?
   You can use function_array_map.h to help achieve that.
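
One way to reach the linear bound suggested above is to build a hash set from the left array and probe it with each right element. The sketch below is a simplified, standalone model and not the Doris implementation: it does not use function_array_map.h (whose contents are not shown in this thread), it models an array as a vector of `std::optional<int>` with `std::nullopt` standing in for NULL, and `Element` and `contains_all` are hypothetical names. NULL handling follows the rule stated in the diff: a NULL on the right is only satisfied if the left array also has a NULL.

```cpp
#include <cassert>
#include <optional>
#include <unordered_set>
#include <vector>

// Hypothetical simplified element type: std::nullopt models a NULL element.
using Element = std::optional<int>;

// Returns true if every element of `right` also appears in `left`.
// Arrays are treated as sets: order and multiplicity are ignored.
// Expected cost is O(|left| + |right|) thanks to the hash set.
bool contains_all(const std::vector<Element>& left, const std::vector<Element>& right) {
    std::unordered_set<int> left_values;
    bool left_has_null = false;
    for (const Element& e : left) {
        if (e.has_value()) {
            left_values.insert(*e);
        } else {
            left_has_null = true;
        }
    }
    for (const Element& e : right) {
        if (e.has_value()) {
            if (left_values.count(*e) == 0) {
                return false; // a non-null right element is missing on the left
            }
        } else if (!left_has_null) {
            return false; // right has a NULL but left has none
        }
    }
    return true;
}
```

The trade-off is extra memory for the set and a hashable element type; for very short arrays the nested loop may still be competitive.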





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1246406458


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,160 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+ // or more contributor license agreements.  See the NOTICE file
+ // distributed with this work for additional information
+ // regarding copyright ownership.  The ASF licenses this file
+ // to you under the Apache License, Version 2.0 (the
+ // "License"); you may not use this file except in compliance
+ // with the License.  You may obtain a copy of the License at
+ //
+ //   http://www.apache.org/licenses/LICENSE-2.0
+ //
+ // Unless required by applicable law or agreed to in writing,
+ // software distributed under the License is distributed on an
+ // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ // KIND, either express or implied.  See the License for the
+ // specific language governing permissions and limitations
+ // under the License.
+
+ #include "common/status.h"
+ #include "vec/columns/column.h"
+ #include "vec/columns/column_array.h"
+ #include "vec/columns/column_const.h"
+ #include "vec/columns/column_nullable.h"
+ #include "vec/common/assert_cast.h"
+ #include "vec/core/block.h"
+ #include "vec/core/column_numbers.h"
+ #include "vec/core/types.h"
+ #include "vec/data_types/data_type.h"
+ #include "vec/data_types/data_type_number.h"
+ #include "vec/functions/function.h"
+ #include "vec/functions/simple_function_factory.h"
+
+ namespace doris::vectorized {
+
+ class FunctionArrayContainsAll : public IFunction {
+ public:
+     static constexpr auto name {"array_contains_all"};
+
+     static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+     String get_name() const override { return name; }
+
+     bool use_default_implementation_for_nulls() const override { return false; }
+
+     bool is_variadic() const override { return false; }
+
+     size_t get_number_of_arguments() const override { return 2; }
+
+     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+         return std::make_shared<DataTypeUInt8>();
+     }
+
+     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                         size_t result, size_t input_rows_count) override {
+         // construct arrays from the input columns.
+         const Array left_array = from_input_column(block.get_by_position(arguments[0]).column);
+         const Array right_array = from_input_column(block.get_by_position(arguments[1]).column);
+
+         // construct a column to store the execution result.
+         auto result_column = ColumnUInt8::create(input_rows_count);
+         UInt8* result_data = result_column->get_data().data();
+
+         // check if the left array contains all of the right elements.
+         _execute_internal(left_array, right_array, result_data);
+
+         // store the result column in the specified `result` column of the block.
+         block.replace_by_position(result, std::move(result_column));
+
+         return Status::OK();
+     }
+
+ private:
+     // the internal array type.
+     struct Array {

Review Comment:
   Does this array represent just the first row of the column_array?





[GitHub] [doris] niebayes commented on a diff in pull request #21330: support array_contains_all function

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1257178549


##########
regression-test/data/query_p0/sql_functions/array_functions/test_array_functions.out:
##########
@@ -704,15 +704,15 @@
 9	\N
 
 -- !select_array_shuffle1 --
-1	[1, 2, 3]	6	6	[3, 2, 1]	[3, 2, 1]
+1	[1, 2, 3]	6	6	[1, 3, 2]	[1, 3, 2]

Review Comment:
   Seeding introduces randomness into the code.
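
To illustrate the determinism question, here is a minimal sketch in standard C++. It is not the Doris `array_shuffle` implementation, and `shuffled_with_seed` and `same_elements` are hypothetical helpers. It shows that a shuffle driven by a fixed seed reproduces the same order within one build, and that a test can avoid depending on the exact order by comparing the elements order-insensitively. Note that the permutation `std::shuffle` produces for a given seed may differ between standard library implementations, so a hard-coded expected order is only stable for one toolchain.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: shuffle a copy of `values` with a caller-supplied seed.
std::vector<int> shuffled_with_seed(std::vector<int> values, std::uint32_t seed) {
    const std::vector<int> original = values;
    std::mt19937 engine(seed);
    std::shuffle(values.begin(), values.end(), engine);
    // Postcondition: shuffling only reorders elements.
    assert(std::is_permutation(values.begin(), values.end(), original.begin(),
                               original.end()));
    return values;
}

// Hypothetical helper: order-insensitive comparison, usable in tests whose
// expected output should not depend on the shuffle order.
bool same_elements(std::vector<int> a, std::vector<int> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    return a == b;
}
```

Calling `shuffled_with_seed` twice with the same seed yields identical vectors in the same program, whereas an unseeded or time-seeded shuffle would make an expected-output file flaky.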





[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1631805682

   run arm




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1251404926


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -26,6 +26,7 @@
 #include "vec/core/types.h"
 #include "vec/data_types/data_type.h"
 #include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"

Review Comment:
   This include is unnecessary.





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1250185563


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,189 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "common/status.h"
+#include "vec/columns/column.h"
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_const.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/common/assert_cast.h"
+#include "vec/core/block.h"
+#include "vec/core/column_numbers.h"
+#include "vec/core/types.h"
+#include "vec/data_types/data_type.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/function.h"
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+class FunctionArrayContainsAll : public IFunction {
+public:
+    static constexpr auto name {"array_contains_all"};
+
+    static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+    String get_name() const override { return name; }
+
+    bool is_variadic() const override { return false; }
+
+    bool use_default_implementation_for_nulls() const override { return false; }
+
+    bool use_default_implementation_for_constants() const override { return false; }
+
+    size_t get_number_of_arguments() const override { return 2; }
+
+    DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+        return std::make_shared<DataTypeUInt8>();
+    }
+
+    Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                        size_t result, size_t input_rows_count) override {
+        // perpare the result column.
+        auto result_col = ColumnUInt8::create(input_rows_count, 0);
+
+        // fetch the input columns.
+        const auto& left_input_col = block.get_by_position(arguments[0]).column;
+        const auto& right_input_col = block.get_by_position(arguments[1]).column;
+
+        // remove the constness of the input columns if necessary.
+        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
+        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+
+        // casts the columns in advance to avoid the repeated casting in the for loop.
+        // we won't access the cells until we're sure that they're not null,
+        // so it's safe to cast the columns in advance.
+        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
+        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+
+        for (size_t i = 0; i < input_rows_count; ++i) {
+            // FIXME(niebayes): the null checking seems already done in the frontend.
+            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
+                continue;
+            }
+
+            // each array is a cell in a column array.
+            // however, arrays in a column are flattened to reduce storage overhead.
+            // therefore, we need to use offsets to delimit among arrays and
+            // to locate the elements in an array.
+            const ColumnNullable* left_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(left_col_array->get_data());
+            const ColumnNullable* right_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(right_col_array->get_data());
+
+            const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
+            const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
+
+            // construct arrays.
+            const Array left_array = make_array(left_nested_col_nullable, left_offsets, i);

Review Comment:
   I don't think make_array is needed here; working on the column_array directly is enough.
   





[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1627394646

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1618906017

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1257122788


##########
regression-test/data/query_p0/sql_functions/array_functions/test_array_functions.out:
##########
@@ -704,15 +704,15 @@
 9	\N
 
 -- !select_array_shuffle1 --
-1	[1, 2, 3]	6	6	[3, 2, 1]	[3, 2, 1]
+1	[1, 2, 3]	6	6	[1, 3, 2]	[1, 3, 2]

Review Comment:
   Why did these results change? Does your code change other functions?





[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1627559663

   PR approved by anyone and no changes requested.




[GitHub] [doris] niebayes commented on pull request #21330: support array_contains_all function

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1625503096

   @amorynan Hi, I've updated the implementation. Please review it.




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1250185563


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,189 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "common/status.h"
+#include "vec/columns/column.h"
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_const.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/common/assert_cast.h"
+#include "vec/core/block.h"
+#include "vec/core/column_numbers.h"
+#include "vec/core/types.h"
+#include "vec/data_types/data_type.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/function.h"
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+class FunctionArrayContainsAll : public IFunction {
+public:
+    static constexpr auto name {"array_contains_all"};
+
+    static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+    String get_name() const override { return name; }
+
+    bool is_variadic() const override { return false; }
+
+    bool use_default_implementation_for_nulls() const override { return false; }
+
+    bool use_default_implementation_for_constants() const override { return false; }
+
+    size_t get_number_of_arguments() const override { return 2; }
+
+    DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+        return std::make_shared<DataTypeUInt8>();
+    }
+
+    Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                        size_t result, size_t input_rows_count) override {
+        // perpare the result column.
+        auto result_col = ColumnUInt8::create(input_rows_count, 0);
+
+        // fetch the input columns.
+        const auto& left_input_col = block.get_by_position(arguments[0]).column;
+        const auto& right_input_col = block.get_by_position(arguments[1]).column;
+
+        // remove the constness of the input columns if necessary.
+        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
+        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+
+        // casts the columns in advance to avoid the repeated casting in the for loop.
+        // we won't access the cells until we're sure that they're not null,
+        // so it's safe to cast the columns in advance.
+        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
+        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+
+        for (size_t i = 0; i < input_rows_count; ++i) {
+            // FIXME(niebayes): the null checking seems already done in the frontend.
+            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
+                continue;
+            }
+
+            // each array is a cell in a column array.
+            // however, arrays in a column are flattened to reduce storage overhead.
+            // therefore, we need to use offsets to delimit among arrays and
+            // to locate the elements in an array.
+            const ColumnNullable* left_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(left_col_array->get_data());
+            const ColumnNullable* right_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(right_col_array->get_data());
+
+            const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
+            const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
+
+            // construct arrays.
+            const Array left_array = make_array(left_nested_col_nullable, left_offsets, i);

Review Comment:
   I don't think make_array is needed here; working on the column_array directly is enough. The code could be simplified along these lines:
   ```
           const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
           const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
           const ColumnNullable* left_nested_col_nullable =
                   check_and_get_column<ColumnNullable>(left_col_array->get_data());
           const ColumnNullable* right_nested_col_nullable =
                   check_and_get_column<ColumnNullable>(right_col_array->get_data());
   
           const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
           const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
           // keep the column alive; taking get_data() from a temporary would dangle.
           auto result_col = ColumnUInt8::create(input_rows_count, 0);
           auto& result_col_data = result_col->get_data();
           for (size_t r = 0; r < input_rows_count; ++r) { // loop over rows
               if (left_col_nullable->is_null_at(r) || right_col_nullable->is_null_at(r)) {
                   continue;
               }
               // assumes offsets[r - 1] reads as 0 for r == 0 (padded offsets array).
               bool contains_all = true;
               for (size_t ri = right_offsets[r - 1]; ri < right_offsets[r]; ++ri) { // loop over the right array
                   // skip null elements in the right array.
                   if (right_nested_col_nullable->is_null_at(ri)) {
                       continue;
                   }
   
                   bool found = false;
                   for (size_t li = left_offsets[r - 1]; li < left_offsets[r]; ++li) { // loop over the left array
                       // skip null elements in the left array.
                       if (left_nested_col_nullable->is_null_at(li)) {
                           continue;
                       }
   
                       if (left_nested_col_nullable->compare_at(li, ri, *right_nested_col_nullable, -1) == 0) {
                           found = true;
                           break;
                       }
                   }
                   if (!found) {
                       // one missing right element is enough to fail has_all.
                       contains_all = false;
                       break;
                   }
               }
               result_col_data[r] = contains_all ? 1 : 0;
           }
   ```
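
For readers without the Doris headers at hand, the row-by-row check discussed in this comment can be sketched in plain C++. This is only an illustration under stated assumptions: `FlatArrayColumn` and `contains_all` are hypothetical names, `std::vector` stands in for the Doris column types, both inputs are assumed to have the same number of rows, and null elements are skipped on both sides as in the snippet above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a flattened array column: `values` holds the
// elements of all rows back to back, `is_null[k]` marks a null element, and
// `offsets[r]` is the exclusive end position of row r.
struct FlatArrayColumn {
    std::vector<int64_t> values;
    std::vector<uint8_t> is_null;
    std::vector<size_t> offsets;

    // Row 0 starts at position 0; every other row starts where the previous one ends.
    size_t row_begin(size_t r) const { return r == 0 ? 0 : offsets[r - 1]; }
    size_t row_end(size_t r) const { return offsets[r]; }
    size_t rows() const { return offsets.size(); }
};

// Returns one flag per row: 1 if every non-null element of right[r] also
// occurs among the non-null elements of left[r], otherwise 0.
// Assumes left and right have the same number of rows.
std::vector<uint8_t> contains_all(const FlatArrayColumn& left, const FlatArrayColumn& right) {
    std::vector<uint8_t> result(left.rows(), 0);
    for (size_t r = 0; r < left.rows(); ++r) {
        bool all_found = true;
        for (size_t ri = right.row_begin(r); ri < right.row_end(r) && all_found; ++ri) {
            if (right.is_null[ri]) {
                continue; // null elements on the right are ignored
            }
            bool found = false;
            for (size_t li = left.row_begin(r); li < left.row_end(r); ++li) {
                if (!left.is_null[li] && left.values[li] == right.values[ri]) {
                    found = true;
                    break;
                }
            }
            all_found = found; // a single missing element fails the row
        }
        result[r] = all_found ? 1 : 0;
    }
    return result;
}
```

With this sketch an empty right-hand array is contained in any left-hand array, because the inner loops never find a missing element. Whether the real function should treat nulls and empty arrays this way is a design decision the thread does not settle.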





[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1617116880

   You haven't resolved all of the review comments yet.




[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1612616414

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1246404909


##########
regression-test/suites/query_p0/sql_functions/array_functions/test_array_functions_by_literal.groovy:
##########
@@ -30,6 +30,17 @@ suite("test_array_functions_by_literal") {
     qt_sql "select array_contains(array(cast ('2023-02-04' as datev2),cast ('2023-02-05' as datev2)), cast ('2023-02-05' as datev2))"
     qt_sql "select array_contains(array(cast (111.111 as decimalv3(6,3)),cast (222.222 as decimalv3(6,3))), cast (111.111 as decimalv3(6,3)))"
 
+    // array_contains_all function
+    qt_sql "select array_contains_all([], [])"

Review Comment:
   Please also test this function on table columns, not just literals. See test_array_functions.groovy.





Re: [PR] support array_contains_all function [doris]

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes closed pull request #21330: support array_contains_all function
URL: https://github.com/apache/doris/pull/21330




[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1631805442

   run feut




[GitHub] [doris] niebayes commented on pull request #21330: support array_contains_all function

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1616743148

   @amorynan I've updated the implementation. Please review it. There remains an issue in SQL parsing and I will resolve it ASAP.




[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1619623533

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] niebayes commented on a diff in pull request #21330: support array_contains_all function

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1257178549


##########
regression-test/data/query_p0/sql_functions/array_functions/test_array_functions.out:
##########
@@ -704,15 +704,15 @@
 9	\N
 
 -- !select_array_shuffle1 --
-1	[1, 2, 3]	6	6	[3, 2, 1]	[3, 2, 1]
+1	[1, 2, 3]	6	6	[1, 3, 2]	[1, 3, 2]

Review Comment:
   <img width="574" alt="image" src="https://github.com/apache/doris/assets/25320945/f81eb18a-649b-423c-b2fb-0a85c659157a">
   
   Seeding introduces randomness into the code.
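
One way to read this remark: a seeded shuffle is reproducible for a given seed and build, but the exact permutation is not something a test should rely on across code changes. The sketch below illustrates that with `std::shuffle` and `std::mt19937`. Both are assumptions chosen for illustration, since the thread does not show how array_shuffle is implemented, and `shuffle_with_seed` is a hypothetical helper.

```cpp
#include <algorithm>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: shuffles a copy of `input` using a generator seeded
// with `seed`. The same seed yields the same order within one build, but the
// C++ standard does not fix which permutation std::shuffle produces, so the
// order can differ between standard library implementations.
std::vector<int> shuffle_with_seed(std::vector<int> input, uint32_t seed) {
    std::mt19937 gen(seed);
    std::shuffle(input.begin(), input.end(), gen);
    return input;
}
```

Calling `shuffle_with_seed({1, 2, 3}, 0)` twice returns the same vector both times, which is what makes a seeded case usable in an expected-output file at all. Any change to how the seed reaches the generator can still change that expected row.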





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1246428151


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,160 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+ // or more contributor license agreements.  See the NOTICE file
+ // distributed with this work for additional information
+ // regarding copyright ownership.  The ASF licenses this file
+ // to you under the Apache License, Version 2.0 (the
+ // "License"); you may not use this file except in compliance
+ // with the License.  You may obtain a copy of the License at
+ //
+ //   http://www.apache.org/licenses/LICENSE-2.0
+ //
+ // Unless required by applicable law or agreed to in writing,
+ // software distributed under the License is distributed on an
+ // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ // KIND, either express or implied.  See the License for the
+ // specific language governing permissions and limitations
+ // under the License.
+
+ #include "common/status.h"
+ #include "vec/columns/column.h"
+ #include "vec/columns/column_array.h"
+ #include "vec/columns/column_const.h"
+ #include "vec/columns/column_nullable.h"
+ #include "vec/common/assert_cast.h"
+ #include "vec/core/block.h"
+ #include "vec/core/column_numbers.h"
+ #include "vec/core/types.h"
+ #include "vec/data_types/data_type.h"
+ #include "vec/data_types/data_type_number.h"
+ #include "vec/functions/function.h"
+ #include "vec/functions/simple_function_factory.h"
+
+ namespace doris::vectorized {
+
+ class FunctionArrayContainsAll : public IFunction {
+ public:
+     static constexpr auto name {"array_contains_all"};
+
+     static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+     String get_name() const override { return name; }
+
+     bool use_default_implementation_for_nulls() const override { return false; }
+
+     bool is_variadic() const override { return false; }
+
+     size_t get_number_of_arguments() const override { return 2; }
+
+     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+         return std::make_shared<DataTypeUInt8>();
+     }
+
+     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                         size_t result, size_t input_rows_count) override {
+         // construct arrays from the input columns.
+         const Array left_array = from_input_column(block.get_by_position(arguments[0]).column);
+         const Array right_array = from_input_column(block.get_by_position(arguments[1]).column);
+
+         // construct a column to store the execution result.
+         auto result_column = ColumnUInt8::create(input_rows_count);
+         UInt8* result_data = result_column->get_data().data();
+
+         // check if the left array contains all of the right elements.
+         _execute_internal(left_array, right_array, result_data);
+
+         // store the result column in the specified `result` column of the block.
+         block.replace_by_position(result, std::move(result_column));
+
+         return Status::OK();
+     }
+
+ private:
+     // the internal array type.
+     struct Array {

Review Comment:
   But in your from_input_column(), `size_t num_elements = offsets[0] - offsets[-1]` only covers the first row of the column_array. We could use ColumnArrayExecutionData here instead.
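
The point about offsets can be made concrete with a small sketch: the element count of row r is the difference between two consecutive offsets, so reading only the first pair describes only the first row. The code below uses a plain `std::vector` rather than the Doris offsets container, so the start of row 0 is handled explicitly. `row_sizes` is a hypothetical helper, not part of the PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: `offsets[r]` is the exclusive end of row r in the
// flattened element column, and row 0 implicitly starts at position 0.
// Returns the number of elements in each row.
std::vector<size_t> row_sizes(const std::vector<size_t>& offsets) {
    std::vector<size_t> sizes(offsets.size());
    size_t begin = 0;
    for (size_t r = 0; r < offsets.size(); ++r) {
        sizes[r] = offsets[r] - begin; // size of row r only
        begin = offsets[r];            // the next row starts here
    }
    return sizes;
}
```

For offsets `{3, 5, 5}` this yields sizes `{3, 2, 0}`: three rows, the last one empty. A function that looks only at the first difference would see a single array of three elements and ignore the other rows.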





[GitHub] [doris] niebayes commented on a diff in pull request #21330: support array_contains_all function

Posted by "niebayes (via GitHub)" <gi...@apache.org>.
niebayes commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1246414334


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,160 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+ // or more contributor license agreements.  See the NOTICE file
+ // distributed with this work for additional information
+ // regarding copyright ownership.  The ASF licenses this file
+ // to you under the Apache License, Version 2.0 (the
+ // "License"); you may not use this file except in compliance
+ // with the License.  You may obtain a copy of the License at
+ //
+ //   http://www.apache.org/licenses/LICENSE-2.0
+ //
+ // Unless required by applicable law or agreed to in writing,
+ // software distributed under the License is distributed on an
+ // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ // KIND, either express or implied.  See the License for the
+ // specific language governing permissions and limitations
+ // under the License.
+
+ #include "common/status.h"
+ #include "vec/columns/column.h"
+ #include "vec/columns/column_array.h"
+ #include "vec/columns/column_const.h"
+ #include "vec/columns/column_nullable.h"
+ #include "vec/common/assert_cast.h"
+ #include "vec/core/block.h"
+ #include "vec/core/column_numbers.h"
+ #include "vec/core/types.h"
+ #include "vec/data_types/data_type.h"
+ #include "vec/data_types/data_type_number.h"
+ #include "vec/functions/function.h"
+ #include "vec/functions/simple_function_factory.h"
+
+ namespace doris::vectorized {
+
+ class FunctionArrayContainsAll : public IFunction {
+ public:
+     static constexpr auto name {"array_contains_all"};
+
+     static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+     String get_name() const override { return name; }
+
+     bool use_default_implementation_for_nulls() const override { return false; }
+
+     bool is_variadic() const override { return false; }
+
+     size_t get_number_of_arguments() const override { return 2; }
+
+     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+         return std::make_shared<DataTypeUInt8>();
+     }
+
+     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                         size_t result, size_t input_rows_count) override {
+         // construct arrays from the input columns.
+         const Array left_array = from_input_column(block.get_by_position(arguments[0]).column);
+         const Array right_array = from_input_column(block.get_by_position(arguments[1]).column);
+
+         // construct a column to store the execution result.
+         auto result_column = ColumnUInt8::create(input_rows_count);
+         UInt8* result_data = result_column->get_data().data();
+
+         // check if the left array contains all of the right elements.
+         _execute_internal(left_array, right_array, result_data);
+
+         // store the result column in the specified `result` column of the block.
+         block.replace_by_position(result, std::move(result_column));
+
+         return Status::OK();
+     }
+
+ private:
+     // the internal array type.
+     struct Array {

Review Comment:
   `Array` is an aggregate type wrapping array data and the corresponding nullmap. It's used for the sake of code reuse and readability.





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1250347552


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,189 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "common/status.h"
+#include "vec/columns/column.h"
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_const.h"
+#include "vec/columns/column_nullable.h"
+#include "vec/common/assert_cast.h"
+#include "vec/core/block.h"
+#include "vec/core/column_numbers.h"
+#include "vec/core/types.h"
+#include "vec/data_types/data_type.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/function.h"
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+class FunctionArrayContainsAll : public IFunction {
+public:
+    static constexpr auto name {"array_contains_all"};
+
+    static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+    String get_name() const override { return name; }
+
+    bool is_variadic() const override { return false; }
+
+    bool use_default_implementation_for_nulls() const override { return false; }
+
+    bool use_default_implementation_for_constants() const override { return false; }
+
+    size_t get_number_of_arguments() const override { return 2; }
+
+    DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+        return std::make_shared<DataTypeUInt8>();
+    }
+
+    Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                        size_t result, size_t input_rows_count) override {
+        // perpare the result column.
+        auto result_col = ColumnUInt8::create(input_rows_count, 0);
+
+        // fetch the input columns.
+        const auto& left_input_col = block.get_by_position(arguments[0]).column;
+        const auto& right_input_col = block.get_by_position(arguments[1]).column;
+
+        // remove the constness of the input columns if necessary.
+        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
+        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+
+        // casts the columns in advance to avoid the repeated casting in the for loop.
+        // we won't access the cells until we're sure that they're not null,
+        // so it's safe to cast the columns in advance.
+        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
+        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+
+        for (size_t i = 0; i < input_rows_count; ++i) {
+            // FIXME(niebayes): the null checking seems already done in the frontend.
+            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
+                continue;
+            }
+
+            // each array is a cell in a column array.
+            // however, arrays in a column are flattened to reduce storage overhead.
+            // therefore, we need to use offsets to delimit among arrays and
+            // to locate the elements in an array.
+            const ColumnNullable* left_nested_col_nullable =
+                    check_and_get_column<ColumnNullable>(left_col_array->get_data());

Review Comment:
   Move this out of the loop.
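
   A minimal sketch of the idea, using hypothetical stand-in types (`Column`, `Int32Column`, `sum_values`) rather than the Doris column classes: the type check does not depend on the row index, so it can be done once before the loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for a polymorphic column hierarchy.
struct Column {
    virtual ~Column() = default;
};
struct Int32Column : Column {
    std::vector<int> values;
};

// Sums the first `rows` values of the column, or returns 0 if the column is
// not an Int32Column. The cast is loop-invariant, so it is performed once.
long long sum_values(const Column& col, std::size_t rows) {
    const auto* typed = dynamic_cast<const Int32Column*>(&col);
    if (typed == nullptr) {
        return 0;
    }
    assert(rows <= typed->values.size());
    long long total = 0;
    for (std::size_t row = 0; row < rows; ++row) {
        total += typed->values[row]; // no per-row cast needed here
    }
    return total;
}
```

   The same applies to the nested nullable columns above: they are properties of the whole column, not of a single row.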





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1246401594


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -0,0 +1,160 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+ // or more contributor license agreements.  See the NOTICE file
+ // distributed with this work for additional information
+ // regarding copyright ownership.  The ASF licenses this file
+ // to you under the Apache License, Version 2.0 (the
+ // "License"); you may not use this file except in compliance
+ // with the License.  You may obtain a copy of the License at
+ //
+ //   http://www.apache.org/licenses/LICENSE-2.0
+ //
+ // Unless required by applicable law or agreed to in writing,
+ // software distributed under the License is distributed on an
+ // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ // KIND, either express or implied.  See the License for the
+ // specific language governing permissions and limitations
+ // under the License.
+
+ #include "common/status.h"
+ #include "vec/columns/column.h"
+ #include "vec/columns/column_array.h"
+ #include "vec/columns/column_const.h"
+ #include "vec/columns/column_nullable.h"
+ #include "vec/common/assert_cast.h"
+ #include "vec/core/block.h"
+ #include "vec/core/column_numbers.h"
+ #include "vec/core/types.h"
+ #include "vec/data_types/data_type.h"
+ #include "vec/data_types/data_type_number.h"
+ #include "vec/functions/function.h"
+ #include "vec/functions/simple_function_factory.h"
+
+ namespace doris::vectorized {
+
+ class FunctionArrayContainsAll : public IFunction {
+ public:
+     static constexpr auto name {"array_contains_all"};
+
+     static FunctionPtr create() { return std::make_shared<FunctionArrayContainsAll>(); }
+
+     String get_name() const override { return name; }
+
+     bool use_default_implementation_for_nulls() const override { return false; }
+
+     bool is_variadic() const override { return false; }
+
+     size_t get_number_of_arguments() const override { return 2; }
+
+     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
+         return std::make_shared<DataTypeUInt8>();
+     }
+
+     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
+                         size_t result, size_t input_rows_count) override {
+         // construct arrays from the input columns.
+         const Array left_array = from_input_column(block.get_by_position(arguments[0]).column);
+         const Array right_array = from_input_column(block.get_by_position(arguments[1]).column);
+
+         // construct a column to store the execution result.
+         auto result_column = ColumnUInt8::create(input_rows_count);
+         UInt8* result_data = result_column->get_data().data();
+
+         // check if the left array contains all of the right elements.
+         _execute_internal(left_array, right_array, result_data);
+
+         // store the result column in the specified `result` column of the block.
+         block.replace_by_position(result, std::move(result_column));
+
+         return Status::OK();
+     }
+
+ private:
+     // the internal array type.
+     struct Array {
+         ColumnPtr data {nullptr};
+         const NullMap& null_map;
+         const size_t num_elements;
+
+         Array(ColumnPtr data_, const NullMap& null_map_, const size_t num_elements_)
+                 : data {data_}, null_map {null_map_}, num_elements {num_elements_} {}
+     };
+
+     // construct an `Array` instance from the input column.
+     static Array from_input_column(ColumnPtr column) {
+         const auto& [nullable_column, _] = unpack_if_const(column);
+
+         // applying NULL checking on the nullable columns is somewhat the canonical way.
+         // however, the NULL checking is already performed in the frontend.
+         // so we simply fetch the nested array column without any further checking.
+         const ColumnArray* array_column = assert_cast<const ColumnArray*>(nullable_column.get());
+
+         // fetch the data and the corresponding null map.
+         const auto& nested_nullable_column =
+                 assert_cast<const ColumnNullable&>(array_column->get_data());
+         ColumnPtr data = nested_nullable_column.get_nested_column_ptr();
+         const NullMap& null_map = nested_nullable_column.get_null_map_data();
+
+         // count the number of elements in the array.
+         const auto& offsets = array_column->get_offsets();
+         // FIXME(niebayes): the usage pattern of `offsets` is somewhat confusing.
+         // maybe we can find another more elegant way to count the number of elements.
+         const size_t num_elements = offsets[0] - offsets[-1];

Review Comment:
   We use `offsets[cur_row] - offsets[cur_row - 1]` to get the number of elements in the current row's array; you can see this in column_array.cpp.
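
   As a rough sketch of how that locates one row's elements, assuming cumulative end offsets: the row occupies the half-open range [offsets[cur_row - 1], offsets[cur_row]) of the flattened data. `RowRange` and `row_range` are hypothetical names, and the sketch handles row 0 explicitly because a plain `std::vector` has no valid index -1 (the original expression presumably relies on the offsets container being padded so that index -1 reads as 0).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Half-open range [begin, end) of one row inside the flattened data column.
struct RowRange {
    std::size_t begin;
    std::size_t end;
    std::size_t size() const { return end - begin; }
};

// Hypothetical helper: offsets[row] is assumed to be one past the last
// element of `row`, so the previous row's offset is this row's start.
RowRange row_range(const std::vector<std::uint64_t>& offsets, std::size_t row) {
    assert(row < offsets.size());
    const std::size_t begin = row == 0 ? 0 : static_cast<std::size_t>(offsets[row - 1]);
    const std::size_t end = static_cast<std::size_t>(offsets[row]);
    return RowRange{begin, end};
}
```

   With offsets {2, 2, 5}, row 1 maps to the empty range [2, 2) and row 2 to [2, 5); note that no `+ 1` is applied to the start offset.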





[GitHub] [doris] amorynan commented on a diff in pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on code in PR #21330:
URL: https://github.com/apache/doris/pull/21330#discussion_r1251403973


##########
be/src/vec/functions/array/function_array_contains_all.cpp:
##########
@@ -48,136 +49,116 @@ class FunctionArrayContainsAll : public IFunction {
     size_t get_number_of_arguments() const override { return 2; }
 
     DataTypePtr get_return_type_impl(const DataTypes& arguments) const override {
-        return std::make_shared<DataTypeUInt8>();
+        return make_nullable(std::make_shared<DataTypeUInt8>());
     }
 
+    // the semantics of this function is to check if the left array contains all of the right elements.
+    // it's important to note that the arrays are interpreted as sets, and hence the order of the elements
+    // and the number of occurrences of each element are not taken into account.
     Status execute_impl(FunctionContext* context, Block& block, const ColumnNumbers& arguments,
                         size_t result, size_t input_rows_count) override {
-        // perpare the result column.
         auto result_col = ColumnUInt8::create(input_rows_count, 0);
+        auto result_null_map = ColumnUInt8::create(input_rows_count, 0);
 
-        // fetch the input columns.
         const auto& left_input_col = block.get_by_position(arguments[0]).column;
         const auto& right_input_col = block.get_by_position(arguments[1]).column;
 
-        // remove the constness of the input columns if necessary.
-        const auto& [left_col_nullable, is_left_const] = unpack_if_const(left_input_col);
-        const auto& [right_col_nullable, is_right_const] = unpack_if_const(right_input_col);
+        // since the input maybe literal, we have to remove constness accordingly.
+        // since the input maybe null, we make it nullable to unify the processing.
+        const auto left_col = make_nullable(unpack_if_const(left_input_col).first);
+        const auto right_col = make_nullable(unpack_if_const(right_input_col).first);
 
-        // casts the columns in advance to avoid the repeated casting in the for loop.
-        // we won't access the cells until we're sure that they're not null,
-        // so it's safe to cast the columns in advance.
-        const ColumnArray* left_col_array = check_and_get_column<ColumnArray>(left_col_nullable);
-        const ColumnArray* right_col_array = check_and_get_column<ColumnArray>(right_col_nullable);
+        const ColumnNullable* left_col_nullable = check_and_get_column<ColumnNullable>(left_col);
+        const ColumnNullable* right_col_nullable = check_and_get_column<ColumnNullable>(right_col);
 
-        for (size_t i = 0; i < input_rows_count; ++i) {
-            // FIXME(niebayes): the null checking seems already done in the frontend.
-            if (left_col_nullable->is_null_at(i) || right_col_nullable->is_null_at(i)) {
-                continue;
-            }
+        const ColumnArray* left_col_array =
+                check_and_get_column<ColumnArray>(left_col_nullable->get_nested_column());
+        const ColumnArray* right_col_array =
+                check_and_get_column<ColumnArray>(right_col_nullable->get_nested_column());
 
-            // each array is a cell in a column array.
-            // however, arrays in a column are flattened to reduce storage overhead.
-            // therefore, we need to use offsets to delimit among arrays and
-            // to locate the elements in an array.
-            const ColumnNullable* left_nested_col_nullable =
-                    check_and_get_column<ColumnNullable>(left_col_array->get_data());
-            const ColumnNullable* right_nested_col_nullable =
-                    check_and_get_column<ColumnNullable>(right_col_array->get_data());
-
-            const ColumnArray::Offsets64& left_offsets = left_col_array->get_offsets();
-            const ColumnArray::Offsets64& right_offsets = right_col_array->get_offsets();
-
-            // construct arrays.
-            const Array left_array = make_array(left_nested_col_nullable, left_offsets, i);
-            const Array right_array = make_array(right_nested_col_nullable, right_offsets, i);
-
-            // check if the left array contains all of the right elements.
-            auto result_data = &result_col->get_data()[i];
-            _check_left_contains_all_right(left_array, right_array, result_data);
-        }
-
-        // store the result column in the specified `result` column of the block.
-        block.replace_by_position(result, std::move(result_col));
-
-        return Status::OK();
-    }
-
-private:
-    // the internal array type.
-    using Offset = ColumnArray::Offset64;
-    struct Array {
-        const ColumnPtr& data;     // data[i] is the i-th element of the array.
-        const NullMap& null_map;   // null_map[i] = true if data[i] is null.
-        const Offset start_offset; // the offset of the first element in the array.
-        const Offset end_offset;   // the offset of the last element in the array.
-
-        Array(const ColumnPtr& data_, const NullMap& null_map_, const Offset start_offset_,
-              const Offset end_offset_)
-                : data {data_},
-                  null_map {null_map_},
-                  start_offset {start_offset_},
-                  end_offset {end_offset_} {}
-    };
-
-    // construct an `Array` instance from a ColumnArray.
-    static Array make_array(const ColumnNullable* col_nullable,
-                            const ColumnArray::Offsets64& offsets, const size_t cur_row) {
-        const ColumnPtr& data = col_nullable->get_nested_column_ptr();
-        const NullMap& null_map = col_nullable->get_null_map_data();
-        const Offset start_offset = cur_row == 0 ? 0 : offsets[cur_row - 1] + 1;
-        const Offset end_offset = offsets[cur_row];
-
-        return Array(data, null_map, start_offset, end_offset);
-    }
-
-    // the semantics of this function is to check if the left array contains all of the right elements.
-    // it's important to note that the arrays are interpreted as sets, and hence the order of the elements
-    // and the number of occurrences of each element are not taken into account.
-    void _check_left_contains_all_right(const Array& left_array, const Array& right_array,
-                                        UInt8* result) {
-        static constexpr UInt8 CONTAINS_ALL = 1;
-        static constexpr UInt8 NOT_CONTAINS_ALL = 0;
+        // data columns are single-dimension columns which stores elements of all arrays.
+        const ColumnNullable* left_data_col_nullable =
+                check_and_get_column<ColumnNullable>(left_col_array->get_data());
+        const ColumnNullable* right_data_col_nullable =
+                check_and_get_column<ColumnNullable>(right_col_array->get_data());
 
-        // set the default result to NOT_CONTAINS_ALL.
-        *result = NOT_CONTAINS_ALL;
+        for (size_t row = 0; row < input_rows_count; ++row) {
+            if (left_col_nullable->is_null_at(row) || right_col_nullable->is_null_at(row)) {
+                result_null_map->get_data()[row] = 1;
+                continue;
+            }
 
-        const bool left_has_nulls = !left_array.null_map.empty();
-        const bool right_has_nulls = !right_array.null_map.empty();
+            const auto& left_offsets = _get_offsets_of_row(left_col_array->get_offsets(), row);

Review Comment:
   Move the `get_offsets()` calls out of the loop.
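
   A simplified sketch of what that could look like, under stated assumptions: `FlatArrayColumn` and `contains_all` are hypothetical stand-ins, nulls are ignored, both inputs have the same number of rows, and an empty right-hand array counts as contained. The offsets are fetched once before the row loop and the arrays are compared as sets, as the comment in the diff describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Simplified stand-in for an array column: flattened elements plus
// cumulative end offsets (one entry per row).
struct FlatArrayColumn {
    std::vector<int> data;
    std::vector<std::uint64_t> offsets;
};

// For each row, stores 1 if the left array contains every element of the
// right array (set semantics: order and duplicates are ignored), else 0.
std::vector<std::uint8_t> contains_all(const FlatArrayColumn& left,
                                       const FlatArrayColumn& right) {
    assert(left.offsets.size() == right.offsets.size());
    // Fetched once, outside the row loop.
    const auto& left_offsets = left.offsets;
    const auto& right_offsets = right.offsets;

    std::vector<std::uint8_t> result(left_offsets.size(), 0);
    std::size_t left_begin = 0;
    std::size_t right_begin = 0;
    for (std::size_t row = 0; row < left_offsets.size(); ++row) {
        const auto left_end = static_cast<std::size_t>(left_offsets[row]);
        const auto right_end = static_cast<std::size_t>(right_offsets[row]);

        // Collect the distinct elements of the left array for this row.
        std::unordered_set<int> seen;
        for (std::size_t i = left_begin; i < left_end; ++i) {
            seen.insert(left.data[i]);
        }
        bool all_found = true;
        for (std::size_t j = right_begin; j < right_end; ++j) {
            if (seen.count(right.data[j]) == 0) {
                all_found = false;
                break;
            }
        }
        result[row] = all_found ? 1 : 0;

        // The end of this row is the start of the next one.
        left_begin = left_end;
        right_begin = right_end;
    }
    return result;
}
```

   Carrying the previous end offset forward also avoids recomputing each row's start from the offsets inside the loop.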





[GitHub] [doris] github-actions[bot] commented on pull request #21330: support array_contains_all function

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1627392138

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] amorynan commented on pull request #21330: support array_contains_all function

Posted by "amorynan (via GitHub)" <gi...@apache.org>.
amorynan commented on PR #21330:
URL: https://github.com/apache/doris/pull/21330#issuecomment-1627570401

   run buildall

