Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/07/12 06:43:53 UTC

[GitHub] [doris] xy720 opened a new pull request, #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

xy720 opened a new pull request, #10781:
URL: https://github.com/apache/doris/pull/10781

   Usage example:
   
   `array_union/array_except/array_intersect`
   
   ```
   mysql> set enable_vectorized_engine=true;
   ```
   
   ```
   mysql> select k1,k2,k3,array_union(k2,k3) from array_type_table_two_array_nullable;
   +------+-----------------+--------------+-------------------------+
   | k1   | k2              | k3           | array_union(`k2`, `k3`) |
   +------+-----------------+--------------+-------------------------+
   |    1 | [1, 2, 3]       | [2, 4, 5]    | [1, 2, 3, 4, 5]         |
   |    2 | [1, NULL, 3]    | [1, 3, 5]    | [1, NULL, 3, 5]         |
   |    2 | [2, 3]          | [1, 5]       | [2, 3, 1, 5]            |
   |    4 | [NULL, NULL, 2] | [2, NULL, 4] | [NULL, 2, 4]            |
   |    5 | NULL            | [1, 2, 3]    | NULL                    |
   |    6 | [1, 1, 1]       | [2, 2, 2]    | [1, 2]                  |
   +------+-----------------+--------------+-------------------------+
   
   mysql> select k1,k2,k3,array_except(k2,k3) from array_type_table_two_array_nullable;
   +------+-----------------+--------------+--------------------------+
   | k1   | k2              | k3           | array_except(`k2`, `k3`) |
   +------+-----------------+--------------+--------------------------+
   |    1 | [1, 2, 3]       | [2, 4, 5]    | [1, 3]                   |
   |    2 | [1, NULL, 3]    | [1, 3, 5]    | [NULL]                   |
   |    2 | [2, 3]          | [1, 5]       | [2, 3]                   |
   |    4 | [NULL, NULL, 2] | [2, NULL, 4] | []                       |
   |    5 | NULL            | [1, 2, 3]    | NULL                     |
   |    6 | [1, 1, 1]       | [2, 2, 2]    | [1]                      |
   +------+-----------------+--------------+--------------------------+
   
   mysql> select k1,k2,k3,array_intersect(k2,k3) from array_type_table_two_array_nullable;
   +------+-----------------+--------------+-----------------------------+
   | k1   | k2              | k3           | array_intersect(`k2`, `k3`) |
   +------+-----------------+--------------+-----------------------------+
   |    1 | [1, 2, 3]       | [2, 4, 5]    | [2]                         |
   |    2 | [1, NULL, 3]    | [1, 3, 5]    | [1, 3]                      |
   |    2 | [2, 3]          | [1, 5]       | []                          |
   |    4 | [NULL, NULL, 2] | [2, NULL, 4] | [NULL, 2]                   |
   |    5 | NULL            | [1, 2, 3]    | NULL                        |
   |    6 | [1, 1, 1]       | [2, 2, 2]    | []                          |
   +------+-----------------+--------------+-----------------------------+
   ```
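The semantics visible in the three result tables can be sketched outside Doris. The following is a minimal illustration, not the PR's implementation: `Elem`, `Array`, and `contains` are hypothetical names, `std::nullopt` stands in for a NULL element, and a linear scan replaces the hash sets used in the backend code. It assumes what the tables suggest: results are de-duplicated, NULL counts as one distinct element, and output order follows first appearance.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical model of an array element; std::nullopt stands in for SQL NULL.
using Elem = std::optional<int>;
using Array = std::vector<Elem>;

// True if `a` already holds `e`. Two NULLs compare equal in this model.
inline bool contains(const Array& a, const Elem& e) {
    return std::find(a.begin(), a.end(), e) != a.end();
}

// Distinct elements of `left`, then distinct new elements of `right`,
// in first-seen order.
inline Array array_union(const Array& left, const Array& right) {
    Array out;
    for (const Array* side : {&left, &right}) {
        for (const Elem& e : *side) {
            if (!contains(out, e)) out.push_back(e);
        }
    }
    return out;
}

// Distinct elements of `left` that do not occur in `right`.
inline Array array_except(const Array& left, const Array& right) {
    Array out;
    for (const Elem& e : left) {
        if (!contains(right, e) && !contains(out, e)) out.push_back(e);
    }
    return out;
}

// Distinct elements of `left` that also occur in `right`.
inline Array array_intersect(const Array& left, const Array& right) {
    Array out;
    for (const Elem& e : left) {
        if (contains(right, e) && !contains(out, e)) out.push_back(e);
    }
    return out;
}
```

A whole-array NULL argument (the `k1 = 5` row, where the result is NULL) is not modelled here; the sketch covers element-level behaviour only.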
   
   related to #10052 
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Have unit tests been added: (Yes/No/No Need)
   3. Has documentation been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] cambyzju commented on pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on PR #10781:
URL: https://github.com/apache/doris/pull/10781#issuecomment-1190156061

   LGTM




[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r918707587


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,

Review Comment:
   `apply_left` and `apply_right` look the same.





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922139011


##########
be/src/vec/functions/array/function_array_union.cpp:
##########
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayUnion {
+    static constexpr auto name = "array_union";
+};
+
+template <typename Set, typename Element>
+struct UnionAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = true;

Review Comment:
   Consider renaming `apply_left_first` to `execute_left_column_first` or `process_left_column_first`; either would be clearer.





[GitHub] [doris] github-actions[bot] commented on pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #10781:
URL: https://github.com/apache/doris/pull/10781#issuecomment-1191274126

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922114090


##########
be/src/vec/functions/array/function_array_except.cpp:
##########
@@ -0,0 +1,79 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayExcept {
+    static constexpr auto name = "array_except";
+};
+
+template <typename Set, typename Element>
+struct ExceptAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = false;
+
+    // Handle Null element.
+    // Return ture means this null element should put into result column.
+    bool apply_null(bool left_or_right) {
+        if (left_or_right) {

Review Comment:
   Is `left_or_right` always equal to `apply_left_first`? If so, we do not need to pass the `left_or_right` param.





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922140889


##########
be/src/vec/functions/array/function_array_union.cpp:
##########
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayUnion {
+    static constexpr auto name = "array_union";
+};
+
+template <typename Set, typename Element>
+struct UnionAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = true;
+
+    // Handle Null element.
+    // Return ture means this null element should put into result column.
+    bool apply_null(bool left_or_right) {

Review Comment:
   `left_or_right` is also not a good name: when `left_or_right` is true, the reader cannot tell whether that means left or right.
   
   Maybe `is_left`?
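To illustrate the naming point, here is a minimal sketch of an except-style action whose flag is called `is_left`. It is an assumption-laden simplification, not the PR's code: `SimpleExceptAction`, its member names, and the use of `std::unordered_set<int>` are hypothetical, and the logic is only a guess at what an "except" action needs (the right array is fed first, then the left array is filtered against it).

```cpp
#include <cassert>
#include <unordered_set>

// Hypothetical, simplified except-style action over int elements.
// The PR's real Action types are templates over a hash set and an element type.
struct SimpleExceptAction {
    bool right_has_null = false;   // a NULL was seen in the right array
    bool result_has_null = false;  // a NULL was already written to the result

    // Returns true if this NULL element should be written to the result.
    // `is_left` says which array the element came from.
    bool apply_null(bool is_left) {
        if (!is_left) {
            right_has_null = true;
            return false;
        }
        if (right_has_null || result_has_null) return false;
        result_has_null = true;
        return true;
    }

    // Returns true if `elem` should be written to the result.
    bool apply(std::unordered_set<int>& right_set, std::unordered_set<int>& result_set,
               int elem, bool is_left) {
        if (!is_left) {
            right_set.insert(elem);  // remember right-side elements, emit nothing
            return false;
        }
        if (right_set.count(elem) != 0) return false;  // excluded by the right array
        return result_set.insert(elem).second;         // emit each survivor once
    }
};
```

With this name, a call such as `action.apply_null(/*is_left=*/false)` reads unambiguously as "a NULL from the right array".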





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922137078


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,223 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename Set, typename Element>
+struct UnionAction;
+
+template <typename Set, typename Element>
+struct ExceptAction;
+
+template <typename Set, typename Element>
+struct IntersectAction;
+
+template <typename Set, typename Element, SetOperation operation>
+struct ActionImpl;
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::UNION> {
+    using Action = UnionAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::EXCEPT> {
+    using Action = ExceptAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::INTERSECT> {
+    using Action = IntersectAction<Set, Element>;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Element = typename ColumnType::value_type;
+    using ElementNativeType = typename NativeType<Element>::Type;
+    using Set = HashSetWithStackMemory<ElementNativeType, DefaultHash<ElementNativeType>, 4>;
+    using Action = typename ActionImpl<Set, Element, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count,  bool left_or_right) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_data.push_back(Element());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply(set, result_set, src_data[i], left_or_right)) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Set = HashSetWithStackMemory<StringRef, DefaultHash<StringRef>, 4>;
+    using Action = typename ActionImpl<Set, StringRef, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count, bool left_or_right) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply(set, result_set, src_column.get_data_at(i), left_or_right)) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_internal<ColumnString>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDate>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_internal<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static bool _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        if (!check_column<ColumnType>(*left_data.nested_col)) {
+            return false;
+        }
+        constexpr auto apply_left_first = Impl::Action::apply_left_first;
+        size_t current = 0;
+        Impl impl;
+        for (size_t row = 0; row < left_data.offsets_ptr->size(); ++row) {
+            size_t count = 0;
+            size_t left_off = (*left_data.offsets_ptr)[row - 1];
+            size_t left_len = (*left_data.offsets_ptr)[row] - left_off;
+            size_t right_off = (*right_data.offsets_ptr)[row - 1];
+            size_t right_len = (*right_data.offsets_ptr)[row] - right_off;
+            if (apply_left_first) {

Review Comment:
   ```suggestion
               if constexpr (apply_left_first) {
   ```
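The suggestion relies on C++17 `if constexpr`: because `apply_left_first` is a `static constexpr` member, the condition is known at compile time, and each template instantiation keeps only the branch it takes. A minimal sketch of the idea follows; `LeftFirstAction`, `RightFirstAction`, and `process_order` are hypothetical stand-ins, not names from the PR.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the PR's Action types; only the flag matters here.
struct LeftFirstAction {
    static constexpr bool apply_left_first = true;
};
struct RightFirstAction {
    static constexpr bool apply_left_first = false;
};

// Reports the order in which the two input arrays would be processed.
// The condition is a compile-time constant, so the branch that is not taken
// is discarded when the template is instantiated.
template <typename Action>
std::string process_order() {
    if constexpr (Action::apply_left_first) {
        return "left,right";
    } else {
        return "right,left";
    }
}
```

A plain `if` on the same constant would typically be optimized to the same machine code, so the gain is mostly that the intent is explicit and the discarded branch is never instantiated.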





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922110741


##########
be/src/vec/functions/array/function_array_except.cpp:
##########
@@ -0,0 +1,79 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayExcept {
+    static constexpr auto name = "array_except";
+};
+
+template <typename Set, typename Element>
+struct ExceptAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = false;
+
+    // Handle Null element.
+    // Return ture means this null element should put into result column.

Review Comment:
   ```suggestion
       // Return true means this null element should put into result column.
   ```





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922142316


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,223 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename Set, typename Element>
+struct UnionAction;
+
+template <typename Set, typename Element>
+struct ExceptAction;
+
+template <typename Set, typename Element>
+struct IntersectAction;
+
+template <typename Set, typename Element, SetOperation operation>
+struct ActionImpl;
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::UNION> {
+    using Action = UnionAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::EXCEPT> {
+    using Action = ExceptAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::INTERSECT> {
+    using Action = IntersectAction<Set, Element>;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Element = typename ColumnType::value_type;
+    using ElementNativeType = typename NativeType<Element>::Type;
+    using Set = HashSetWithStackMemory<ElementNativeType, DefaultHash<ElementNativeType>, 4>;
+    using Action = typename ActionImpl<Set, Element, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count, bool left_or_right) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_data.push_back(Element());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else {
+                if (action.apply(set, result_set, src_data[i], left_or_right)) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Set = HashSetWithStackMemory<StringRef, DefaultHash<StringRef>, 4>;
+    using Action = typename ActionImpl<Set, StringRef, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count, bool left_or_right) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else {
+                if (action.apply(set, result_set, src_column.get_data_at(i), left_or_right)) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_internal<ColumnString>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDate>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_internal<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static bool _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        if (!check_column<ColumnType>(*left_data.nested_col)) {
+            return false;
+        }
+        constexpr auto apply_left_first = Impl::Action::apply_left_first;
+        size_t current = 0;
+        Impl impl;
+        for (size_t row = 0; row < left_data.offsets_ptr->size(); ++row) {
+            size_t count = 0;
+            size_t left_off = (*left_data.offsets_ptr)[row - 1];
+            size_t left_len = (*left_data.offsets_ptr)[row] - left_off;
+            size_t right_off = (*right_data.offsets_ptr)[row - 1];
+            size_t right_len = (*right_data.offsets_ptr)[row] - right_off;
+            if (apply_left_first) {
+                impl.apply(left_data, left_off, left_len, dst, &count, true);

Review Comment:
   ```suggestion
                   impl.apply<true>(left_data, left_off, left_len, dst, &count);
   ```
   
   Maybe it is possible to pass the left_or_right info through a template argument.
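
The suggestion above turns the run-time `left_or_right` flag into a compile-time template argument. Below is a minimal, self-contained sketch of that idea, not the PR's code: `ExceptSketch`, `apply_side`, and `array_except_sketch` are hypothetical names, plain `std::vector<int>` stands in for the Doris column types, NULL handling is left out, and duplicate removal is an assumption of the sketch. It assumes C++17 for `if constexpr`.

```cpp
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for an "except" action. The side being processed is a
// template argument, so the branch is resolved at compile time.
struct ExceptSketch {
    std::unordered_set<int> excluded;  // values seen in the right array
    std::unordered_set<int> emitted;   // values already written to the result

    // Returns true if `value` should be appended to the result.
    template <bool is_left>
    bool apply(int value) {
        if constexpr (is_left) {
            // Left elements survive if the right array did not contain them
            // and they were not emitted before.
            return excluded.count(value) == 0 && emitted.insert(value).second;
        } else {
            // Right elements are only recorded, never emitted.
            excluded.insert(value);
            return false;
        }
    }
};

// Feeds one array through the action; `is_left` selects the side.
template <bool is_left>
void apply_side(ExceptSketch& action, const std::vector<int>& src, std::vector<int>& dst) {
    for (int v : src) {
        if (action.apply<is_left>(v)) {
            dst.push_back(v);
        }
    }
}

// The right array is applied first so that `excluded` is complete before the
// left array is scanned (the sketch's analogue of apply_left_first == false).
std::vector<int> array_except_sketch(const std::vector<int>& left,
                                     const std::vector<int>& right) {
    ExceptSketch action;
    std::vector<int> dst;
    apply_side<false>(action, right, dst);
    apply_side<true>(action, left, dst);
    return dst;
}
```

With these definitions, `array_except_sketch({1, 2, 3}, {2, 4, 5})` yields `[1, 3]`, matching the first row of the `array_except` example in the PR description. One caveat for the real code: inside `_execute_internal`, `Impl` depends on template parameters, so a call of this form would presumably need the `template` disambiguator, as in `impl.template apply<true>(...)`.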





[GitHub] [doris] xy720 merged pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
xy720 merged PR #10781:
URL: https://github.com/apache/doris/pull/10781




[GitHub] [doris] github-actions[bot] commented on pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #10781:
URL: https://github.com/apache/doris/pull/10781#issuecomment-1191274182

   PR approved by anyone and no changes requested.




[GitHub] [doris] xy720 commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922807243


##########
be/src/vec/functions/array/function_array_union.cpp:
##########
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayUnion {
+    static constexpr auto name = "array_union";
+};
+
+template <typename Set, typename Element>
+struct UnionAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = true;
+
+    // Handle Null element.
+    // Returning true means this null element should be put into the result column.
+    bool apply_null(bool left_or_right) {

Review Comment:
   Great. I will use is_left.
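
For readers following the `apply_null` discussion, here is a small, self-contained sketch of how a union action with an `is_left` parameter might deduplicate values and NULLs. It is an illustration under stated assumptions, not the PR's implementation: `UnionSketch`, `Elem`, and `array_union_sketch` are hypothetical names, `std::optional<int>` models a nullable element, and `std::unordered_set` replaces the Doris hash set.

```cpp
#include <optional>
#include <unordered_set>
#include <vector>

// std::nullopt models a NULL array element (an assumption of this sketch).
using Elem = std::optional<int>;

struct UnionSketch {
    std::unordered_set<int> seen;  // non-NULL values already in the result
    bool null_seen = false;        // true once a NULL has been emitted

    // Returns true if this NULL should be put into the result.
    // For a union the side does not matter, so is_left is unused.
    bool apply_null(bool /*is_left*/) {
        if (null_seen) {
            return false;
        }
        null_seen = true;
        return true;
    }

    // Returns true if this value should be put into the result.
    bool apply(int value, bool /*is_left*/) {
        return seen.insert(value).second;
    }
};

std::vector<Elem> array_union_sketch(const std::vector<Elem>& left,
                                     const std::vector<Elem>& right) {
    UnionSketch action;
    std::vector<Elem> dst;
    auto feed = [&](const std::vector<Elem>& src, bool is_left) {
        for (const Elem& e : src) {
            const bool keep = e.has_value() ? action.apply(*e, is_left)
                                            : action.apply_null(is_left);
            if (keep) {
                dst.push_back(e);
            }
        }
    };
    feed(left, true);    // left array first, as with apply_left_first == true
    feed(right, false);
    return dst;
}
```

On the rows shown in the PR description, this sketch produces the same results as `array_union`: for example, `[1, NULL, 3]` and `[1, 3, 5]` give `[1, NULL, 3, 5]`, and `[NULL, NULL, 2]` and `[2, NULL, 4]` give `[NULL, 2, 4]`.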





[GitHub] [doris] carlvinhust2012 commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
carlvinhust2012 commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r918668436


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Action = typename ActionImpl<ColumnString, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_expand<ColumnString>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDate>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_expand<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static Status _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        static constexpr auto apply_left_first = Impl::Action::apply_left_first;
+        size_t current = 0;
+        for (size_t row = 0; row < left_data.offsets_ptr->size(); ++row) {
+            size_t count = 0;
+            size_t left_off = (*left_data.offsets_ptr)[row - 1];
+            size_t left_len = (*left_data.offsets_ptr)[row] - left_off;
+            size_t right_off = (*right_data.offsets_ptr)[row - 1];
+            size_t right_len = (*right_data.offsets_ptr)[row] - right_off;
+            Impl impl;

Review Comment:
   Why not move this `Impl impl;` out of the loop?
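
The question above concerns constructing per-row state once and reusing it. The later revision of `function_array_set.h` quoted earlier in this thread declares `Impl impl;` before the row loop and gives `OpenSetImpl` a `reset()` method. The following sketch shows that pattern on a simplified layout; `FlatArrayColumn` and `distinct_per_row` are hypothetical names, and `std::unordered_set` stands in for the Doris hash set.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Simplified array column: row r spans data[offsets[r-1], offsets[r]),
// with 0 as the implicit offset before the first row.
struct FlatArrayColumn {
    std::vector<int> data;
    std::vector<std::size_t> offsets;
};

// Removes duplicates inside each row while keeping first-seen order.
FlatArrayColumn distinct_per_row(const FlatArrayColumn& src) {
    FlatArrayColumn dst;
    std::unordered_set<int> seen;  // constructed once, outside the row loop
    std::size_t prev = 0;
    for (std::size_t row = 0; row < src.offsets.size(); ++row) {
        seen.clear();  // reset per-row state instead of rebuilding the set
        for (std::size_t i = prev; i < src.offsets[row]; ++i) {
            if (seen.insert(src.data[i]).second) {
                dst.data.push_back(src.data[i]);
            }
        }
        prev = src.offsets[row];
        dst.offsets.push_back(dst.data.size());
    }
    return dst;
}
```

Hoisting the object only stays correct if the state is cleared for every row; forgetting the reset would leak elements of one row into the next.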





[GitHub] [doris] xy720 commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922037237


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Action = typename ActionImpl<ColumnString, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_expand<ColumnString>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDate>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_expand<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static Status _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        static constexpr auto apply_left_first = Impl::Action::apply_left_first;
+        size_t current = 0;
+        for (size_t row = 0; row < left_data.offsets_ptr->size(); ++row) {
+            size_t count = 0;
+            size_t left_off = (*left_data.offsets_ptr)[row - 1];
+            size_t left_len = (*left_data.offsets_ptr)[row] - left_off;
+            size_t right_off = (*right_data.offsets_ptr)[row - 1];
+            size_t right_len = (*right_data.offsets_ptr)[row] - right_off;
+            Impl impl;

Review Comment:
   moved.



##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,

Review Comment:
   I have simplified it.



##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Action = typename ActionImpl<ColumnString, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_expand<ColumnString>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDate>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_expand<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static Status _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        static constexpr auto apply_left_first = Impl::Action::apply_left_first;

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r918722483


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,258 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename ColumnType>
+struct UnionAction;
+
+template <typename ColumnType>
+struct ExceptAction;
+
+template <typename ColumnType>
+struct IntersectAction;
+
+template <typename ColumnType, SetOperation operation>
+struct ActionImpl;
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::UNION> {
+    using Action = UnionAction<ColumnType>;
+    static constexpr auto apply_left_first = true;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::EXCEPT> {
+    using Action = ExceptAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <typename ColumnType>
+struct ActionImpl<ColumnType, SetOperation::INTERSECT> {
+    using Action = IntersectAction<ColumnType>;
+    static constexpr auto apply_left_first = false;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Action = typename ActionImpl<ColumnType, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_data.push_back(typename ColumnType::value_type());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(&src_data[i])) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Action = typename ActionImpl<ColumnString, operation>::Action;
+    Action action;
+    void apply_left(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_left()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_left(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void apply_right(const ColumnArrayExecutionData& src, size_t off, size_t len,
+                    ColumnArrayMutableData& dst, size_t* count) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null_right()) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else  {
+                if (action.apply_right(src_column.get_data_at(i))) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_expand<ColumnString>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDate>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_expand<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_expand<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_expand<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static Status _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        static constexpr auto apply_left_first = Impl::Action::apply_left_first;

Review Comment:
   ```suggestion
          constexpr auto apply_left_first = Impl::Action::apply_left_first;
   ```
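
One plausible reading of this suggestion is that a function-local `constexpr` variable is already a compile-time constant, so `static` adds nothing here. The sketch below is only an illustration: `LeftFirstTraits`, `RightFirstTraits`, and `first_side` are hypothetical names that do not appear in the PR.

```cpp
#include <cassert>

// Hypothetical trait types standing in for the per-operation action structs.
struct LeftFirstTraits {
    static constexpr bool apply_left_first = true;
};
struct RightFirstTraits {
    static constexpr bool apply_left_first = false;
};

// Returns 0 when the left array would be processed first, 1 otherwise.
template <typename Traits>
int first_side() {
    // A plain constexpr local is usable in `if constexpr`;
    // it does not need static storage duration.
    constexpr auto apply_left_first = Traits::apply_left_first;
    if constexpr (apply_left_first) {
        return 0;
    } else {
        return 1;
    }
}
```

Calling `first_side<LeftFirstTraits>()` selects the left-first branch at compile time, and `first_side<RightFirstTraits>()` selects the other one.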





[GitHub] [doris] cambyzju commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
cambyzju commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922113206


##########
be/src/vec/functions/array/function_array_union.cpp:
##########
@@ -0,0 +1,67 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayUnion {
+    static constexpr auto name = "array_union";
+};
+
+template <typename Set, typename Element>
+struct UnionAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = true;
+
+    // Handle Null element.
+    // Return ture means this null element should put into result column.

Review Comment:
   ```suggestion
       // Return true means this null element should put into result column.
   ```





[GitHub] [doris] xy720 commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922806796


##########
be/src/vec/functions/array/function_array_except.cpp:
##########
@@ -0,0 +1,79 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "vec/functions/array/function_array_binary.h"
+#include "vec/functions/array/function_array_set.h"
+
+#include "vec/functions/simple_function_factory.h"
+
+namespace doris::vectorized {
+
+struct NameArrayExcept {
+    static constexpr auto name = "array_except";
+};
+
+template <typename Set, typename Element>
+struct ExceptAction {
+    // True if set has null element
+    bool null_flag = false;
+    // True if result_set has null element
+    bool result_null_flag = false;
+    // True if it should apply the left array first.
+    static constexpr auto apply_left_first = false;
+
+    // Handle Null element.
+    // Return true means this null element should put into result column.
+    bool apply_null(bool left_or_right) {
+        if (left_or_right) {

Review Comment:
   No.
   If `apply_left_first` is true, the left array is applied first, with `left_or_right` = true,
   and the right array is applied second, with `left_or_right` = false.
   So the value of `left_or_right` is true at first and then becomes false.
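
The ordering described in this comment can be illustrated with a minimal standalone sketch. It is not the PR's code: `SketchExceptAction`, `run_set_op`, and the use of `std::optional<int>` to model a NULL element are all hypothetical, and the NULL handling is inferred from the `array_except` usage example in the PR description rather than from the actual implementation.

```cpp
#include <cassert>
#include <optional>
#include <unordered_set>
#include <vector>

using Elem = std::optional<int>;  // std::nullopt models a NULL element (assumption)
using Array = std::vector<Elem>;

// Hypothetical, simplified stand-in for an "except" action.
struct SketchExceptAction {
    // The right array is scanned first so that its elements are known
    // before the left array is filtered.
    static constexpr bool apply_left_first = false;

    bool right_has_null = false;   // a NULL was seen in the right array
    bool result_has_null = false;  // a NULL was already emitted
    std::unordered_set<int> right_set;
    std::unordered_set<int> result_set;

    // Returns true if a NULL element should be appended to the result.
    bool apply_null(bool left_or_right) {
        if (!left_or_right) {  // right array: only record the NULL
            right_has_null = true;
            return false;
        }
        if (right_has_null || result_has_null) return false;
        result_has_null = true;
        return true;
    }

    // Returns true if a non-NULL value should be appended to the result.
    bool apply(int value, bool left_or_right) {
        if (!left_or_right) {  // right array: only record the value
            right_set.insert(value);
            return false;
        }
        if (right_set.count(value) != 0) return false;
        return result_set.insert(value).second;  // also drops duplicates
    }
};

// Hypothetical driver: left_or_right is true while scanning the left array
// and false while scanning the right one; apply_left_first picks the order.
template <typename Action>
Array run_set_op(const Array& left, const Array& right) {
    Action action;
    Array out;
    auto scan = [&](const Array& src, bool left_or_right) {
        for (const Elem& e : src) {
            const bool keep = e.has_value() ? action.apply(*e, left_or_right)
                                            : action.apply_null(left_or_right);
            if (keep) out.push_back(e);
        }
    };
    if constexpr (Action::apply_left_first) {
        scan(left, true);
        scan(right, false);
    } else {
        scan(right, false);
        scan(left, true);
    }
    return out;
}
```

With this sketch, `run_set_op<SketchExceptAction>({1, std::nullopt, 3}, {1, 3, 5})` yields a single NULL element, which agrees with the `array_except` row for `[1, NULL, 3]` and `[1, 3, 5]` in the usage example. An action with `apply_left_first = true` would see `left_or_right` as true first and false second, as the comment says.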



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org




[GitHub] [doris] xy720 commented on a diff in pull request #10781: [feature-wip](array-type) add function array_union/array_except/array_intersect

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #10781:
URL: https://github.com/apache/doris/pull/10781#discussion_r922805951


##########
be/src/vec/functions/array/function_array_set.h:
##########
@@ -0,0 +1,223 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+#include "vec/columns/column_array.h"
+#include "vec/columns/column_string.h"
+#include "vec/common/hash_table/hash_set.h"
+#include "vec/data_types/data_type_array.h"
+#include "vec/data_types/data_type_number.h"
+#include "vec/functions/array/function_array_utils.h"
+#include "vec/functions/function.h"
+#include "vec/functions/function_helpers.h"
+
+namespace doris::vectorized {
+
+enum class SetOperation { UNION, EXCEPT, INTERSECT };
+
+template <typename Set, typename Element>
+struct UnionAction;
+
+template <typename Set, typename Element>
+struct ExceptAction;
+
+template <typename Set, typename Element>
+struct IntersectAction;
+
+template <typename Set, typename Element, SetOperation operation>
+struct ActionImpl;
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::UNION> {
+    using Action = UnionAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::EXCEPT> {
+    using Action = ExceptAction<Set, Element>;
+};
+
+template <typename Set, typename Element>
+struct ActionImpl<Set, Element, SetOperation::INTERSECT> {
+    using Action = IntersectAction<Set, Element>;
+};
+
+template <SetOperation operation, typename ColumnType>
+struct OpenSetImpl {
+    using Element = typename ColumnType::value_type;
+    using ElementNativeType = typename NativeType<Element>::Type;
+    using Set = HashSetWithStackMemory<ElementNativeType, DefaultHash<ElementNativeType>, 4>;
+    using Action = typename ActionImpl<Set, Element, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count, bool left_or_right) {
+        const auto& src_data = assert_cast<const ColumnType&>(*src.nested_col).get_data();
+        auto& dst_data = assert_cast<ColumnType&>(*dst.nested_col).get_data();
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_data.push_back(Element());
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else {
+                if (action.apply(set, result_set, src_data[i], left_or_right)) {
+                    dst_data.push_back(src_data[i]);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct OpenSetImpl<operation, ColumnString> {
+    using Set = HashSetWithStackMemory<StringRef, DefaultHash<StringRef>, 4>;
+    using Action = typename ActionImpl<Set, StringRef, operation>::Action;
+    Action action;
+    Set set;
+    Set result_set;
+
+    void apply(const ColumnArrayExecutionData& src, size_t off, size_t len,
+               ColumnArrayMutableData& dst, size_t* count, bool left_or_right) {
+        const auto& src_column = assert_cast<const ColumnString&>(*src.nested_col);
+        auto& dst_column = assert_cast<ColumnString&>(*dst.nested_col);
+        for (size_t i = off; i < off + len; ++i) {
+            if (src.nested_nullmap_data && src.nested_nullmap_data[i]) {
+                if (action.apply_null(left_or_right)) {
+                    dst_column.insert_default();
+                    dst.nested_nullmap_data->push_back(1);
+                    ++(*count);
+                }
+            } else {
+                if (action.apply(set, result_set, src_column.get_data_at(i), left_or_right)) {
+                    dst_column.insert_from(src_column, i);
+                    if (dst.nested_nullmap_data) {
+                        dst.nested_nullmap_data->push_back(0);
+                    }
+                    ++(*count);
+                }
+            }
+        }
+    }
+
+    void reset() {
+        set.clear();
+        result_set.clear();
+    }
+};
+
+template <SetOperation operation>
+struct ArraySetImpl {
+public:
+    static DataTypePtr get_return_type(const DataTypes& arguments) {
+        const DataTypeArray* array_left =
+                check_and_get_data_type<DataTypeArray>(arguments[0].get());
+        // if any nested type of array arguments is nullable then return array with
+        // nullable nested type.
+        if (array_left->get_nested_type()->is_nullable()) {
+            return arguments[0];
+        }
+        return arguments[1];
+    }
+
+    static Status execute(ColumnPtr& res_ptr, const ColumnArrayExecutionData& left_data,
+                          const ColumnArrayExecutionData& right_data) {
+        MutableColumnPtr array_nested_column = nullptr;
+        ColumnArrayMutableData dst;
+        if (left_data.nested_nullmap_data || right_data.nested_nullmap_data) {
+            array_nested_column =
+                    ColumnNullable::create(left_data.nested_col->clone_empty(), ColumnUInt8::create());
+            dst.nullable_col = reinterpret_cast<ColumnNullable*>(array_nested_column.get());
+            dst.nested_nullmap_data = &dst.nullable_col->get_null_map_data();
+            dst.nested_col = dst.nullable_col->get_nested_column_ptr().get();
+        } else {
+            array_nested_column = left_data.nested_col->clone_empty();
+            dst.nested_col = array_nested_column.get();
+        }
+        auto dst_offsets_column = ColumnArray::ColumnOffsets::create();
+        dst.offsets_col = dst_offsets_column.get();
+        dst.offsets_ptr = &dst_offsets_column->get_data();
+
+        ColumnPtr res_column;
+        if (_execute_internal<ColumnString>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDate>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDateTime>(dst, left_data, right_data) ||
+            _execute_internal<ColumnUInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt8>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt16>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnInt128>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat32>(dst, left_data, right_data) ||
+            _execute_internal<ColumnFloat64>(dst, left_data, right_data) ||
+            _execute_internal<ColumnDecimal128>(dst, left_data, right_data)) {
+            res_column = assemble_column_array(dst);
+            if (res_column) {
+                res_ptr = std::move(res_column);
+                return Status::OK();
+            }
+        }
+        return Status::RuntimeError("Unexpected columns: {}, {}", left_data.nested_col->get_name(),
+                                    right_data.nested_col->get_name());
+    }
+
+private:
+    template <typename ColumnType>
+    static bool _execute_internal(ColumnArrayMutableData& dst, const ColumnArrayExecutionData& left_data,
+                                    const ColumnArrayExecutionData& right_data) {
+        using Impl = OpenSetImpl<operation, ColumnType>;
+        if (!check_column<ColumnType>(*left_data.nested_col)) {
+            return false;
+        }
+        constexpr auto apply_left_first = Impl::Action::apply_left_first;
+        size_t current = 0;
+        Impl impl;
+        for (size_t row = 0; row < left_data.offsets_ptr->size(); ++row) {
+            size_t count = 0;
+            size_t left_off = (*left_data.offsets_ptr)[row - 1];
+            size_t left_len = (*left_data.offsets_ptr)[row] - left_off;
+            size_t right_off = (*right_data.offsets_ptr)[row - 1];
+            size_t right_len = (*right_data.offsets_ptr)[row] - right_off;
+            if (apply_left_first) {
+                impl.apply(left_data, left_off, left_len, dst, &count, true);

Review Comment:
   Good suggestion. I will use this.


