Posted to commits@doris.apache.org by kx...@apache.org on 2023/06/26 12:35:53 UTC

[doris] branch branch-2.0 updated (fe51e6ffdb -> 932a38b4ec)

This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a change to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git


    from fe51e6ffdb [fix](nereids)scan node's smap should use materialized slots and project list as left and right expr list (#21142)
     new 3179a5519b [chore](block) temporarily disable DCHECK for column name equality in MutableBlock (#21116)
     new 507b98a0cb [fix](inverted index) fix build inverted index failed but not return immediately (#21165)
     new 266750f7ac [fix](nereids)change PushdownFilterThroughProject post processor from bottom up to top down rewrite (#21125)
     new 4bf8d7af2b [Bug](RuntimeFilter) Fix bf error change the murmurhash to crc32 in regression test p2 (#21167)
     new 5e8aa0aaab [bug](jdbc catalog) fix getPrimaryKeys fun bug (#21137)
     new cebc765235 [Improve](dynamic schema) support filtering invalid data (#21160)
     new 44280038bd [fix](inverted index)fix transaction id not unique for one index change job when light index change (#21180)
     new 8f79e264f1 [fix](nereids) set proper sort info to scan node to enable TopN-opt (#21148)
     new 932a38b4ec [Fix](inverted index) check inverted index file existence before data compaction (#21173)

The 9 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 be/src/agent/be_exec_version_manager.h             |   1 +
 be/src/common/config.cpp                           |   2 +-
 be/src/olap/bloom_filter_predicate.h               |  44 +++-
 be/src/olap/compaction.cpp                         |  42 ++-
 .../segment_v2/inverted_index_compaction.cpp       |  23 +-
 .../rowset/segment_v2/inverted_index_compaction.h  |  12 +-
 be/src/olap/task/index_builder.cpp                 |   9 +-
 be/src/vec/columns/column_object.cpp               |  15 +-
 be/src/vec/columns/column_object.h                 |   3 +
 be/src/vec/columns/subcolumn_tree.h                |   2 +-
 be/src/vec/common/schema_util.cpp                  |  16 +-
 be/src/vec/core/block.cpp                          |   8 +-
 be/src/vec/exec/format/json/new_json_reader.cpp    |  24 +-
 be/src/vec/exec/format/json/new_json_reader.h      |   1 +
 .../org/apache/doris/alter/IndexChangeJob.java     |   4 +-
 .../java/org/apache/doris/analysis/SortInfo.java   |   7 +
 .../org/apache/doris/catalog/TableProperty.java    |   7 +
 .../java/org/apache/doris/common/FeNameFormat.java |   2 +-
 .../org/apache/doris/external/jdbc/JdbcClient.java |  20 +-
 .../doris/external/jdbc/JdbcMySQLClient.java       |  28 +-
 .../glue/translator/PhysicalPlanTranslator.java    |  75 +++++-
 .../post/PushdownFilterThroughProject.java         |  20 +-
 .../plans/physical/PhysicalAssertNumRows.java      |   3 +-
 .../trees/plans/physical/PhysicalCTEAnchor.java    |   3 +-
 .../trees/plans/physical/PhysicalFilter.java       |   8 +-
 .../trees/plans/physical/PhysicalGenerate.java     |   4 +-
 .../trees/plans/physical/PhysicalLimit.java        |   3 +-
 .../plans/physical/PhysicalOlapTableSink.java      |   5 +-
 .../plans/physical/PhysicalPartitionTopN.java      |   5 +-
 .../trees/plans/physical/PhysicalProject.java      |   8 +-
 .../trees/plans/physical/PhysicalRepeat.java       |   3 +-
 .../nereids/trees/plans/physical/PhysicalTopN.java |   3 +-
 .../trees/plans/physical/PhysicalWindow.java       |   4 +-
 ....java => PushdownFilterThroughProjectTest.java} |  54 ++--
 .../data/dynamic_table_p0/array_dimenssion.json    |   5 +
 .../data/dynamic_table_p0/invalid_name.json        |   2 +
 .../data/dynamic_table_p0/nested_filter.json       |   2 +
 .../suites/dynamic_table_p0/load.groovy            |  21 +-
 .../suites/dynamic_table_p0/sql/q01.sql            |  14 +-
 .../suites/dynamic_table_p0/sql/q02.sql            |  19 +-
 .../test_dytable_complex_data.groovy               | 291 ---------------------
 41 files changed, 370 insertions(+), 452 deletions(-)
 copy fe/fe-core/src/test/java/org/apache/doris/nereids/postprocess/{MergeProjectPostProcessTest.java => PushdownFilterThroughProjectTest.java} (68%)
 create mode 100644 regression-test/data/dynamic_table_p0/array_dimenssion.json
 create mode 100644 regression-test/data/dynamic_table_p0/invalid_name.json
 create mode 100644 regression-test/data/dynamic_table_p0/nested_filter.json
 delete mode 100644 regression-test/suites/dynamic_table_p0/test_dytable_complex_data.groovy


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[doris] 09/09: [Fix](inverted index) check inverted index file existence before data compaction (#21173)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 932a38b4ec7739f38c10d403b5ae30fca48a895b
Author: airborne12 <ai...@gmail.com>
AuthorDate: Mon Jun 26 19:55:55 2023 +0800

    [Fix](inverted index) check inverted index file existence before data compaction (#21173)
---
 be/src/olap/compaction.cpp                         | 42 ++++++++++++++++++++--
 .../segment_v2/inverted_index_compaction.cpp       | 23 ++++--------
 .../rowset/segment_v2/inverted_index_compaction.h  | 12 +++----
 3 files changed, 52 insertions(+), 25 deletions(-)

diff --git a/be/src/olap/compaction.cpp b/be/src/olap/compaction.cpp
index 5366020f35..5354605108 100644
--- a/be/src/olap/compaction.cpp
+++ b/be/src/olap/compaction.cpp
@@ -399,10 +399,17 @@ Status Compaction::do_compaction_impl(int64_t permits) {
                 [&src_segment_num, &dest_segment_num, &index_writer_path, &src_index_files,
                  &dest_index_files, &fs, &tablet_path, &trans_vec, &dest_segment_num_rows,
                  this](int32_t column_uniq_id) {
-                    compact_column(
+                    auto st = compact_column(
                             _cur_tablet_schema->get_inverted_index(column_uniq_id)->index_id(),
                             src_segment_num, dest_segment_num, src_index_files, dest_index_files,
                             fs, index_writer_path, tablet_path, trans_vec, dest_segment_num_rows);
+                    if (!st.ok()) {
+                        LOG(ERROR) << "failed to do index compaction"
+                                   << ". tablet=" << _tablet->full_name()
+                                   << ". column uniq id=" << column_uniq_id << ". index_id= "
+                                   << _cur_tablet_schema->get_inverted_index(column_uniq_id)
+                                              ->index_id();
+                    }
                 });
 
         LOG(INFO) << "succeed to do index compaction"
@@ -465,8 +472,37 @@ Status Compaction::construct_output_rowset_writer(RowsetWriterContext& ctx, bool
                 //NOTE: here src_rs may be in building index progress, so it would not contain inverted index info.
                 bool all_have_inverted_index = std::all_of(
                         _input_rowsets.begin(), _input_rowsets.end(), [&](const auto& src_rs) {
-                            return src_rs->tablet_schema()->get_inverted_index(unique_id) !=
-                                   nullptr;
+                            BetaRowsetSharedPtr rowset =
+                                    std::static_pointer_cast<BetaRowset>(src_rs);
+                            if (rowset == nullptr) {
+                                return false;
+                            }
+                            auto fs = rowset->rowset_meta()->fs();
+
+                            auto index_meta =
+                                    rowset->tablet_schema()->get_inverted_index(unique_id);
+                            if (index_meta == nullptr) {
+                                return false;
+                            }
+                            for (auto i = 0; i < rowset->num_segments(); i++) {
+                                auto segment_file = rowset->segment_file_path(i);
+                                std::string inverted_index_src_file_path =
+                                        InvertedIndexDescriptor::get_index_file_name(
+                                                segment_file, index_meta->index_id());
+                                bool exists = false;
+                                if (fs->exists(inverted_index_src_file_path, &exists) !=
+                                    Status::OK()) {
+                                    LOG(ERROR)
+                                            << inverted_index_src_file_path << " fs->exists error";
+                                    return false;
+                                }
+                                if (!exists) {
+                                    LOG(WARNING) << inverted_index_src_file_path
+                                                 << " is not exists, will skip index compaction";
+                                    return false;
+                                }
+                            }
+                            return true;
                         });
                 if (all_have_inverted_index &&
                     field_is_slice_type(_cur_tablet_schema->column_by_uid(unique_id).type())) {
diff --git a/be/src/olap/rowset/segment_v2/inverted_index_compaction.cpp b/be/src/olap/rowset/segment_v2/inverted_index_compaction.cpp
index 83347a3764..cbc3b4399d 100644
--- a/be/src/olap/rowset/segment_v2/inverted_index_compaction.cpp
+++ b/be/src/olap/rowset/segment_v2/inverted_index_compaction.cpp
@@ -24,12 +24,12 @@
 
 namespace doris {
 namespace segment_v2 {
-void compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
-                    std::vector<std::string> src_index_files,
-                    std::vector<std::string> dest_index_files, const io::FileSystemSPtr& fs,
-                    std::string index_writer_path, std::string tablet_path,
-                    std::vector<std::vector<std::pair<uint32_t, uint32_t>>> trans_vec,
-                    std::vector<uint32_t> dest_segment_num_rows) {
+Status compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
+                      std::vector<std::string> src_index_files,
+                      std::vector<std::string> dest_index_files, const io::FileSystemSPtr& fs,
+                      std::string index_writer_path, std::string tablet_path,
+                      std::vector<std::vector<std::pair<uint32_t, uint32_t>>> trans_vec,
+                      std::vector<uint32_t> dest_segment_num_rows) {
     lucene::store::Directory* dir =
             DorisCompoundDirectory::getDirectory(fs, index_writer_path.c_str(), false);
     auto index_writer = _CLNEW lucene::index::IndexWriter(dir, nullptr, true /* create */,
@@ -41,16 +41,6 @@ void compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
         // format: rowsetId_segmentId_indexId.idx
         std::string src_idx_full_name =
                 src_index_files[i] + "_" + std::to_string(index_id) + ".idx";
-        bool exists = false;
-        auto st = fs->exists(src_idx_full_name, &exists);
-        if (!st.ok()) {
-            LOG(ERROR) << src_idx_full_name << " fs->exists error:" << st;
-            return;
-        }
-        if (!exists) {
-            LOG(WARNING) << src_idx_full_name << " is not exists, will stop index compaction ";
-            return;
-        }
         DorisCompoundReader* reader = new DorisCompoundReader(
                 DorisCompoundDirectory::getDirectory(fs, tablet_path.c_str()),
                 src_idx_full_name.c_str());
@@ -90,6 +80,7 @@ void compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
 
     // delete temporary index_writer_path
     fs->delete_directory(index_writer_path.c_str());
+    return Status::OK();
 }
 } // namespace segment_v2
 } // namespace doris
diff --git a/be/src/olap/rowset/segment_v2/inverted_index_compaction.h b/be/src/olap/rowset/segment_v2/inverted_index_compaction.h
index a682b6111f..f615192b19 100644
--- a/be/src/olap/rowset/segment_v2/inverted_index_compaction.h
+++ b/be/src/olap/rowset/segment_v2/inverted_index_compaction.h
@@ -25,11 +25,11 @@
 namespace doris {
 
 namespace segment_v2 {
-void compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
-                    std::vector<std::string> src_index_files,
-                    std::vector<std::string> dest_index_files, const io::FileSystemSPtr& fs,
-                    std::string index_writer_path, std::string tablet_path,
-                    std::vector<std::vector<std::pair<uint32_t, uint32_t>>> trans_vec,
-                    std::vector<uint32_t> dest_segment_num_rows);
+Status compact_column(int32_t index_id, int src_segment_num, int dest_segment_num,
+                      std::vector<std::string> src_index_files,
+                      std::vector<std::string> dest_index_files, const io::FileSystemSPtr& fs,
+                      std::string index_writer_path, std::string tablet_path,
+                      std::vector<std::vector<std::pair<uint32_t, uint32_t>>> trans_vec,
+                      std::vector<uint32_t> dest_segment_num_rows);
 } // namespace segment_v2
 } // namespace doris
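
The check that the compaction.cpp hunk adds is worth seeing in isolation: before enabling index compaction for a column, every input segment's inverted index file must exist on the filesystem. A minimal sketch of that pattern, using std::filesystem in place of Doris' io::FileSystem; the helper and type names here are illustrative, not the real Doris API:

    #include <cstdint>
    #include <filesystem>
    #include <string>
    #include <vector>

    // Hypothetical stand-in for InvertedIndexDescriptor::get_index_file_name();
    // the patch's comment gives the format: <segment_file>_<index_id>.idx
    std::string index_file_name(const std::string& segment_file, int32_t index_id) {
        return segment_file + "_" + std::to_string(index_id) + ".idx";
    }

    // True only if every segment's inverted index file is present, so the caller
    // can enable index compaction for this column; an I/O error or a missing
    // file makes the column ineligible instead of failing mid-compaction.
    bool all_index_files_exist(const std::vector<std::string>& segment_files,
                               int32_t index_id) {
        for (const auto& segment_file : segment_files) {
            std::error_code ec;
            const bool exists =
                    std::filesystem::exists(index_file_name(segment_file, index_id), ec);
            if (ec || !exists) {
                return false;
            }
        }
        return true;
    }

Before this change the same check lived inside compact_column() (removed in the next file's hunk), where a missing file could only stop index compaction half-way; checking ahead of the rowset writer decision makes the column ineligible up front instead.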




[doris] 02/09: [fix](inverted index) fix build inverted index failed but not return immediately (#21165)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 507b98a0cb66ed60146d64449dd44633b6e75ad3
Author: YueW <45...@users.noreply.github.com>
AuthorDate: Mon Jun 26 14:05:12 2023 +0800

    [fix](inverted index) fix build inverted index failed but not return immediately (#21165)
---
 be/src/olap/task/index_builder.cpp | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/be/src/olap/task/index_builder.cpp b/be/src/olap/task/index_builder.cpp
index b7d93120aa..05ba099f90 100644
--- a/be/src/olap/task/index_builder.cpp
+++ b/be/src/olap/task/index_builder.cpp
@@ -333,7 +333,7 @@ Status IndexBuilder::_add_data(const std::string& column_name,
 }
 
 Status IndexBuilder::handle_inverted_index_data() {
-    LOG(INFO) << "begint to handle_inverted_index_data";
+    LOG(INFO) << "begin to handle_inverted_index_data";
     DCHECK(_input_rowsets.size() == _output_rowsets.size());
     for (auto i = 0; i < _output_rowsets.size(); ++i) {
         SegmentCacheHandle segment_cache_handle;
@@ -347,7 +347,7 @@ Status IndexBuilder::handle_inverted_index_data() {
 }
 
 Status IndexBuilder::do_build_inverted_index() {
-    LOG(INFO) << "begine to do_build_inverted_index, tablet=" << _tablet->tablet_id()
+    LOG(INFO) << "begin to do_build_inverted_index, tablet=" << _tablet->tablet_id()
               << ", is_drop_op=" << _is_drop_op;
     if (_alter_inverted_indexes.empty()) {
         return Status::OK();
@@ -403,6 +403,7 @@ Status IndexBuilder::do_build_inverted_index() {
         LOG(WARNING) << "failed to update_inverted_index_info. "
                      << "tablet=" << _tablet->tablet_id() << ", error=" << st;
         gc_output_rowset();
+        return st;
     }
 
     // create inverted index file for output rowset
@@ -411,6 +412,7 @@ Status IndexBuilder::do_build_inverted_index() {
         LOG(WARNING) << "failed to handle_inverted_index_data. "
                      << "tablet=" << _tablet->tablet_id() << ", error=" << st;
         gc_output_rowset();
+        return st;
     }
 
     // modify rowsets in memory
@@ -419,8 +421,9 @@ Status IndexBuilder::do_build_inverted_index() {
         LOG(WARNING) << "failed to modify rowsets in memory. "
                      << "tablet=" << _tablet->tablet_id() << ", error=" << st;
         gc_output_rowset();
+        return st;
     }
-    return st;
+    return Status::OK();
 }
 
 Status IndexBuilder::modify_rowsets(const Merger::Statistics* stats) {
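
The shape of the fix is a classic early-return chain: each step that fails must propagate its Status instead of falling through to the next step. A minimal sketch, with a simplified stand-in Status and hypothetical step functions in place of the real ones:

    #include <string>
    #include <utility>

    // Simplified stand-in for doris::Status, just enough for the sketch.
    struct Status {
        bool is_ok = true;
        std::string msg;
        static Status OK() { return {}; }
        static Status Error(std::string m) { return {false, std::move(m)}; }
        bool ok() const { return is_ok; }
    };

    // Hypothetical steps standing in for update_inverted_index_info(),
    // handle_inverted_index_data() and modify_rowsets().
    Status step_update_info() { return Status::OK(); }
    Status step_handle_data() { return Status::Error("disk full"); }
    Status step_modify_rowsets() { return Status::OK(); }

    Status do_build() {
        if (Status st = step_update_info(); !st.ok()) {
            return st;  // clean up (gc_output_rowset) and stop here
        }
        if (Status st = step_handle_data(); !st.ok()) {
            return st;  // before the fix, execution continued past this point
        }
        if (Status st = step_modify_rowsets(); !st.ok()) {
            return st;
        }
        return Status::OK();
    }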




[doris] 01/09: [chore](block) temporarily disable DCHECK for column name equality in MutableBlock (#21116)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 3179a5519b8a57d06d45cf43c52ea29fd1ec8ca0
Author: Kang <kx...@gmail.com>
AuthorDate: Mon Jun 26 10:49:27 2023 +0800

    [chore](block) temporarily disable DCHECK for column name equality in MutableBlock (#21116)
    
    * temporarily disable DCHECK for column name equality in MutableBlock::add_rows
    
    * relax the num columns check from EQ to LE
---
 be/src/vec/core/block.cpp | 8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/be/src/vec/core/block.cpp b/be/src/vec/core/block.cpp
index 517bfc411d..b59e9fa6c2 100644
--- a/be/src/vec/core/block.cpp
+++ b/be/src/vec/core/block.cpp
@@ -904,12 +904,10 @@ void MutableBlock::add_row(const Block* block, int row) {
 }
 
 void MutableBlock::add_rows(const Block* block, const int* row_begin, const int* row_end) {
-    DCHECK_EQ(columns(), block->columns());
+    DCHECK_LE(columns(), block->columns());
     auto& block_data = block->get_columns_with_type_and_name();
     for (size_t i = 0; i < _columns.size(); ++i) {
-        // DCHECK(_columns[i]->get_data_type() == block_data[i].column->get_data_type());
         DCHECK_EQ(_data_types[i]->get_name(), block_data[i].type->get_name());
-        DCHECK_EQ(_names[i], block_data[i].name);
         auto& dst = _columns[i];
         auto& src = *block_data[i].column.get();
         dst->insert_indices_from(src, row_begin, row_end);
@@ -917,12 +915,10 @@ void MutableBlock::add_rows(const Block* block, const int* row_begin, const int*
 }
 
 void MutableBlock::add_rows(const Block* block, size_t row_begin, size_t length) {
-    DCHECK_EQ(columns(), block->columns());
+    DCHECK_LE(columns(), block->columns());
     auto& block_data = block->get_columns_with_type_and_name();
     for (size_t i = 0; i < _columns.size(); ++i) {
-        // DCHECK(_columns[i]->get_data_type() == block_data[i].column->get_data_type());
         DCHECK_EQ(_data_types[i]->get_name(), block_data[i].type->get_name());
-        DCHECK_EQ(_names[i], block_data[i].name);
         auto& dst = _columns[i];
         auto& src = *block_data[i].column.get();
         dst->insert_range_from(src, row_begin, length);
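
Why the check loosens from EQ to LE: a MutableBlock may carry only a leading subset of the source block's columns, and add_rows() copies by position over the destination's width. A minimal sketch of that invariant, with plain vectors standing in for the real Block/MutableBlock types:

    #include <cassert>
    #include <cstddef>
    #include <vector>

    using Column = std::vector<int>;  // stand-in for a real IColumn

    // Copies rows [row_begin, row_begin + length) from src into dst by
    // position. dst may have fewer columns than src (DCHECK_LE), never more.
    void add_rows(std::vector<Column>& dst, const std::vector<Column>& src,
                  size_t row_begin, size_t length) {
        assert(dst.size() <= src.size());
        for (size_t i = 0; i < dst.size(); ++i) {  // destination width only
            dst[i].insert(dst[i].end(), src[i].begin() + row_begin,
                          src[i].begin() + row_begin + length);
        }
    }

The per-column DCHECK on data type names stays; only the name-equality and exact column-count checks are disabled, since the copy is positional.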




[doris] 08/09: [fix](nereids) set proper sort info to scan node to enable TopN-opt (#21148)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 8f79e264f11c4c755976467c4ee2ebfdabc7323c
Author: minghong <en...@gmail.com>
AuthorDate: Mon Jun 26 19:54:37 2023 +0800

    [fix](nereids) set proper sort info to scan node to enable TopN-opt (#21148)
---
 .../java/org/apache/doris/analysis/SortInfo.java   |  7 ++
 .../glue/translator/PhysicalPlanTranslator.java    | 75 +++++++++++++++++++++-
 2 files changed, 81 insertions(+), 1 deletion(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
index 0dc41c037d..a4d5962bc4 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/analysis/SortInfo.java
@@ -139,6 +139,13 @@ public class SortInfo {
         return materializedOrderingExprs;
     }
 
+    public void addMaterializedOrderingExpr(Expr expr) {
+        if (materializedOrderingExprs == null) {
+            materializedOrderingExprs = Lists.newArrayList();
+        }
+        materializedOrderingExprs.add(expr);
+    }
+
     public List<Expr> getSortTupleSlotExprs() {
         return sortTupleSlotExprs;
     }
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
index 10cb05acba..fa0bb2a303 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/glue/translator/PhysicalPlanTranslator.java
@@ -160,6 +160,7 @@ import org.apache.doris.planner.external.HiveScanNode;
 import org.apache.doris.planner.external.hudi.HudiScanNode;
 import org.apache.doris.planner.external.iceberg.IcebergScanNode;
 import org.apache.doris.planner.external.paimon.PaimonScanNode;
+import org.apache.doris.qe.ConnectContext;
 import org.apache.doris.tablefunction.TableValuedFunctionIf;
 import org.apache.doris.thrift.TColumn;
 import org.apache.doris.thrift.TFetchOption;
@@ -1069,7 +1070,25 @@ public class PhysicalPlanTranslator extends DefaultPlanVisitor<PlanFragment, Pla
                 PlanNode child = sortNode.getChild(0);
                 Preconditions.checkArgument(child instanceof OlapScanNode,
                         "topN opt expect OlapScanNode, but we get " + child);
-                ((OlapScanNode) child).setUseTopnOpt(true);
+                OlapScanNode scanNode = ((OlapScanNode) child);
+                scanNode.setUseTopnOpt(true);
+            }
+            // push sort to scan opt
+            if (sortNode.getChild(0) instanceof OlapScanNode) {
+                OlapScanNode scanNode = ((OlapScanNode) sortNode.getChild(0));
+                if (checkPushSort(sortNode, scanNode.getOlapTable())) {
+                    SortInfo sortInfo = sortNode.getSortInfo();
+                    scanNode.setSortInfo(sortInfo);
+                    scanNode.getSortInfo().setSortTupleSlotExprs(sortNode.getResolvedTupleExprs());
+                    for (Expr expr : sortInfo.getOrderingExprs()) {
+                        scanNode.getSortInfo().addMaterializedOrderingExpr(expr);
+                    }
+                    if (sortNode.getOffset() > 0) {
+                        scanNode.setSortLimit(sortNode.getLimit() + sortNode.getOffset());
+                    } else {
+                        scanNode.setSortLimit(sortNode.getLimit());
+                    }
+                }
             }
             addPlanRoot(currentFragment, sortNode, topN);
         } else {
@@ -1094,6 +1113,60 @@ public class PhysicalPlanTranslator extends DefaultPlanVisitor<PlanFragment, Pla
         return currentFragment;
     }
 
+    /**
+     * topN opt: using storage data ordering to accelerate topn operation.
+     * refer pr: optimize topn query if order by columns is prefix of sort keys of table (#10694)
+     */
+    public boolean checkPushSort(SortNode sortNode, OlapTable olapTable) {
+        // Ensure limit is less then threshold
+        if (sortNode.getLimit() <= 0
+                || sortNode.getLimit() > ConnectContext.get().getSessionVariable().topnOptLimitThreshold) {
+            return false;
+        }
+
+        // Ensure all isAscOrder is same, ande length != 0.
+        // Can't be zorder.
+        if (sortNode.getSortInfo().getIsAscOrder().stream().distinct().count() != 1
+                || olapTable.isZOrderSort()) {
+            return false;
+        }
+
+        // Tablet's order by key only can be the front part of schema.
+        // Like: schema: a.b.c.d.e.f.g order by key: a.b.c (no a,b,d)
+        // Do **prefix match** to check if order by key can be pushed down.
+        // olap order by key: a.b.c.d
+        // sort key: (a) (a,b) (a,b,c) (a,b,c,d) is ok
+        //           (a,c) (a,c,d), (a,c,b) (a,c,f) (a,b,c,d,e)is NOT ok
+        List<Expr> sortExprs = sortNode.getSortInfo().getOrderingExprs();
+        List<Boolean> nullsFirsts = sortNode.getSortInfo().getNullsFirst();
+        List<Boolean> isAscOrders = sortNode.getSortInfo().getIsAscOrder();
+        if (sortExprs.size() > olapTable.getDataSortInfo().getColNum()) {
+            return false;
+        }
+        for (int i = 0; i < sortExprs.size(); i++) {
+            // table key.
+            Column tableKey = olapTable.getFullSchema().get(i);
+            // sort slot.
+            Expr sortExpr = sortExprs.get(i);
+            if (sortExpr instanceof SlotRef) {
+                SlotRef slotRef = (SlotRef) sortExpr;
+                if (tableKey.equals(slotRef.getColumn())) {
+                    // ORDER BY DESC NULLS FIRST can not be optimized to only read file tail,
+                    // since NULLS is at file head but data is at tail
+                    if (tableKey.isAllowNull() && nullsFirsts.get(i) && !isAscOrders.get(i)) {
+                        return false;
+                    }
+                } else {
+                    return false;
+                }
+            } else {
+                return false;
+            }
+        }
+
+        return true;
+    }
+
     private SortNode translateSortNode(AbstractPhysicalSort<? extends Plan> sort, PlanNode childNode,
             PlanTranslatorContext context) {
         List<Expr> oldOrderingExprList = Lists.newArrayList();
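
The eligibility test in checkPushSort boils down to a prefix match plus two side conditions, and is easiest to see in isolation. A minimal sketch under the same rules (the ORDER BY columns must be a prefix of the table's sort key, all directions must agree, and DESC NULLS FIRST on a nullable key disqualifies the pushdown); the types and names here are illustrative, not the FE classes:

    #include <cstddef>
    #include <string>
    #include <vector>

    struct OrderKey {
        std::string column;
        bool asc = true;
        bool nulls_first = false;
        bool nullable = false;
    };

    bool can_push_sort(const std::vector<std::string>& table_sort_key,
                       const std::vector<OrderKey>& order_by) {
        if (order_by.empty() || order_by.size() > table_sort_key.size()) {
            return false;
        }
        for (size_t i = 0; i < order_by.size(); ++i) {
            if (order_by[i].column != table_sort_key[i]) {
                return false;  // must match the sort key position by position
            }
            if (order_by[i].asc != order_by[0].asc) {
                return false;  // mixed ASC/DESC cannot use the stored order
            }
            if (order_by[i].nullable && order_by[i].nulls_first && !order_by[i].asc) {
                return false;  // NULLs sit at the file head, data at the tail
            }
        }
        return true;
    }

With a table sorted on (a, b, c, d): ORDER BY a or a, b qualifies; ORDER BY a, c or a, b, c, d, e does not, matching the examples in the comment above.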




[doris] 04/09: [Bug](RuntimeFilter) Fix bf error change the murmurhash to crc32 in regression test p2 (#21167)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 4bf8d7af2ba30bebe97a38bb8b76c92914767df5
Author: HappenLee <ha...@hotmail.com>
AuthorDate: Mon Jun 26 16:39:45 2023 +0800

    [Bug](RuntimeFilter) Fix bf error change the murmurhash to crc32 in regression test p2 (#21167)
---
 be/src/agent/be_exec_version_manager.h |  1 +
 be/src/olap/bloom_filter_predicate.h   | 44 +++++++++++++++++++++++++---------
 2 files changed, 34 insertions(+), 11 deletions(-)

diff --git a/be/src/agent/be_exec_version_manager.h b/be/src/agent/be_exec_version_manager.h
index 0491a038c8..657ebab02d 100644
--- a/be/src/agent/be_exec_version_manager.h
+++ b/be/src/agent/be_exec_version_manager.h
@@ -55,6 +55,7 @@ private:
  * 2: start from doris 2.0
  *    a. function month/day/hour/minute/second's return type is changed to smaller type.
  *    b. in order to solve agg of sum/count is not compatibility during the upgrade process
+ *    c. change the string hash method in runtime filter
  *
 */
 inline const int BeExecVersionManager::max_be_exec_version = 2;
diff --git a/be/src/olap/bloom_filter_predicate.h b/be/src/olap/bloom_filter_predicate.h
index 99debfa94b..885927d3f5 100644
--- a/be/src/olap/bloom_filter_predicate.h
+++ b/be/src/olap/bloom_filter_predicate.h
@@ -63,6 +63,17 @@ private:
             DCHECK(null_map);
         }
 
+        uint24_t tmp_uint24_value;
+        auto get_cell_value = [&tmp_uint24_value](auto& data) {
+            if constexpr (std::is_same_v<std::decay_t<decltype(data)>, uint32_t> &&
+                          T == PrimitiveType::TYPE_DATE) {
+                memcpy((char*)(&tmp_uint24_value), (char*)(&data), sizeof(uint24_t));
+                return (const char*)&tmp_uint24_value;
+            } else {
+                return (const char*)&data;
+            }
+        };
+
         uint16_t new_size = 0;
         if (column.is_column_dictionary()) {
             auto* dict_col = reinterpret_cast<const vectorized::ColumnDictI32*>(&column);
@@ -90,6 +101,28 @@ private:
                     }
                 }
             }
+        } else if (is_string_type(T) && _be_exec_version >= 2) {
+            auto& pred_col =
+                    reinterpret_cast<
+                            const vectorized::PredicateColumnType<PredicateEvaluateType<T>>*>(
+                            &column)
+                            ->get_data();
+
+            auto pred_col_data = pred_col.data();
+            const bool is_dense_column = pred_col.size() == size;
+            for (uint16_t i = 0; i < size; i++) {
+                uint16_t idx = is_dense_column ? i : sel[i];
+                if constexpr (is_nullable) {
+                    if (!null_map[idx] &&
+                        _specific_filter->find_crc32_hash(get_cell_value(pred_col_data[idx]))) {
+                        sel[new_size++] = idx;
+                    }
+                } else {
+                    if (_specific_filter->find_crc32_hash(get_cell_value(pred_col_data[idx]))) {
+                        sel[new_size++] = idx;
+                    }
+                }
+            }
         } else if (IRuntimeFilter::enable_use_batch(_be_exec_version > 0, T)) {
             const auto& data =
                     reinterpret_cast<
@@ -99,17 +132,6 @@ private:
             new_size = _specific_filter->find_fixed_len_olap_engine((char*)data.data(), null_map,
                                                                     sel, size, data.size() != size);
         } else {
-            uint24_t tmp_uint24_value;
-            auto get_cell_value = [&tmp_uint24_value](auto& data) {
-                if constexpr (std::is_same_v<std::decay_t<decltype(data)>, uint32_t> &&
-                              T == PrimitiveType::TYPE_DATE) {
-                    memcpy((char*)(&tmp_uint24_value), (char*)(&data), sizeof(uint24_t));
-                    return (const char*)&tmp_uint24_value;
-                } else {
-                    return (const char*)&data;
-                }
-            };
-
             auto& pred_col =
                     reinterpret_cast<
                             const vectorized::PredicateColumnType<PredicateEvaluateType<T>>*>(
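
The be_exec_version gate appears to be the compatibility mechanism here: during a rolling upgrade, the producer and consumer of a runtime filter must hash strings the same way, so the hash is selected by the negotiated exec version rather than by each binary's own version. A minimal sketch of that dispatch; the two hash bodies are illustrative stand-ins, not the real CRC32/Murmur implementations:

    #include <cstdint>
    #include <functional>
    #include <string_view>

    // Illustrative stand-ins only; Doris uses real CRC32 and Murmur3 hashes.
    uint32_t murmur_hash(std::string_view s) {
        return static_cast<uint32_t>(std::hash<std::string_view>{}(s));
    }
    uint32_t crc32_hash(std::string_view s) {
        return static_cast<uint32_t>(std::hash<std::string_view>{}(s) ^ 0x9e3779b9u);
    }

    // Mirrors the `_be_exec_version >= 2` branch above: exec version 2 and
    // later use the CRC32 path for string bloom filters, older keep Murmur.
    uint32_t bf_string_hash(std::string_view s, int be_exec_version) {
        return be_exec_version >= 2 ? crc32_hash(s) : murmur_hash(s);
    }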




[doris] 06/09: [Improve](dynamic schema) support filtering invalid data (#21160)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit cebc765235564088c20da2268e2711877755ccc6
Author: lihangyu <15...@163.com>
AuthorDate: Mon Jun 26 19:32:43 2023 +0800

    [Improve](dynamic schema) support filtering invalid data (#21160)
    
    * [Improve](dynamic schema) support filtering invalid data
    
    1. Support dynamic schema to filter illegal data.
    2. Expand the regular expression for ColumnName to support more column names.
    3. Be compatible with PropertyAnalyzer and support legacy tables.
    4. Disable parsing of multi-dimension arrays by default, since some bugs remain unresolved
---
 be/src/common/config.cpp                           |   2 +-
 be/src/vec/columns/column_object.cpp               |  15 +-
 be/src/vec/columns/column_object.h                 |   3 +
 be/src/vec/columns/subcolumn_tree.h                |   2 +-
 be/src/vec/common/schema_util.cpp                  |  16 +-
 be/src/vec/exec/format/json/new_json_reader.cpp    |  24 +-
 be/src/vec/exec/format/json/new_json_reader.h      |   1 +
 .../org/apache/doris/catalog/TableProperty.java    |   7 +
 .../java/org/apache/doris/common/FeNameFormat.java |   2 +-
 .../data/dynamic_table_p0/array_dimenssion.json    |   5 +
 .../data/dynamic_table_p0/invalid_name.json        |   2 +
 .../data/dynamic_table_p0/nested_filter.json       |   2 +
 .../suites/dynamic_table_p0/load.groovy            |  21 +-
 .../suites/dynamic_table_p0/sql/q01.sql            |  14 +-
 .../suites/dynamic_table_p0/sql/q02.sql            |  19 +-
 .../test_dytable_complex_data.groovy               | 291 ---------------------
 16 files changed, 103 insertions(+), 323 deletions(-)

diff --git a/be/src/common/config.cpp b/be/src/common/config.cpp
index 2ea50568fc..ef88d9e0ff 100644
--- a/be/src/common/config.cpp
+++ b/be/src/common/config.cpp
@@ -994,7 +994,7 @@ DEFINE_Bool(inverted_index_compaction_enable, "false");
 // use num_broadcast_buffer blocks as buffer to do broadcast
 DEFINE_Int32(num_broadcast_buffer, "32");
 // semi-structure configs
-DEFINE_Bool(enable_parse_multi_dimession_array, "true");
+DEFINE_Bool(enable_parse_multi_dimession_array, "false");
 
 // Currently, two compaction strategies are implemented, SIZE_BASED and TIME_SERIES.
 // In the case of time series compaction, the execution of compaction is adjusted
diff --git a/be/src/vec/columns/column_object.cpp b/be/src/vec/columns/column_object.cpp
index 7830e3fe8f..050939f8fa 100644
--- a/be/src/vec/columns/column_object.cpp
+++ b/be/src/vec/columns/column_object.cpp
@@ -865,8 +865,10 @@ void ColumnObject::finalize() {
         if (is_nothing(getBaseTypeOfArray(least_common_type))) {
             continue;
         }
-        entry->data.finalize();
-        new_subcolumns.add(entry->path, entry->data);
+        if (!entry->data.data.empty()) {
+            entry->data.finalize();
+            new_subcolumns.add(entry->path, entry->data);
+        }
     }
     /// If all subcolumns were skipped add a dummy subcolumn,
     /// because Tuple type must have at least one element.
@@ -927,6 +929,15 @@ size_t ColumnObject::filter(const Filter& filter) {
     return num_rows;
 }
 
+void ColumnObject::revise_to(int target_num_rows) {
+    for (auto&& entry : subcolumns) {
+        if (entry->data.size() > target_num_rows) {
+            entry->data.pop_back(entry->data.size() - target_num_rows);
+        }
+    }
+    num_rows = target_num_rows;
+}
+
 template <typename ColumnInserterFn>
 void align_variant_by_name_and_type(ColumnObject& dst, const ColumnObject& src, size_t row_cnt,
                                     ColumnInserterFn inserter) {
diff --git a/be/src/vec/columns/column_object.h b/be/src/vec/columns/column_object.h
index 339db8c6b9..f8991092b2 100644
--- a/be/src/vec/columns/column_object.h
+++ b/be/src/vec/columns/column_object.h
@@ -298,6 +298,9 @@ public:
 
     void insert_default() override;
 
+    // Revise this column to specified num_rows
+    void revise_to(int num_rows);
+
     [[noreturn]] ColumnPtr replicate(const Offsets& offsets) const override {
         LOG(FATAL) << "should not call the method replicate in column object";
     }
diff --git a/be/src/vec/columns/subcolumn_tree.h b/be/src/vec/columns/subcolumn_tree.h
index 0a12593870..4ba0194eeb 100644
--- a/be/src/vec/columns/subcolumn_tree.h
+++ b/be/src/vec/columns/subcolumn_tree.h
@@ -96,7 +96,7 @@ public:
 
         Node* current_node = root.get();
         for (size_t i = 0; i < parts.size() - 1; ++i) {
-            assert(current_node->kind != Node::SCALAR);
+            // assert(current_node->kind != Node::SCALAR);
 
             auto it = current_node->children.find(
                     StringRef {parts[i].key.data(), parts[i].key.size()});
diff --git a/be/src/vec/common/schema_util.cpp b/be/src/vec/common/schema_util.cpp
index 798722d1b0..703ee1bc75 100644
--- a/be/src/vec/common/schema_util.cpp
+++ b/be/src/vec/common/schema_util.cpp
@@ -225,6 +225,9 @@ Status send_fetch_full_base_schema_view_rpc(FullBaseSchemaView* schema_view) {
     return Status::OK();
 }
 
+static const std::regex COLUMN_NAME_REGEX(
+        "^[_a-zA-Z@0-9\\s<>/][.a-zA-Z0-9_+-/><?@#$%^&*\"\\s,:]{0,255}$");
+
 // Do batch add columns schema change
 // only the base table supported
 Status send_add_columns_rpc(ColumnsWithTypeAndName column_type_names,
@@ -241,7 +244,18 @@ Status send_add_columns_rpc(ColumnsWithTypeAndName column_type_names,
     // TODO(lhy) more configurable
     req.__set_allow_type_conflict(true);
     req.__set_addColumns({});
+    // Deduplicate Column like `Level` and `level`
+    // TODO we will implement new version of dynamic column soon to handle this issue,
+    // also ignore column missmatch with regex
+    std::set<std::string> dedup;
     for (const auto& column_type_name : column_type_names) {
+        if (dedup.contains(to_lower(column_type_name.name))) {
+            continue;
+        }
+        if (!std::regex_match(column_type_name.name, COLUMN_NAME_REGEX)) {
+            continue;
+        }
+        dedup.insert(to_lower(column_type_name.name));
         TColumnDef col;
         get_column_def(column_type_name.type, column_type_name.name, &col);
         req.addColumns.push_back(col);
@@ -262,7 +276,7 @@ Status send_add_columns_rpc(ColumnsWithTypeAndName column_type_names,
                 fmt::format("Failed to do schema change, {}", res.status.error_msgs[0]));
     }
     size_t sz = res.allColumns.size();
-    if (sz < column_type_names.size()) {
+    if (sz < dedup.size()) {
         return Status::InternalError(
                 fmt::format("Unexpected result columns {}, expected at least {}",
                             res.allColumns.size(), column_type_names.size()));
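
The dedup-and-validate loop above is small enough to restate standalone. A minimal sketch, assuming the same column-name regex and the case-insensitive duplicate rule (`Level` and `level` collapse to one column); the helper names are illustrative:

    #include <algorithm>
    #include <cctype>
    #include <regex>
    #include <set>
    #include <string>
    #include <vector>

    static const std::regex kColumnNameRegex(
            "^[_a-zA-Z@0-9\\s<>/][.a-zA-Z0-9_+-/><?@#$%^&*\"\\s,:]{0,255}$");

    static std::string lower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        return s;
    }

    // Keeps only names matching the regex, collapsing names that differ only
    // by case, in input order - mirroring send_add_columns_rpc above.
    std::vector<std::string> filter_column_names(const std::vector<std::string>& names) {
        std::set<std::string> seen;
        std::vector<std::string> kept;
        for (const auto& name : names) {
            if (!std::regex_match(name, kColumnNameRegex)) {
                continue;  // invalid name: skipped rather than failing the load
            }
            if (!seen.insert(lower(name)).second) {
                continue;  // case-insensitive duplicate
            }
            kept.push_back(name);
        }
        return kept;
    }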
diff --git a/be/src/vec/exec/format/json/new_json_reader.cpp b/be/src/vec/exec/format/json/new_json_reader.cpp
index 2ea4b5adc0..5c3fbd8c70 100644
--- a/be/src/vec/exec/format/json/new_json_reader.cpp
+++ b/be/src/vec/exec/format/json/new_json_reader.cpp
@@ -467,7 +467,13 @@ Status NewJsonReader::_parse_dynamic_json(bool* is_empty_row, bool* eof, Block&
     _bytes_read_counter += size;
     auto& dynamic_column = block.get_columns().back()->assume_mutable_ref();
     auto& column_object = assert_cast<vectorized::ColumnObject&>(dynamic_column);
+    bool filter_this_line = false;
     auto finalize_column = [&]() -> Status {
+        // Revise column object
+        if (filter_this_line) {
+            _counter->num_rows_filtered++;
+            column_object.revise_to(_cur_parsed_variant_rows);
+        }
         size_t batch_size = std::max(_state->batch_size(), (int)_MIN_BATCH_SIZE);
         if (column_object.size() >= batch_size || _reader_eof) {
             column_object.finalize();
@@ -478,6 +484,7 @@ Status NewJsonReader::_parse_dynamic_json(bool* is_empty_row, bool* eof, Block&
             // fill default values missing in static columns
             RETURN_IF_ERROR(schema_util::unfold_object(block.columns() - 1, block,
                                                        true /*cast to original column type*/));
+            _cur_parsed_variant_rows = 0;
         }
         return Status::OK();
     };
@@ -488,10 +495,23 @@ Status NewJsonReader::_parse_dynamic_json(bool* is_empty_row, bool* eof, Block&
         return Status::OK();
     }
 
-    RETURN_IF_CATCH_EXCEPTION(doris::vectorized::parse_json_to_variant(
-            column_object, StringRef {json_str, size}, _json_parser.get()));
+    try {
+        doris::vectorized::parse_json_to_variant(column_object, StringRef {json_str, size},
+                                                 _json_parser.get());
+        ++_cur_parsed_variant_rows;
+    } catch (const doris::Exception& e) {
+        if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {
+            return Status::MemoryLimitExceeded(fmt::format(
+                    "PreCatch error code:{}, {}, __FILE__:{}, __LINE__:{}, __FUNCTION__:{}",
+                    e.code(), e.to_string(), __FILE__, __LINE__, __PRETTY_FUNCTION__));
+        } else {
+            filter_this_line = true;
+        }
+    }
+
     // TODO correctly handle data quality error
     RETURN_IF_ERROR(finalize_column());
+
     return Status::OK();
 }
 
diff --git a/be/src/vec/exec/format/json/new_json_reader.h b/be/src/vec/exec/format/json/new_json_reader.h
index 039b3db380..720da50397 100644
--- a/be/src/vec/exec/format/json/new_json_reader.h
+++ b/be/src/vec/exec/format/json/new_json_reader.h
@@ -265,6 +265,7 @@ private:
     std::unique_ptr<simdjson::ondemand::parser> _ondemand_json_parser = nullptr;
     // column to default value string map
     std::unordered_map<std::string, std::string> _col_default_value_map;
+    int32_t _cur_parsed_variant_rows = 0;
 };
 
 } // namespace vectorized
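
On the reader side, the recovery pattern for a bad line is: remember how many rows have parsed cleanly so far, and when parse_json_to_variant throws for a line, truncate every subcolumn back to that count (revise_to) and count the line as filtered. A toy sketch of the truncation, assuming a variant column that is just a map of subcolumn vectors rather than the real ColumnObject:

    #include <cstddef>
    #include <map>
    #include <string>
    #include <vector>

    // Toy variant column: a failing line may have partially appended to some
    // subcolumns; revise_to() rolls them all back to the last good row count.
    struct ToyVariantColumn {
        std::map<std::string, std::vector<std::string>> subcolumns;
        size_t num_rows = 0;

        void revise_to(size_t target_num_rows) {
            for (auto& [path, data] : subcolumns) {
                if (data.size() > target_num_rows) {
                    data.resize(target_num_rows);  // drop the partial row
                }
            }
            num_rows = target_num_rows;
        }
    };

A MEM_ALLOC_FAILED exception is still fatal to the load; every other parse failure now only filters the offending line.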
diff --git a/fe/fe-core/src/main/java/org/apache/doris/catalog/TableProperty.java b/fe/fe-core/src/main/java/org/apache/doris/catalog/TableProperty.java
index d4d1b529c0..e094355a5a 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/catalog/TableProperty.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/catalog/TableProperty.java
@@ -190,6 +190,13 @@ public class TableProperty implements Writable {
     public TableProperty buildStoreRowColumn() {
         storeRowColumn = Boolean.parseBoolean(
                 properties.getOrDefault(PropertyAnalyzer.PROPERTIES_STORE_ROW_COLUMN, "false"));
+        // Remove deprecated prefix and try again
+        String deprecatedPrefix = "deprecated_";
+        if (!storeRowColumn && PropertyAnalyzer.PROPERTIES_STORE_ROW_COLUMN.startsWith(deprecatedPrefix)) {
+            storeRowColumn = Boolean.parseBoolean(
+                properties.getOrDefault(
+                    PropertyAnalyzer.PROPERTIES_STORE_ROW_COLUMN.substring(deprecatedPrefix.length()), "false"));
+        }
         return this;
     }
 
diff --git a/fe/fe-core/src/main/java/org/apache/doris/common/FeNameFormat.java b/fe/fe-core/src/main/java/org/apache/doris/common/FeNameFormat.java
index 483855d8d6..94140e1cae 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/common/FeNameFormat.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/common/FeNameFormat.java
@@ -31,7 +31,7 @@ public class FeNameFormat {
     private static final String LABEL_REGEX = "^[-_A-Za-z0-9:]{1,128}$";
     private static final String COMMON_NAME_REGEX = "^[a-zA-Z][a-zA-Z0-9_]{0,63}$";
     private static final String TABLE_NAME_REGEX = "^[a-zA-Z][a-zA-Z0-9_]*$";
-    private static final String COLUMN_NAME_REGEX = "^[_a-zA-Z@0-9][.a-zA-Z0-9_+-/><?@#$%^&*]{0,255}$";
+    private static final String COLUMN_NAME_REGEX = "^[_a-zA-Z@0-9\\s<>/][.a-zA-Z0-9_+-/><?@#$%^&*\"\\s,:]{0,255}$";
 
     private static final String UNICODE_LABEL_REGEX = "^[-_A-Za-z0-9:\\p{L}]{1,128}$";
     private static final String UNICODE_COMMON_NAME_REGEX = "^[a-zA-Z\\p{L}][a-zA-Z0-9_\\p{L}]{0,63}$";
diff --git a/regression-test/data/dynamic_table_p0/array_dimenssion.json b/regression-test/data/dynamic_table_p0/array_dimenssion.json
new file mode 100644
index 0000000000..b0ad29983f
--- /dev/null
+++ b/regression-test/data/dynamic_table_p0/array_dimenssion.json
@@ -0,0 +1,5 @@
+{"a" : [[1]]}
+{"a" : [1]}
+{"a" : [1]}
+{"a" : [2]}
+{"a" : 10}
diff --git a/regression-test/data/dynamic_table_p0/invalid_name.json b/regression-test/data/dynamic_table_p0/invalid_name.json
new file mode 100644
index 0000000000..7a215e96a1
--- /dev/null
+++ b/regression-test/data/dynamic_table_p0/invalid_name.json
@@ -0,0 +1,2 @@
+{"a": {"\\\\\\\xxxx" : 1024}}
+{"a": {"b": {"c" : 123}}}
diff --git a/regression-test/data/dynamic_table_p0/nested_filter.json b/regression-test/data/dynamic_table_p0/nested_filter.json
new file mode 100644
index 0000000000..ef7b56abe8
--- /dev/null
+++ b/regression-test/data/dynamic_table_p0/nested_filter.json
@@ -0,0 +1,2 @@
+{"a": {"b" : 1024}}
+{"a": {"b": {"c" : 123}}}
diff --git a/regression-test/suites/dynamic_table_p0/load.groovy b/regression-test/suites/dynamic_table_p0/load.groovy
index 766642060b..e4169b262c 100644
--- a/regression-test/suites/dynamic_table_p0/load.groovy
+++ b/regression-test/suites/dynamic_table_p0/load.groovy
@@ -27,6 +27,8 @@ suite("regression_test_dynamic_table", "dynamic_table"){
             set 'read_json_by_line', read_flag
             set 'format', format_flag
             set 'read_json_by_line', read_flag
+            set 'read_json_by_line', read_flag
+            set 'max_filter_ratio', '1'
             if (rand_id) {
                 set 'columns', 'id= rand() * 100000'
             }
@@ -47,7 +49,7 @@ suite("regression_test_dynamic_table", "dynamic_table"){
                     assertEquals("fail", json.Status.toLowerCase())
                 } else {
                     assertEquals("success", json.Status.toLowerCase())
-                    assertEquals(json.NumberTotalRows, json.NumberLoadedRows + json.NumberUnselectedRows)
+                    // assertEquals(json.NumberTotalRows, json.NumberLoadedRows + json.NumberUnselectedRows + json.NumberFilteredRows)
                     assertTrue(json.NumberLoadedRows > 0 && json.LoadBytes > 0)
                 }
             }
@@ -113,15 +115,16 @@ suite("regression_test_dynamic_table", "dynamic_table"){
         load_json_data.call(table_name, 'true', 'json', 'true', src_json, 'true')
         sleep(1000)
     }
-    json_load("btc_transactions.json", "test_btc_json")
+    // TODO: MultiDimension Array is not supported now
+    // json_load("btc_transactions.json", "test_btc_json")
     json_load("ghdata_sample.json", "test_ghdata_json")
-    json_load("nbagames_sample.json", "test_nbagames_json")
+    // json_load("nbagames_sample.json", "test_nbagames_json")
     json_load_nested("es_nested.json", "test_es_nested_json")
-    json_load_unique("btc_transactions.json", "test_btc_json")
+    // json_load_unique("btc_transactions.json", "test_btc_json")
     json_load_unique("ghdata_sample.json", "test_ghdata_json")
-    json_load_unique("nbagames_sample.json", "test_nbagames_json")
+    // json_load_unique("nbagames_sample.json", "test_nbagames_json")
     sql """insert into test_ghdata_json_unique select * from test_ghdata_json"""
-    sql """insert into test_btc_json_unique select * from test_btc_json"""
+    // sql """insert into test_btc_json_unique select * from test_btc_json"""
 
     // abnormal cases
     table_name = "abnormal_cases" 
@@ -137,12 +140,14 @@ suite("regression_test_dynamic_table", "dynamic_table"){
             DISTRIBUTED BY HASH(`qid`) BUCKETS 5 
             properties("replication_num" = "1", "deprecated_dynamic_schema" = "true");
     """
-    load_json_data.call(table_name, 'true', 'json', 'true', "invalid_dimension.json", 'false')
-    load_json_data.call(table_name, 'true', 'json', 'true', "invalid_format.json", 'false')
+    load_json_data.call(table_name, 'true', 'json', 'true', "invalid_dimension.json", 'true')
+    load_json_data.call(table_name, 'true', 'json', 'true', "invalid_format.json", 'true')
     load_json_data.call(table_name, 'true', 'json', 'true', "floating_point.json", 'true')
     load_json_data.call(table_name, 'true', 'json', 'true', "floating_point2.json", 'true')
     load_json_data.call(table_name, 'true', 'json', 'true', "floating_point3.json", 'true')
     load_json_data.call(table_name, 'true', 'json', 'true', "uppercase.json", 'true')
+    load_json_data.call(table_name, 'true', 'json', 'true', "nested_filter.json", 'true')
+    load_json_data.call(table_name, 'true', 'json', 'true', "array_dimenssion.json", 'false')
 
     // load more
     table_name = "gharchive";
diff --git a/regression-test/suites/dynamic_table_p0/sql/q01.sql b/regression-test/suites/dynamic_table_p0/sql/q01.sql
index 14f730426e..3c5e3ac3ca 100644
--- a/regression-test/suites/dynamic_table_p0/sql/q01.sql
+++ b/regression-test/suites/dynamic_table_p0/sql/q01.sql
@@ -1,7 +1,7 @@
-SELECT count() FROM test_btc_json;
-SELECT avg(fee) FROM test_btc_json;
-SELECT avg(size(`inputs.prev_out.spending_outpoints.n`))  FROM test_btc_json;
-SELECT avg(size(`inputs.prev_out.spending_outpoints.tx_index`))  FROM test_btc_json;
-select `inputs.prev_out.spending_outpoints.tx_index`, fee from test_btc_json order by fee,hash;
-select `out.tx_index`[-1] from test_btc_json order by  hash,`out.tx_index`[-1];
-select `out.tx_index`, fee, `out.value`[1] from test_btc_json where array_contains(`out.value`, 2450939412);
\ No newline at end of file
+-- SELECT count() FROM test_btc_json;
+-- SELECT avg(fee) FROM test_btc_json;
+-- SELECT avg(size(`inputs.prev_out.spending_outpoints.n`))  FROM test_btc_json;
+-- SELECT avg(size(`inputs.prev_out.spending_outpoints.tx_index`))  FROM test_btc_json;
+-- select `inputs.prev_out.spending_outpoints.tx_index`, fee from test_btc_json order by fee,hash;
+-- select `out.tx_index`[-1] from test_btc_json order by  hash,`out.tx_index`[-1];
+-- select `out.tx_index`, fee, `out.value`[1] from test_btc_json where array_contains(`out.value`, 2450939412);
\ No newline at end of file
diff --git a/regression-test/suites/dynamic_table_p0/sql/q02.sql b/regression-test/suites/dynamic_table_p0/sql/q02.sql
index 2c7c9c6294..f0b0dc0c46 100644
--- a/regression-test/suites/dynamic_table_p0/sql/q02.sql
+++ b/regression-test/suites/dynamic_table_p0/sql/q02.sql
@@ -1,9 +1,10 @@
-select count() from test_nbagames_json;
-select max(`teams.results.orb`[1]) from test_nbagames_json;
-select sum(cast(element_at(`teams.results.ft_pct`, 1) as double)) from test_nbagames_json;
-select sum(cast(`teams.results.ft_pct`[1] as double)) from test_nbagames_json where size(`teams.results.ft_pct`) = 2;
-select sum(cast(`teams.results.fg3a`[1] as int)) from test_nbagames_json;
-select sum(cast(`teams.results.fg3a`[2] as int)) from test_nbagames_json;
-select `teams.results.ft_pct` from test_nbagames_json where `teams.results.ft_pct`[1] = ".805";
-select sum(`teams.home`[1]) from test_nbagames_json;
-select `teams.results.ft_pct` from test_nbagames_json where `teams.results.ft_pct`[1] = ".805" order by `teams.results.ft_pct`[2];
+-- select count() from test_nbagames_json;
+-- select max(`teams.results.orb`[1]) from test_nbagames_json;
+-- select sum(cast(element_at(`teams.results.ft_pct`, 1) as double)) from test_nbagames_json;
+-- select sum(cast(`teams.results.ft_pct`[1] as double)) from test_nbagames_json where size(`teams.results.ft_pct`) = 2;
+-- select sum(cast(`teams.results.fg3a`[1] as int)) from test_nbagames_json;
+-- select sum(cast(`teams.results.fg3a`[2] as int)) from test_nbagames_json;
+-- select `teams.results.ft_pct` from test_nbagames_json where `teams.results.ft_pct`[1] = ".805";
+-- select sum(`teams.home`[1]) from test_nbagames_json;
+-- select `teams.results.ft_pct` from test_nbagames_json where `teams.results.ft_pct`[1] = ".805" order by `teams.results.ft_pct`[2];
+-- 
\ No newline at end of file
diff --git a/regression-test/suites/dynamic_table_p0/test_dytable_complex_data.groovy b/regression-test/suites/dynamic_table_p0/test_dytable_complex_data.groovy
deleted file mode 100644
index 36fa410b8f..0000000000
--- a/regression-test/suites/dynamic_table_p0/test_dytable_complex_data.groovy
+++ /dev/null
@@ -1,291 +0,0 @@
-// Licensed to the Apache Software Foundation (ASF) under one
-// or more contributor license agreements.  See the NOTICE file
-// distributed with this work for additional information
-// regarding copyright ownership.  The ASF licenses this file
-// to you under the Apache License, Version 2.0 (the
-// "License"); you may not use this file except in compliance
-// with the License.  You may obtain a copy of the License at
-//
-//   http://www.apache.org/licenses/LICENSE-2.0
-//
-// Unless required by applicable law or agreed to in writing,
-// software distributed under the License is distributed on an
-// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
-// KIND, either express or implied.  See the License for the
-// specific language governing permissions and limitations
-// under the License.
-
-suite("test_dynamic_table", "dynamic_table"){
-    // prepare test table
-    def load_json_data = {table_name, vec_flag, format_flag, read_flag, file_name, expect_success ->
-        // load the json data
-        streamLoad {
-            table "${table_name}"
-            // set http request header params
-            set 'read_json_by_line', read_flag
-            set 'format', format_flag
-            set 'read_json_by_line', read_flag
-            file file_name // import json file
-            time 10000 // limit inflight 10s
-
-            // if declared a check callback, the default check condition will ignore.
-            // So you must check all condition
-            check { result, exception, startTime, endTime ->
-                if (exception != null) {
-                        throw exception
-                }
-                logger.info("Stream load ${file_name} result: ${result}".toString())
-                def json = parseJson(result)
-                if (expect_success == "false") {
-                    assertEquals("fail", json.Status.toLowerCase())
-                } else {
-                    assertEquals("success", json.Status.toLowerCase())
-                    assertEquals(json.NumberTotalRows, json.NumberLoadedRows + json.NumberUnselectedRows)
-                    assertTrue(json.NumberLoadedRows > 0 && json.LoadBytes > 0)
-                }
-            }
-        }
-    }
-
-    def real_res = "true"
-    def res = "null"
-    def wait_for_alter_finish = "null"
-    def check_time = 30
-    def colume_set = ""
-    def key = ""
-    def test_create_and_load_dynamic_table = { table_name, data_model, replica_num, columes, columes_type, src_json, expect_success ->
-        //create table
-        sql "DROP TABLE IF EXISTS ${table_name}"
-        colume_set = ""
-        key = ""
-        for (def col=0; col<columes.size(); col++){
-            if(columes[col].contains(".")){
-                colume_set += "`${columes[col]}` ${columes_type[col]}, "
-                key += "`${columes[col]}`"
-            }else{
-                colume_set += "${columes[col]} ${columes_type[col]}, "
-                key += "${columes[col]}"
-            }
-            if(col < columes.size() - 1){
-                key += ", "
-            }
-        }
-
-        try {
-            sql """
-                CREATE TABLE IF NOT EXISTS ${table_name} (
-                    ${colume_set}
-                )
-                ${data_model} KEY(${key})
-                DISTRIBUTED BY HASH(`${columes[0]}`) BUCKETS 10
-                properties("replication_num" = "${replica_num}", "deprecated_dynamic_schema" = "true");
-            """
-        }catch(Exception ex) {
-            logger.info("create ${table_name} fail, catch exception: ${ex}".toString())
-            real_res = "false"
-        }finally{
-            assertEquals(expect_success, real_res)
-            if(expect_success == "false"){
-                logger.info("${table_name} expect fail")
-                return
-            }
-        }
-
-        //stream load src_json
-        load_json_data.call(table_name, 'true', 'json', 'true', src_json, 'true')
-        sleep(1000)
-        //def select_res = sql "select * from ${table_name} order by `${columes[0]}`"
-        def select_res = sql "select `${columes[0]}` from ${table_name} order by `${columes[0]}`"
-        logger.info("after stream load ${table_name}, select result: ${select_res}".toString())
-
-        //check data in table and check table schema
-        def select_res_now = "true"
-        for(i = 0; i < 5; i++){
-            //select_res_now = sql "select * from ${table_name} order by `${columes[0]}`"
-            select_res_now = sql "select `${columes[0]}` from ${table_name} order by `${columes[0]}`"
-            //logger.info("after alter schema, it's ${i} time select,  select result: ${select_res}".toString())
-            assertEquals(select_res, select_res_now)
-            sleep(3000)
-        }
-    }
-
-    def timeout = 180000
-    def delta_time = 10000
-    def alter_res = "null"
-    def useTime = 0
-    def wait_for_latest_op_on_table_finish = { table_name, OpTimeout ->
-        for(int t = delta_time; t <= OpTimeout; t += delta_time){
-            alter_res = sql """SHOW ALTER TABLE COLUMN WHERE TableName = "${table_name}" ORDER BY CreateTime DESC LIMIT 1;"""
-            alter_res = alter_res.toString()
-            if(alter_res.contains("FINISHED")) {
-                sleep(3000) // wait change table state to normal
-                logger.info(table_name + " latest alter job finished, detail: " + alter_res)
-                break
-            }
-            useTime = t
-            sleep(delta_time)
-        }
-        assertTrue(useTime <= OpTimeout)
-    }
-
-    def wait_for_build_index_on_partition_finish = { table_name, OpTimeout ->
-        for(int t = delta_time; t <= OpTimeout; t += delta_time){
-            alter_res = sql """SHOW BUILD INDEX WHERE TableName = "${table_name}";"""
-            expected_finished_num = alter_res.size();
-            finished_num = 0;
-            for (int i = 0; i < expected_finished_num; i++) {
-                logger.info(table_name + " build index job state: " + alter_res[i][7] + i)
-                if (alter_res[i][7] == "FINISHED") {
-                    ++finished_num;
-                }
-            }
-            if (finished_num == expected_finished_num) {
-                logger.info(table_name + " all build index jobs finished, detail: " + alter_res)
-                break
-            } else {
-                finished_num = 0;
-            }
-            useTime = t
-            sleep(delta_time)
-        }
-        assertTrue(useTime <= OpTimeout, "wait_for_latest_build_index_on_partition_finish timeout")
-    }
-
-    def index_res = ""
-    def create_index = { table_name, colume_name, index_name, index_type, expect_success ->
-        // create index
-        try{
-            real_res = "success"
-            if(index_type == "null"){
-                index_res = sql """
-                create index ${index_name} on ${table_name}(`${colume_name}`) using inverted
-                """
-            }else{
-                index_res = sql """
-                create index ${index_name} on ${table_name}(`${colume_name}`) using inverted PROPERTIES("parser"="${index_type}")
-                """
-            }
-            logger.info("create index res: ${index_res} \n".toString())
-            wait_for_latest_op_on_table_finish(table_name, timeout)
-
-            index_res = sql """ build index ${index_name} on ${table_name} """
-            logger.info("build index res: ${index_res} \n".toString())
-            wait_for_build_index_on_partition_finish(table_name, timeout)
-
-        }catch(Exception ex){
-            logger.info("create create index ${index_name} on ${table_name}(`${colume_name}`) using inverted(${index_type}) fail, catch exception: ${ex} \n".toString())
-            real_res = "false"
-        }finally{
-            assertEquals(expect_success, real_res)
-            if(expect_success == "false"){
-                logger.info("${table_name} expect fail")
-                return
-            }
-        }
-    }
-
-    def drop_index = { table_name, colume_name, index_name, expect_success ->
-        // create index
-        try{
-            sql """
-                drop index ${index_name} on ${table_name};
-            """
-            wait_for_latest_op_on_table_finish(table_name, timeout)
-
-        }catch(Exception ex){
-            logger.info("drop index ${index_name} on ${table_name}, catch exception: ${ex} \n".toString())
-            real_res = "false"
-        }finally{
-            assertEquals(expect_success, real_res)
-            if(expect_success == "false"){
-                logger.info("${table_name} expect fail")
-                return
-            }
-        }
-    }
-
-    //start test
-    String[] data_models = ["DUPLICATE"]
-    int[] replica_num = [1]
-    def expect_success = "true"
-    def feishu_fix_columes = ["id", "content.post.zh_cn.title", "msg_type"]
-    def feishu_fix_columes_type = ["BIGINT", "VARCHAR(100)", "CHAR(50)"]
-    def github_fix_columes = ["repo.id"]
-    def github_fix_columes_type = ["BIGINT"]
-    def table_name = ["feishu", "github"]
-    ArrayList<String> table_names = new ArrayList<>()
-    //step1: create table
-    for (def j=0; j < data_models.size(); j++){
-        if( data_models[j] == "AGGREGATE" ){
-            expect_success = "false"
-        }
-
-        for(def k=0; k < replica_num.size(); k++){
-            // expect create table
-            for(def t=0; t< table_name.size(); t++){
-                if(table_name[t] == "feishu"){
-                    table_names.add("dytable_complex_feishu_${replica_num[k]}_${data_models[j]}")
-                    test_create_and_load_dynamic_table("dytable_complex_feishu_${replica_num[k]}_${data_models[j]}", data_models[j], replica_num[k], feishu_fix_columes, feishu_fix_columes_type, "dynamic_feishu_alarm.json", expect_success)
-                } else if(table_name[t] == "github"){
-                    table_names.add("dytable_complex_github_${replica_num[k]}_${data_models[j]}")
-                    test_create_and_load_dynamic_table("dytable_complex_github_${replica_num[k]}_${data_models[j]}", data_models[j], replica_num[k], github_fix_columes, github_fix_columes_type, "dynamic_github_events.json", expect_success)
-                }
-            }
-        }
-    }
-    // expect create table false
-    test_create_and_load_dynamic_table("test_dytable_complex_data_feishu_array_key_colume", "DUPLICATE", 3, ["content.post.zh_cn.content.tag"], ["ARRAY<ARRAY<text>>"], "dynamic_feishu_alarm.json", "false")
-    logger.info("recording tables: ${table_names}".toString())
-
-
-    def test_round = 3
-    for(def tb=0; tb < table_names.size(); tb++){
-        for(def round = 0; round < test_round; round++){
-            if((round % test_round) == 1){
-                if(table_names[tb].contains("feishu")) {
-                    create_index("${table_names[tb]}", "content.post.zh_cn.title", "title_idx", "english", "success")
-                    create_index("${table_names[tb]}", "msg_type", "msg_idx", "null", "success")
-                    //select index colume mix select
-                    qt_sql """ select * from ${table_names[tb]} where msg_type="post" or `content.post.zh_cn.title` match_all "BUILD_FINISHED" order by `content.post.zh_cn.title`, id limit 30; """
-                    qt_sql """ select * from ${table_names[tb]} where msg_type in ("post", "get") and `content.post.zh_cn.title` match_any "BUILD_FINISHED" order by `content.post.zh_cn.title`, id limit 30; """
-                    qt_sql """ select `content.post.zh_cn.content.herf` from ${table_names[tb]} where msg_type in ("post") and `content.post.zh_cn.title` match_any "FINISHED" order by `content.post.zh_cn.title`,id limit 30; """
-                    qt_sql """ select count() from ${table_names[tb]} where msg_type="post" and `content.post.zh_cn.title` != "BUILD_FINISHED" group by `content.post.zh_cn.title`; """
-                    // qt_sql """ select `content.post.zh_cn.title`,`content.post.zh_cn.content.herf` from dytable_complex_feishu_3_DUPLICATE where msg_type="post" group by `content.post.zh_cn.content.herf`, `content.post.zh_cn.title` order by `content.post.zh_cn.title`;"""
-                }else if(table_names[tb].contains("github")) {
-                    create_index("${table_names[tb]}", "actor.id", "actorid_idx", "null", "success")
-                    create_index("${table_names[tb]}", "payload.pull_request.title", "title_idx", "english", "success")
-                    // index colume select
-                    //qt_sql """ select * from ${table_names[tb]} where `actor.id`=93110249 or `payload.pull_request.title`="" order by `actor.id` limit 100; """
-                    qt_sql """select `repo.name` from ${table_names[tb]} where `actor.id`=93110249 or `payload.pull_request.title`="" order by `actor.id`; """
-                    // index colume and  simple colume mix select
-                    //qt_sql """ select * from ${table_names[tb]} where `actor.id`!= 93110249 order by `actor.id` limit 100;"""
-                    qt_sql """select `repo.name`, type from ${table_names[tb]} where `actor.id`!= 93110249 order by `actor.id`, `repo.name` limit 100;"""
-                    qt_sql """ select * from ${table_names[tb]} where `actor.id`!= 93110249 order by `actor.id`, `repo.name`, type limit 10;"""
-                    // index colume and common array colume mix select
-                    qt_sql """ select `repo.name`, count() from ${table_names[tb]} where `payload.pull_request.title`="" and `repo.id`=444318240 GROUP BY `repo.name` order by `repo.name`;"""
-                    qt_sql """ select `repo.name`, count() from ${table_names[tb]} where `actor.id` != 93110249 GROUP BY `repo.name` order by `repo.name`;"""
-
-            }else if((round % test_round) == 2){
-                if(table_names[tb].contains("feishu")) {
-                    drop_index("${table_names[tb]}", "content.post.zh_cn.title", "title_idx", "success")
-                    drop_index("${table_names[tb]}", "msg_type", "msg_idx", "success")
-                }else if(table_names[tb].contains("github")) {
-                    drop_index("${table_names[tb]}", "actor.id", "actorid_idx", "success")
-                    drop_index("${table_names[tb]}", "payload.pull_request.title", "title_idx", "success")
-                }
-            }else{
-                if(table_names[tb].contains("feishu")) {
-                    qt_sql """ select count() from ${table_names[tb]} WHERE msg_type="post" group by msg_type; """
-                    qt_sql """ select * from ${table_names[tb]} WHERE msg_type="post" and content.post.zh_cn.title="BUILD_FINISHED" order by `content.post.zh_cn.title` limit 50;; """
-                    qt_sql """ select count() from ${table_names[tb]} where msg_type="post" and `content.post.zh_cn.title` != "BUILD_FINISHED" group by `content.post.zh_cn.title`; """
-                    // qt_sql """ select `content.post.zh_cn.content.herf` from where msg_type="post" group by `content.post.zh_cn.content.herf` order by `content.post.zh_cn.title`;"""
-                }else if(table_names[tb].contains("github")) {
-                    qt_sql """ SELECT count() FROM ${table_names[tb]}  WHERE type = 'WatchEvent' GROUP BY payload.action;"""
-                    qt_sql """ SELECT `repo.name`, count() AS stars FROM ${table_names[tb]} WHERE type = 'WatchEvent' AND year(created_at) = '2015' GROUP BY repo.name ORDER BY `repo.name`, `actor.id` DESC LIMIT 50;"""
-                    qt_sql """ SELECT element_at(`payload.commits.author.email`, 1) from ${table_names[tb]} WHERE size(`payload.commits.author.email`) > 0 and size(`payload.commits.author.email`) <= 3 order by `actor.id`; """
-                    }
-                }
-            }
-        }
-    }
-}
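
The deleted helpers above all follow one pattern: run SHOW ALTER TABLE COLUMN or SHOW BUILD INDEX, then poll until every job reports FINISHED or a timeout expires. A minimal Java sketch of that polling loop, assuming a jobState supplier that wraps the SHOW statement; the names are illustrative, not a Doris API:

    import java.util.function.Supplier;

    final class AlterJobPoller {
        // Poll the job state until FINISHED or until timeoutMs has elapsed.
        static boolean waitUntilFinished(Supplier<String> jobState, long timeoutMs,
                                         long pollIntervalMs) throws InterruptedException {
            for (long waited = 0; waited <= timeoutMs; waited += pollIntervalMs) {
                if ("FINISHED".equals(jobState.get())) {
                    Thread.sleep(3000); // as in the Groovy helper: let the table state return to normal
                    return true;
                }
                Thread.sleep(pollIntervalMs);
            }
            return false; // caller asserts on this, mirroring assertTrue(useTime <= OpTimeout)
        }
    }

With the real SQL wired into the supplier, a test would assert on the returned boolean just as the Groovy helpers assert on the elapsed time.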




[doris] 07/09: [fix](inverted index) fix transaction id not being unique for an index change job during light index change (#21180)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 44280038bd0181e66654d6f44fba89bb465b0ba9
Author: YueW <45...@users.noreply.github.com>
AuthorDate: Mon Jun 26 19:54:05 2023 +0800

    [fix](inverted index) fix transaction id not being unique for an index change job during light index change (#21180)
---
 fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java b/fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java
index d1293484cb..fecfe4b99b 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/alter/IndexChangeJob.java
@@ -124,6 +124,8 @@ public class IndexChangeJob implements Writable {
 
         this.createTimeMs = System.currentTimeMillis();
         this.jobState = JobState.WAITING_TXN;
+        this.watershedTxnId = Env.getCurrentGlobalTransactionMgr()
+                        .getTransactionIDGenerator().getNextTransactionId();
     }
 
     public long getJobId() {
@@ -243,8 +245,6 @@ public class IndexChangeJob implements Writable {
 
     protected void runWaitingTxnJob() throws AlterCancelException {
         Preconditions.checkState(jobState == JobState.WAITING_TXN, jobState);
-        this.watershedTxnId = Env.getCurrentGlobalTransactionMgr()
-                        .getTransactionIDGenerator().getNextTransactionId();
         try {
             if (!isPreviousLoadFinished()) {
                 LOG.info("wait transactions before {} to be finished, inverted index job: {}", watershedTxnId, jobId);
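
The essence of the fix shown above: a job's watershed transaction id must be drawn exactly once. runWaitingTxnJob can execute more than once for the same job, so drawing a fresh id on every run kept moving the watershed and made it non-unique. A minimal sketch of the corrected shape, assuming a plain AtomicLong in place of Doris's GlobalTransactionMgr id generator; class and method names here are illustrative, not the real API:

    import java.util.concurrent.atomic.AtomicLong;

    class IndexChangeJobSketch {
        private static final AtomicLong TXN_ID_GEN = new AtomicLong(); // stand-in generator

        private final long watershedTxnId;

        IndexChangeJobSketch() {
            // Drawn exactly once, at job creation, as in the patched constructor.
            this.watershedTxnId = TXN_ID_GEN.incrementAndGet();
        }

        void runWaitingTxnJob(long latestFinishedTxnId) {
            // Every run now compares against the same watershed instead of a new id.
            if (latestFinishedTxnId < watershedTxnId) {
                System.out.println("wait transactions before " + watershedTxnId + " to finish");
            } else {
                System.out.println("previous loads finished, index change can proceed");
            }
        }
    }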




[doris] 03/09: [fix](nereids) change PushdownFilterThroughProject post processor from bottom-up to top-down rewrite (#21125)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 266750f7accda38d5a0bd15468126d8b8a2992a7
Author: starocean999 <40...@users.noreply.github.com>
AuthorDate: Mon Jun 26 15:34:41 2023 +0800

    [fix](nereids) change PushdownFilterThroughProject post processor from bottom-up to top-down rewrite (#21125)
    
    1. pass physicalProperties in the withChildren function
    2. use top-down traversal in the PushdownFilterThroughProject post processor (see the sketch just below)
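
    To see why the traversal order matters, consider a filter sitting on two stacked projects. A bottom-up visitor rewrites the child first and can sink the filter past only one project; recursing on the pushed-down filter (top-down) sinks it through the whole stack in one pass. A minimal sketch with toy plan nodes; the records below are illustrative stand-ins for the Nereids classes, and the alias remapping done via getAliasToProducer is omitted:

        import java.util.List;

        sealed interface Plan permits Scan, Project, Filter {}
        record Scan(String table) implements Plan {}
        record Project(List<String> exprs, Plan child) implements Plan {}
        record Filter(String conjunct, Plan child) implements Plan {}

        final class PushdownSketch {
            static Plan rewrite(Plan plan) {
                if (plan instanceof Filter f && f.child() instanceof Project p) {
                    // Push the filter below the project, then keep rewriting the
                    // moved filter itself so it can pass further projects too.
                    return new Project(p.exprs(), rewrite(new Filter(f.conjunct(), p.child())));
                }
                if (plan instanceof Filter f) {
                    return new Filter(f.conjunct(), rewrite(f.child()));
                }
                if (plan instanceof Project p) {
                    return new Project(p.exprs(), rewrite(p.child()));
                }
                return plan;
            }

            public static void main(String[] args) {
                Plan plan = new Filter("y = 0",
                        new Project(List.of("x AS y"),
                                new Project(List.of("a AS x"), new Scan("t1"))));
                // Prints a plan of shape Project -> Project -> Filter -> Scan,
                // matching the transformation drawn in the new test below.
                System.out.println(rewrite(plan));
            }
        }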
---
 .../post/PushdownFilterThroughProject.java         |  20 ++--
 .../plans/physical/PhysicalAssertNumRows.java      |   3 +-
 .../trees/plans/physical/PhysicalCTEAnchor.java    |   3 +-
 .../trees/plans/physical/PhysicalFilter.java       |   8 +-
 .../trees/plans/physical/PhysicalGenerate.java     |   4 +-
 .../trees/plans/physical/PhysicalLimit.java        |   3 +-
 .../plans/physical/PhysicalOlapTableSink.java      |   5 +-
 .../plans/physical/PhysicalPartitionTopN.java      |   5 +-
 .../trees/plans/physical/PhysicalProject.java      |   8 +-
 .../trees/plans/physical/PhysicalRepeat.java       |   3 +-
 .../nereids/trees/plans/physical/PhysicalTopN.java |   3 +-
 .../trees/plans/physical/PhysicalWindow.java       |   4 +-
 .../PushdownFilterThroughProjectTest.java          | 112 +++++++++++++++++++++
 13 files changed, 155 insertions(+), 26 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/PushdownFilterThroughProject.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/PushdownFilterThroughProject.java
index 350b053f0e..3eba798acb 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/PushdownFilterThroughProject.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/processor/post/PushdownFilterThroughProject.java
@@ -30,17 +30,15 @@ public class PushdownFilterThroughProject extends PlanPostProcessor {
     @Override
     public Plan visitPhysicalFilter(PhysicalFilter<? extends Plan> filter, CascadesContext context) {
         Plan child = filter.child();
-        Plan newChild = child.accept(this, context);
-        if (!(newChild instanceof PhysicalProject)) {
-            return filter;
+        if (!(child instanceof PhysicalProject)) {
+            return filter.withChildren(child.accept(this, context));
         }
-        PhysicalProject<? extends Plan> project = (PhysicalProject<? extends Plan>) newChild;
-        return project.withChildren(
-                new PhysicalFilter<>(
-                        ExpressionUtils.replace(filter.getConjuncts(), project.getAliasToProducer()),
-                        filter.getLogicalProperties(),
-                        project.child()
-                )
-        );
+
+        PhysicalProject<? extends Plan> project = (PhysicalProject<? extends Plan>) child;
+        PhysicalFilter<? extends Plan> newFilter = filter.withConjunctsAndChild(
+                ExpressionUtils.replace(filter.getConjuncts(), project.getAliasToProducer()),
+                project.child());
+
+        return project.withChildren(newFilter.accept(this, context));
     }
 }
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalAssertNumRows.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalAssertNumRows.java
index 0869c75d7d..de3ded2977 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalAssertNumRows.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalAssertNumRows.java
@@ -105,7 +105,8 @@ public class PhysicalAssertNumRows<CHILD_TYPE extends Plan> extends PhysicalUnar
     @Override
     public PhysicalAssertNumRows<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalAssertNumRows<>(assertNumRowsElement, getLogicalProperties(), children.get(0));
+        return new PhysicalAssertNumRows<>(assertNumRowsElement, groupExpression,
+                getLogicalProperties(), physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalCTEAnchor.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalCTEAnchor.java
index 8ffb9de845..4d5dd8b16e 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalCTEAnchor.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalCTEAnchor.java
@@ -107,7 +107,8 @@ public class PhysicalCTEAnchor<
     @Override
     public PhysicalCTEAnchor<Plan, Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 2);
-        return new PhysicalCTEAnchor<>(cteId, getLogicalProperties(), children.get(0), children.get(1));
+        return new PhysicalCTEAnchor<>(cteId, groupExpression, getLogicalProperties(), physicalProperties,
+            statistics, children.get(0), children.get(1));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalFilter.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalFilter.java
index 866fdda4c6..899a35ff20 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalFilter.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalFilter.java
@@ -104,7 +104,8 @@ public class PhysicalFilter<CHILD_TYPE extends Plan> extends PhysicalUnary<CHILD
     @Override
     public PhysicalFilter<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalFilter<>(conjuncts, getLogicalProperties(), children.get(0));
+        return new PhysicalFilter<>(conjuncts, groupExpression, getLogicalProperties(), physicalProperties,
+                statistics, children.get(0));
     }
 
     @Override
@@ -124,6 +125,11 @@ public class PhysicalFilter<CHILD_TYPE extends Plan> extends PhysicalUnary<CHILD
                 statistics, child());
     }
 
+    public PhysicalFilter<Plan> withConjunctsAndChild(Set<Expression> conjuncts, Plan child) {
+        return new PhysicalFilter<>(conjuncts, groupExpression, getLogicalProperties(), physicalProperties,
+                statistics, child);
+    }
+
     @Override
     public String shapeInfo() {
         StringBuilder builder = new StringBuilder();
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalGenerate.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalGenerate.java
index 862deb96d2..ec9887fb07 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalGenerate.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalGenerate.java
@@ -127,8 +127,8 @@ public class PhysicalGenerate<CHILD_TYPE extends Plan> extends PhysicalUnary<CHI
     @Override
     public PhysicalGenerate<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalGenerate<>(generators, generatorOutput,
-                getLogicalProperties(), children.get(0));
+        return new PhysicalGenerate<>(generators, generatorOutput, groupExpression,
+                getLogicalProperties(), physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalLimit.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalLimit.java
index d98e394d82..8d5fb2a6ee 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalLimit.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalLimit.java
@@ -103,7 +103,8 @@ public class PhysicalLimit<CHILD_TYPE extends Plan> extends PhysicalUnary<CHILD_
     @Override
     public Plan withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalLimit<>(limit, offset, phase, getLogicalProperties(), children.get(0));
+        return new PhysicalLimit<>(limit, offset, phase, groupExpression, getLogicalProperties(),
+                physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalOlapTableSink.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalOlapTableSink.java
index 6b9c39fff1..f2c092515e 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalOlapTableSink.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalOlapTableSink.java
@@ -111,8 +111,9 @@ public class PhysicalOlapTableSink<CHILD_TYPE extends Plan> extends PhysicalUnar
     @Override
     public Plan withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1, "PhysicalOlapTableSink only accepts one child");
-        return new PhysicalOlapTableSink<>(database, targetTable, partitionIds, cols, singleReplicaLoad,
-                getLogicalProperties(), children.get(0));
+        return new PhysicalOlapTableSink<>(database, targetTable, partitionIds, cols,
+                singleReplicaLoad, groupExpression, getLogicalProperties(), physicalProperties,
+                statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalPartitionTopN.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalPartitionTopN.java
index 7666825f4e..2d214e4af4 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalPartitionTopN.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalPartitionTopN.java
@@ -150,8 +150,9 @@ public class PhysicalPartitionTopN<CHILD_TYPE extends Plan> extends PhysicalUnar
     @Override
     public PhysicalPartitionTopN<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalPartitionTopN<>(function, partitionKeys, orderKeys, hasGlobalLimit, partitionLimit,
-            getLogicalProperties(), children.get(0));
+        return new PhysicalPartitionTopN<>(function, partitionKeys, orderKeys, hasGlobalLimit,
+                partitionLimit, groupExpression, getLogicalProperties(), physicalProperties,
+                statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalProject.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalProject.java
index 3362d88ee6..f6e4d4dfdf 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalProject.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalProject.java
@@ -103,7 +103,13 @@ public class PhysicalProject<CHILD_TYPE extends Plan> extends PhysicalUnary<CHIL
     @Override
     public PhysicalProject<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalProject<>(projects, getLogicalProperties(), children.get(0));
+        return new PhysicalProject<Plan>(projects,
+                groupExpression,
+                getLogicalProperties(),
+                physicalProperties,
+                statistics,
+                children.get(0)
+        );
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalRepeat.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalRepeat.java
index 67b525214c..fdbe4beb6c 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalRepeat.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalRepeat.java
@@ -147,7 +147,8 @@ public class PhysicalRepeat<CHILD_TYPE extends Plan> extends PhysicalUnary<CHILD
     @Override
     public PhysicalRepeat<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalRepeat<>(groupingSets, outputExpressions, getLogicalProperties(), children.get(0));
+        return new PhysicalRepeat<>(groupingSets, outputExpressions, groupExpression,
+                getLogicalProperties(), physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalTopN.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalTopN.java
index 4a58f5d9e9..06922cdbd9 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalTopN.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalTopN.java
@@ -110,7 +110,8 @@ public class PhysicalTopN<CHILD_TYPE extends Plan> extends AbstractPhysicalSort<
     @Override
     public PhysicalTopN<Plan> withChildren(List<Plan> children) {
         Preconditions.checkArgument(children.size() == 1);
-        return new PhysicalTopN<>(orderKeys, limit, offset, phase, getLogicalProperties(), children.get(0));
+        return new PhysicalTopN<>(orderKeys, limit, offset, phase, groupExpression,
+                getLogicalProperties(), physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalWindow.java b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalWindow.java
index 8ac180fce8..82f548cdf0 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalWindow.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/nereids/trees/plans/physical/PhysicalWindow.java
@@ -127,8 +127,8 @@ public class PhysicalWindow<CHILD_TYPE extends Plan> extends PhysicalUnary<CHILD
     @Override
     public Plan withChildren(List<Plan> children) {
         Preconditions.checkState(children.size() == 1);
-        return new PhysicalWindow<>(windowFrameGroup, requireProperties, Optional.empty(),
-                getLogicalProperties(), children.get(0));
+        return new PhysicalWindow<>(windowFrameGroup, requireProperties, groupExpression,
+                getLogicalProperties(), physicalProperties, statistics, children.get(0));
     }
 
     @Override
diff --git a/fe/fe-core/src/test/java/org/apache/doris/nereids/postprocess/PushdownFilterThroughProjectTest.java b/fe/fe-core/src/test/java/org/apache/doris/nereids/postprocess/PushdownFilterThroughProjectTest.java
new file mode 100644
index 0000000000..8c0e701e99
--- /dev/null
+++ b/fe/fe-core/src/test/java/org/apache/doris/nereids/postprocess/PushdownFilterThroughProjectTest.java
@@ -0,0 +1,112 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+package org.apache.doris.nereids.postprocess;
+
+import org.apache.doris.catalog.KeysType;
+import org.apache.doris.catalog.OlapTable;
+import org.apache.doris.nereids.CascadesContext;
+import org.apache.doris.nereids.processor.post.PushdownFilterThroughProject;
+import org.apache.doris.nereids.properties.LogicalProperties;
+import org.apache.doris.nereids.trees.expressions.Alias;
+import org.apache.doris.nereids.trees.expressions.EqualTo;
+import org.apache.doris.nereids.trees.expressions.Expression;
+import org.apache.doris.nereids.trees.expressions.NamedExpression;
+import org.apache.doris.nereids.trees.expressions.Slot;
+import org.apache.doris.nereids.trees.expressions.SlotReference;
+import org.apache.doris.nereids.trees.expressions.literal.Literal;
+import org.apache.doris.nereids.trees.plans.ObjectId;
+import org.apache.doris.nereids.trees.plans.PreAggStatus;
+import org.apache.doris.nereids.trees.plans.physical.PhysicalFilter;
+import org.apache.doris.nereids.trees.plans.physical.PhysicalOlapScan;
+import org.apache.doris.nereids.trees.plans.physical.PhysicalPlan;
+import org.apache.doris.nereids.trees.plans.physical.PhysicalProject;
+import org.apache.doris.nereids.types.IntegerType;
+import org.apache.doris.nereids.util.PlanConstructor;
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.Lists;
+import com.google.common.collect.Sets;
+import mockit.Injectable;
+import org.junit.jupiter.api.Assertions;
+import org.junit.jupiter.api.Test;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+import java.util.Optional;
+import java.util.Set;
+
+public class PushdownFilterThroughProjectTest {
+    /**
+     * filter(y=0)
+     *    |
+     * proj2(x as y, col2 as z, col3)
+     *    |
+     * proj3(col1 as x, col2, col3)
+     *    |
+     * SCAN(col1, col2, col3)
+     *
+     * transform to
+     *
+     * proj2(x as y, col2 as z, col3)
+     *    |
+     * proj3(col1 as x, col2, col3)
+     *    |
+     * filter(col1=0)
+     *    |
+     * SCAN(col1, col2, col3)
+     *
+     */
+    @Test
+    public void testPushFilter(@Injectable LogicalProperties placeHolder,
+            @Injectable CascadesContext ctx) {
+        OlapTable t1 = PlanConstructor.newOlapTable(0, "t1", 0, KeysType.DUP_KEYS);
+        List<String> qualifier = new ArrayList<>();
+        qualifier.add("test");
+        List<Slot> t1Output = new ArrayList<>();
+        SlotReference a = new SlotReference("a", IntegerType.INSTANCE);
+        SlotReference b = new SlotReference("b", IntegerType.INSTANCE);
+        SlotReference c = new SlotReference("c", IntegerType.INSTANCE);
+        t1Output.add(a);
+        t1Output.add(b);
+        t1Output.add(c);
+        LogicalProperties t1Properties = new LogicalProperties(() -> t1Output);
+        PhysicalOlapScan scan = new PhysicalOlapScan(ObjectId.createGenerator().getNextId(), t1,
+                qualifier, 0L, Collections.emptyList(), Collections.emptyList(), null,
+                PreAggStatus.on(), ImmutableList.of(), Optional.empty(), t1Properties);
+        Alias x = new Alias(a, "x");
+        List<NamedExpression> projList3 = Lists.newArrayList(x, b, c);
+        PhysicalProject proj3 = new PhysicalProject(projList3, placeHolder, scan);
+        Alias y = new Alias(x.toSlot(), "y");
+        Alias z = new Alias(b, "z");
+        List<NamedExpression> projList2 = Lists.newArrayList(y, z, c);
+        PhysicalProject proj2 = new PhysicalProject(projList2, placeHolder, proj3);
+        Set<Expression> conjuncts = Sets.newHashSet();
+        conjuncts.add(new EqualTo(y.toSlot(), Literal.of(0)));
+        PhysicalFilter filter = new PhysicalFilter(conjuncts, proj2.getLogicalProperties(), proj2);
+
+        PushdownFilterThroughProject processor = new PushdownFilterThroughProject();
+        PhysicalPlan newPlan = (PhysicalPlan) filter.accept(processor, ctx);
+        Assertions.assertTrue(newPlan instanceof PhysicalProject);
+        Assertions.assertTrue(newPlan.child(0) instanceof PhysicalProject);
+        Assertions.assertTrue(newPlan.child(0).child(0) instanceof PhysicalFilter);
+        List<Expression> newFilterConjuncts =
+                ((PhysicalFilter<?>) newPlan.child(0).child(0)).getExpressions();
+        Assertions.assertEquals(newFilterConjuncts.get(0).child(0), a);
+    }
+}




[doris] 05/09: [bug](jdbc catalog) fix a bug in the getPrimaryKeys function (#21137)

Posted by kx...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

kxiao pushed a commit to branch branch-2.0
in repository https://gitbox.apache.org/repos/asf/doris.git

commit 5e8aa0aaab433d0dc4758f5e5d4c6ee40ab0b917
Author: zy-kkk <zh...@gmail.com>
AuthorDate: Mon Jun 26 17:13:50 2023 +0800

    [bug](jdbc catalog) fix a bug in the getPrimaryKeys function (#21137)
---
 .../org/apache/doris/external/jdbc/JdbcClient.java | 20 ++++------------
 .../doris/external/jdbc/JdbcMySQLClient.java       | 28 +++++++++-------------
 2 files changed, 16 insertions(+), 32 deletions(-)

diff --git a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
index 505686da7e..4abe061d57 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcClient.java
@@ -292,7 +292,6 @@ public abstract class JdbcClient {
             String catalogName = getCatalogName(conn);
             tableName = modifyTableNameIfNecessary(tableName);
             rs = getColumns(databaseMetaData, catalogName, dbName, tableName);
-            List<String> primaryKeys = getPrimaryKeys(dbName, tableName);
             while (rs.next()) {
                 if (isTableModified(tableName, rs.getString("TABLE_NAME"))) {
                     continue;
@@ -301,7 +300,11 @@ public abstract class JdbcClient {
                 field.setColumnName(rs.getString("COLUMN_NAME"));
                 field.setDataType(rs.getInt("DATA_TYPE"));
                 field.setDataTypeName(rs.getString("TYPE_NAME"));
-                field.setKey(primaryKeys.contains(field.getColumnName()));
+                /*
+                   We use this method to retrieve the key columns of the JDBC table, but since we have only tested MySQL,
+                   we keep the default key behavior in the parent class and only override it in the MySQL subclass
+                */
+                field.setKey(true);
                 field.setColumnSize(rs.getInt("COLUMN_SIZE"));
                 field.setDecimalDigits(rs.getInt("DECIMAL_DIGITS"));
                 field.setNumPrecRadix(rs.getInt("NUM_PREC_RADIX"));
@@ -389,19 +392,6 @@ public abstract class JdbcClient {
         return databaseMetaData.getColumns(catalogName, schemaName, tableName, null);
     }
 
-    /**
-     * We used this method to retrieve the key column of the JDBC table, but since we only tested mysql,
-     * we kept the default key behavior in the parent class and only overwrite it in the mysql subclass
-     */
-    protected List<String> getPrimaryKeys(String dbName, String tableName) {
-        List<String> primaryKeys = Lists.newArrayList();
-        List<JdbcFieldSchema> columns = getJdbcColumnsInfo(dbName, tableName);
-        for (JdbcFieldSchema column : columns) {
-            primaryKeys.add(column.getColumnName());
-        }
-        return primaryKeys;
-    }
-
     @Data
     protected static class JdbcFieldSchema {
         protected String columnName;
diff --git a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcMySQLClient.java b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcMySQLClient.java
index 9863bcc1c2..c5ceefb89d 100644
--- a/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcMySQLClient.java
+++ b/fe/fe-core/src/main/java/org/apache/doris/external/jdbc/JdbcMySQLClient.java
@@ -136,7 +136,7 @@ public class JdbcMySQLClient extends JdbcClient {
             String catalogName = getCatalogName(conn);
             tableName = modifyTableNameIfNecessary(tableName);
             rs = getColumns(databaseMetaData, catalogName, dbName, tableName);
-            List<String> primaryKeys = getPrimaryKeys(dbName, tableName);
+            List<String> primaryKeys = getPrimaryKeys(databaseMetaData, catalogName, dbName, tableName);
             boolean needGetDorisColumns = true;
             Map<String, String> mapFieldtoType = null;
             while (rs.next()) {
@@ -201,24 +201,18 @@ public class JdbcMySQLClient extends JdbcClient {
         return dorisTableSchema;
     }
 
-    @Override
-    protected List<String> getPrimaryKeys(String dbName, String tableName) {
-        List<String> primaryKeys = Lists.newArrayList();
-        Connection conn = null;
+    protected List<String> getPrimaryKeys(DatabaseMetaData databaseMetaData, String catalogName,
+                                          String dbName, String tableName) throws SQLException {
         ResultSet rs = null;
-        try {
-            conn = getConnection();
-            DatabaseMetaData databaseMetaData = conn.getMetaData();
-            rs = databaseMetaData.getPrimaryKeys(dbName, null, tableName);
-            while (rs.next()) {
-                String columnName = rs.getString("COLUMN_NAME");
-                primaryKeys.add(columnName);
-            }
-        } catch (SQLException e) {
-            throw new JdbcClientException("Failed to get primary keys for table", e);
-        } finally {
-            close(rs, conn);
+        List<String> primaryKeys = Lists.newArrayList();
+
+        rs = databaseMetaData.getPrimaryKeys(dbName, null, tableName);
+        while (rs.next()) {
+            String columnName = rs.getString("COLUMN_NAME");
+            primaryKeys.add(columnName);
         }
+        rs.close();
+
         return primaryKeys;
     }
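
For reference, the call used above is the standard java.sql.DatabaseMetaData.getPrimaryKeys(catalog, schema, table): for MySQL the database name goes in the catalog position and the schema is left null, as in the patch. A standalone sketch, assuming a MySQL driver on the classpath and placeholder URL and credentials; it uses try-with-resources so the ResultSet is released even if iteration throws, which the manual rs.close() above does not guarantee:

    import java.sql.Connection;
    import java.sql.DatabaseMetaData;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    public class PrimaryKeysDemo {
        static List<String> primaryKeys(Connection conn, String db, String table) throws SQLException {
            DatabaseMetaData meta = conn.getMetaData();
            List<String> keys = new ArrayList<>();
            // Database name as the JDBC "catalog", schema left null (MySQL convention).
            try (ResultSet rs = meta.getPrimaryKeys(db, null, table)) {
                while (rs.next()) {
                    keys.add(rs.getString("COLUMN_NAME")); // primary key column name
                }
            }
            return keys;
        }

        public static void main(String[] args) throws SQLException {
            try (Connection conn = DriverManager.getConnection(
                    "jdbc:mysql://127.0.0.1:3306/test", "user", "pass")) {
                System.out.println(primaryKeys(conn, "test", "orders"));
            }
        }
    }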
 

