Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/03 16:10:05 UTC

[GitHub] [incubator-doris] deardeng opened a new pull request, #9358: fix #9351 can't load parquet file with column name case sensitive wit…

deardeng opened a new pull request, #9358:
URL: https://github.com/apache/incubator-doris/pull/9358

   …h Doris column
   
   # Proposed changes
   
   Issue Number: close #9351 
   
   ## Problem Summary:
   
   Describe the overview of changes.
   1. Change slotDescByName and exprMap to case-insensitive maps.
   2. Change how realColName is obtained.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: No
   2. Have unit tests been added: Not needed
   3. Has documentation been added or modified: Not needed
   4. Does it need to update dependencies: No
   5. Are there any changes that cannot be rolled back: No
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867349007


##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   What I mean is that it would be better to add this comment in the source file, so that other developers can understand it.





[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867756033


##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   done





[GitHub] [incubator-doris] xy720 commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867347721


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){

Review Comment:
   How about:
   ```
   if (tbl.getColumn(columnName) == null || importColumnDesc.getExpr() == null) {
       realColName = columnName;
   } else {
       realColName = tbl.getColumn(columnName).getName();
   }
   ```
   





[GitHub] [incubator-doris] morningman commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867316239


##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   It would be better to add a comment explaining why we need CASE_INSENSITIVE_ORDER.



##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){
+                realColName = columnName;
+            } else {
+                    realColName = tbl.getColumn(columnName).getName();

Review Comment:
   `tbl.getColumn(columnName)` may be null here.



##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();

Review Comment:
   It would be better to add a comment explaining this logic.
   Giving an example would be even better.





[GitHub] [incubator-doris] morningman merged pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
morningman merged PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358




[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867606706


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){

Review Comment:
   Yes, it works, and I have changed it.





[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867606594


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){
+                realColName = columnName;
+            } else {
+                    realColName = tbl.getColumn(columnName).getName();

Review Comment:
   done





[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#issuecomment-1123464160

   PR approved by at least one committer and no changes requested.




[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867345641


##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   For example, suppose the table has a column named 「A」 and the load SQL contains the mapping '(a) set (A = a)'. slotDescByName then stores 「a」. If slotDescByName is case sensitive, a later lookup that uses 「a」 to get the info for the table's 「A」 column will throw an exception.
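
The difference described above can be reproduced with plain JDK collections. This is only a minimal sketch, not the Doris code: the class and method names are hypothetical, the map values are placeholder strings rather than SlotDescriptor objects, and `new TreeMap<>(String.CASE_INSENSITIVE_ORDER)` is used here in place of the Guava call `Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER)` that appears in the diff.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical demo class; it only illustrates the lookup behaviour.
final class CaseInsensitiveLookupDemo {

    // Stores one entry in a HashMap and looks it up under another name.
    // HashMap compares keys with equals(), so "a" and "A" are different keys.
    static boolean foundInHashMap(String storedName, String lookupName) {
        Map<String, String> byName = new HashMap<>();
        byName.put(storedName, "slot");
        return byName.containsKey(lookupName);
    }

    // Same lookup against a TreeMap ordered by String.CASE_INSENSITIVE_ORDER,
    // which treats keys differing only in case as the same key.
    static boolean foundInCaseInsensitiveMap(String storedName, String lookupName) {
        Map<String, String> byName = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        byName.put(storedName, "slot");
        return byName.containsKey(lookupName);
    }

    public static void main(String[] args) {
        System.out.println(foundInHashMap("a", "A"));            // false
        System.out.println(foundInCaseInsensitiveMap("a", "A")); // true
    }
}
```

One trade-off worth keeping in mind: a TreeMap lookup is O(log n) rather than the expected O(1) of a HashMap, which should not matter for maps keyed by a table's column names.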
   





[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867343045


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){

Review Comment:
   Consider the following case:
   
   CREATE TABLE `record` (
    `id` varchar(50) NOT NULL ,
    `SS` varchar(3) NULL 
   ) ENGINE=OLAP
   UNIQUE KEY(`id`)
   DISTRIBUTED BY HASH(`id`) BUCKETS 10;
   
   LOAD LABEL test.record
   ( 
    DATA INFILE ("hdfs://172.0.0.9/tmp/part0.parq")
    INTO TABLE record format as parquet
    (id, ss) 
    SET
    ( 
   	id = id, 
   	SS = ss
    )
   ) WITH BROKER 'Broker_Doris' ( "username" = "hadoop" );
   
   copiedColumnExprs has the entries ("id", "ss", "id = id", "SS = ss"),
   exprMap has the entries ("id = id", "SS = ss"),
   tmpSet has the keys ("id", "ss").
   
   1. Without the tmpSet.contains(columnName) check, realColName for the source column ss will be SS, and SS will be sent to the BE. The BE then tries to match SS against the Parquet file's columns and an error occurs, because the Parquet file's metadata contains only the column ss.
   2. With the tmpSet.contains(columnName) check, realColName will be ss, which matches the Parquet file's metadata.
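
The two outcomes above can be checked with a small stand-alone sketch. It is written under stated assumptions and is not the actual Load.java code: the class and method names are hypothetical, the table is modelled as a case-insensitive map from any casing of a column name to its declared casing (standing in for `tbl.getColumn(columnName).getName()`), and the `hasExpr` flag stands in for `importColumnDesc.getExpr() != null`, following the simplified condition proposed in review instead of the tmpSet lookup.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of how realColName is chosen before and after the change.
final class RealColNameSketch {

    // Models the `record` table: any casing of a name maps to the declared casing.
    static Map<String, String> recordTable() {
        Map<String, String> columns = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        columns.put("id", "id");
        columns.put("SS", "SS");
        return columns;
    }

    // Before the change: the table's casing always wins when the column exists.
    static String oldRealColName(Map<String, String> tableColumns, String columnName) {
        String declared = tableColumns.get(columnName);
        return declared == null ? columnName : declared;
    }

    // After the change: a plain source column (no expression) keeps the casing
    // written in the load statement, so it still matches the file's metadata.
    static String newRealColName(Map<String, String> tableColumns,
                                 String columnName, boolean hasExpr) {
        String declared = tableColumns.get(columnName);
        if (declared == null || !hasExpr) {
            return columnName;
        }
        return declared;
    }

    public static void main(String[] args) {
        Map<String, String> tbl = recordTable();
        System.out.println(oldRealColName(tbl, "ss"));        // SS (does not match the file)
        System.out.println(newRealColName(tbl, "ss", false)); // ss (source column from the file)
        System.out.println(newRealColName(tbl, "SS", true));  // SS (target of "SS = ss")
    }
}
```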






[GitHub] [incubator-doris] xy720 commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
xy720 commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867322207


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){

Review Comment:
   This doesn't make sense to me.
   Only two cases make tbl.getColumn(columnName) != null:
   1. The columnName matches exactly.
   2. The columnName matches case-insensitively.
   And you have already changed exprMap to a case-insensitive map, so checking tmpSet.contains(columnName) seems to have no effect at all.



##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);
+    private Map<String, Expr> exprsByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   ```suggestion
       private final Map<String, Expr> exprsByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);
   ```



##########
fe/fe-core/src/main/java/org/apache/doris/planner/StreamLoadScanNode.java:
##########
@@ -66,8 +66,8 @@ public class StreamLoadScanNode extends LoadScanNode {
     private TupleDescriptor srcTupleDesc;
     private TBrokerScanRange brokerScanRange;
 
-    private Map<String, SlotDescriptor> slotDescByName = Maps.newHashMap();
-    private Map<String, Expr> exprsByName = Maps.newHashMap();
+    private Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);

Review Comment:
   ```suggestion
       private final Map<String, SlotDescriptor> slotDescByName = Maps.newTreeMap(String.CASE_INSENSITIVE_ORDER);
   ```





[GitHub] [incubator-doris] deardeng closed pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng closed pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column
URL: https://github.com/apache/incubator-doris/pull/9358




[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867343511


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){
+                realColName = columnName;
+            } else {
+                    realColName = tbl.getColumn(columnName).getName();

Review Comment:
   If tbl.getColumn(columnName) is null, execution takes the branch at line 1061, not this one.





[GitHub] [incubator-doris] deardeng commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
deardeng commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867343672


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();

Review Comment:
   OK, I have added some comments.





[GitHub] [incubator-doris] morningman commented on a diff in pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#discussion_r867349084


##########
fe/fe-core/src/main/java/org/apache/doris/load/Load.java:
##########
@@ -1045,12 +1045,23 @@ private static void initColumns(Table tbl, List<ImportColumnDesc> columnExprs,
             return;
         }
 
+        Set<String> tmpSet = Sets.newHashSet();
+        for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
+            if (importColumnDesc.getExpr() == null) {
+                tmpSet.add(importColumnDesc.getColumnName());
+            }
+        }
+
         // init slot desc add expr map, also transform hadoop functions
         for (ImportColumnDesc importColumnDesc : copiedColumnExprs) {
             // make column name case match with real column name
             String columnName = importColumnDesc.getColumnName();
-            String realColName = tbl.getColumn(columnName) == null ? columnName
-                    : tbl.getColumn(columnName).getName();
+            String realColName;
+            if (tbl.getColumn(columnName) == null || tmpSet.contains(columnName) ){
+                realColName = columnName;
+            } else {
+                    realColName = tbl.getColumn(columnName).getName();

Review Comment:
   My mistake, but please fix the code formatting (indentation).






[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9358: fix #9351 can't load parquet file with column name case sensitive with Doris column

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9358:
URL: https://github.com/apache/incubator-doris/pull/9358#issuecomment-1123464213

   PR approved by anyone and no changes requested.

