Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/04/22 02:18:06 UTC

[GitHub] [incubator-doris] hf200012 opened a new pull request #5689: Data export function, add export to specify certain columns

hf200012 opened a new pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689


   EXPORT TABLE db.tbl 
   TO "hdfs://namenode:8020/tmp/doris_20213" 
   PROPERTIES
   (
       "columns"="city_name,date",
       "column_separator"=",",
       "exec_mem_limit"="2147483648",
       "timeout" = "3600"
   )
   WITH BROKER "broker_name_2"
   (
     "username" = "",
     "password" = ""
   );
   
   This PR adds a `columns` property to the data export function. It specifies which columns of the table to export: list one or more column names separated by commas. Column names are not case sensitive.
   If the property is omitted, all columns of the table are exported by default.
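
The behaviour described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class `ExportColumnsSketch` and the method `parseColumns` are hypothetical names, not part of the Doris code base, and the PR itself uses Guava's `Splitter` rather than `String.split`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not taken from the Doris code base.
class ExportColumnsSketch {
    // Splits a comma-separated "columns" value into trimmed, lower-cased names.
    // A null or blank value yields an empty list, which a caller would
    // interpret as "export every column of the table".
    static List<String> parseColumns(String value) {
        List<String> names = new ArrayList<>();
        if (value == null) {
            return names;
        }
        for (String part : value.split(",")) {
            String name = part.trim().toLowerCase(Locale.ROOT);
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

Lower-casing once at parse time is what makes a later comparison against lower-cased schema names case insensitive.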
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r618078005



##########
File path: fe/fe-core/src/main/java/org/apache/doris/analysis/LoadStmt.java
##########
@@ -114,6 +114,9 @@
     public static final String KEY_IN_PARAM_SEQUENCE_COL = "sequence_col";
     public static final String KEY_IN_PARAM_BACKEND_ID = "backend_id";
 
+    //export
+    public static final String EXPORT_KEY_IN_PARAM_COLUMNS = "columns";

Review comment:
       Use KEY_IN_PARAM_COLUMNS instead.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java
##########
@@ -17,6 +17,7 @@
 
 package org.apache.doris.analysis;
 
+import com.google.common.base.Splitter;

Review comment:
       Please pay attention to the import order.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -235,8 +238,8 @@ public void setJob(ExportStmt stmt) throws UserException {
         }
     }
 
-    private void genExecFragment() throws UserException {
-        registerToDesc();
+    private void genExecFragment(ExportStmt stmt) throws UserException {

Review comment:
       ```suggestion
       private void genExecFragment() throws UserException {
   ```

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -220,7 +223,7 @@ public void setJob(ExportStmt stmt) throws UserException {
             }
             this.tableId = exportTable.getId();
             this.tableName = stmt.getTblName();
-            genExecFragment();
+            genExecFragment(stmt);

Review comment:
       ```suggestion
               genExecFragment();
   ```

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -252,17 +255,25 @@ private void genExecFragment() throws UserException {
         plan();
     }
 
-    private void registerToDesc() {
+    private void registerToDesc(ExportStmt stmt) {
         TableRef ref = new TableRef(tableName, null, partitions == null ? null : new PartitionNames(false, partitions));
         BaseTableRef tableRef = new BaseTableRef(ref, exportTable, tableName);
         exportTupleDesc = desc.createTupleDescriptor();
         exportTupleDesc.setTable(exportTable);
         exportTupleDesc.setRef(tableRef);
+        this.exportColumns = stmt.getColumns();

Review comment:
       Please set it in `public void setJob(ExportStmt stmt) throws UserException` instead.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/analysis/ExportStmt.java
##########
@@ -264,6 +271,13 @@ private void checkProperties(Map<String, String> properties) throws UserExceptio
                 properties, ExportStmt.DEFAULT_COLUMN_SEPARATOR));
         this.lineDelimiter = Separator.convertSeparator(PropertyAnalyzer.analyzeLineDelimiter(
                 properties, ExportStmt.DEFAULT_LINE_DELIMITER));
+        if(properties.containsKey(LoadStmt.EXPORT_KEY_IN_PARAM_COLUMNS)){

Review comment:
       Code format: add a space after `if` and before `{`.

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -252,17 +255,25 @@ private void genExecFragment() throws UserException {
         plan();
     }
 
-    private void registerToDesc() {
+    private void registerToDesc(ExportStmt stmt) {

Review comment:
       ```suggestion
       private void registerToDesc() {
   ```

##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -252,17 +255,25 @@ private void genExecFragment() throws UserException {
         plan();
     }
 
-    private void registerToDesc() {
+    private void registerToDesc(ExportStmt stmt) {
         TableRef ref = new TableRef(tableName, null, partitions == null ? null : new PartitionNames(false, partitions));
         BaseTableRef tableRef = new BaseTableRef(ref, exportTable, tableName);
         exportTupleDesc = desc.createTupleDescriptor();
         exportTupleDesc.setTable(exportTable);
         exportTupleDesc.setRef(tableRef);
+        this.exportColumns = stmt.getColumns();
         for (Column col : exportTable.getBaseSchema()) {
-            SlotDescriptor slot = desc.addSlotDescriptor(exportTupleDesc);
-            slot.setIsMaterialized(true);
-            slot.setColumn(col);
-            slot.setIsNullable(col.isAllowNull());
+            if(!this.exportColumns.isEmpty() && this.exportColumns.contains(col.getName().toLowerCase())) {

Review comment:
       Code format: add a space after `if`.






[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r619934316



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -740,6 +762,7 @@ public void readFields(DataInput in) throws IOException {
                 this.properties.put(propertyKey, propertyValue);
             }
         }
+        this.columns = this.properties.get(LoadStmt.KEY_IN_PARAM_COLUMNS);

Review comment:
       ```suggestion
           this.columns = this.properties.get(LoadStmt.KEY_IN_PARAM_COLUMNS);
           if (!Strings.isNullOrEmpty(this.columns)) {
               // No ExportStmt is in scope inside readFields(), so re-parse the persisted value.
               Splitter split = Splitter.on(',').trimResults().omitEmptyStrings();
               this.exportColumns = split.splitToList(this.columns.toLowerCase());
           }
   ```






[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r619066831



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -43,12 +43,7 @@
 import org.apache.doris.catalog.PrimitiveType;
 import org.apache.doris.catalog.Table;
 import org.apache.doris.catalog.Type;
-import org.apache.doris.common.Config;
-import org.apache.doris.common.DdlException;
-import org.apache.doris.common.FeMetaVersion;
-import org.apache.doris.common.Pair;
-import org.apache.doris.common.Status;
-import org.apache.doris.common.UserException;
+import org.apache.doris.common.*;

Review comment:
       Remove the wildcard import (`.*`) and keep the explicit imports.






[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r618907929



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -680,6 +697,7 @@ public void write(DataOutput out) throws IOException {
         Text.writeString(out, exportPath);
         Text.writeString(out, columnSeparator);
         Text.writeString(out, lineDelimiter);
+        Text.writeString(out, columns);

Review comment:
       There is no need to modify the persistence logic here. You only need to initialize `columns` after reading the properties.
   Also, even if `columns` were to be persisted, it could not be written at this position.
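
The comment does not say why the position is wrong. One plausible reading, and it is only an assumption, is that metadata written before this change has no such field in the middle of the record, so a reader expecting the new layout would consume the wrong bytes. The sketch below illustrates that general hazard with `java.io` streams; the class `FieldOrderSketch`, its methods, and the three-field layout are hypothetical and much simpler than `ExportJob`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration of a field inserted mid-record; not Doris code.
class FieldOrderSketch {
    // Old layout: path, separator, then a trailing field (here a state name).
    static byte[] writeOldLayout(String path, String separator, String state) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(path);
            out.writeUTF(separator);
            out.writeUTF(state);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // New reader that expects an extra "columns" string right after the
    // separator. Fed old-layout bytes, it returns the old trailing field.
    static String readColumnsWithNewLayout(byte[] image) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(image));
            in.readUTF(); // path
            in.readUTF(); // separator
            return in.readUTF(); // meant to be "columns", actually the old state
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Deriving `columns` from the already persisted properties sidesteps the problem because the on-disk layout does not change at all.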

##########
File path: docs/zh-CN/administrator-guide/export-manual.md
##########
@@ -122,6 +123,7 @@ WITH BROKER "hdfs"
 ```
 
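 * `column_separator`: Column separator. Defaults to `\t`. Invisible characters such as '\x07' are supported.
+* columns: The columns to export, separated by ASCII commas. If this parameter is not specified, all columns of the table are exported by default.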

Review comment:
       Please add this to the English documentation as well.






[GitHub] [incubator-doris] EmmyMiao87 merged pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 merged pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689


   






[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r618295795



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -167,6 +167,9 @@
     private OriginStatement origStmt;
     protected Map<String, String> sessionVariables = Maps.newHashMap();
 
+    private List<String> exportColumns ;

Review comment:
       If you store columns as a separate attribute of the export job, apart from properties, you need to consider persistence.
   Either reload the columns attribute during replay, or persist the columns object directly.
   My suggestion is not to modify the persistence logic, and to re-parse columns after the properties have been read back.
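
The suggested approach can be sketched as follows: only the properties map is written, and the derived column list is rebuilt every time the job is read back. This is a minimal illustration, not the Doris implementation. `ReplaySketch`, `rebuildExportColumns`, and `roundTrip` are hypothetical names, and the map encoding (a count followed by key/value strings) is an assumption made for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for an export job; not the Doris ExportJob class.
class ReplaySketch {
    static final String KEY_COLUMNS = "columns"; // same key string as in the PR

    final Map<String, String> properties = new LinkedHashMap<>();
    List<String> exportColumns = new ArrayList<>(); // derived, never written

    // Persists only the properties; the derived list is left out.
    void write(DataOutput out) throws IOException {
        out.writeInt(properties.size());
        for (Map.Entry<String, String> entry : properties.entrySet()) {
            out.writeUTF(entry.getKey());
            out.writeUTF(entry.getValue());
        }
    }

    // Reads the properties back, then re-parses the derived list from them.
    void readFields(DataInput in) throws IOException {
        int count = in.readInt();
        for (int i = 0; i < count; i++) {
            String key = in.readUTF();
            String value = in.readUTF();
            properties.put(key, value);
        }
        rebuildExportColumns();
    }

    void rebuildExportColumns() {
        exportColumns = new ArrayList<>();
        String value = properties.get(KEY_COLUMNS);
        if (value == null) {
            return;
        }
        for (String part : value.split(",")) {
            String name = part.trim().toLowerCase(Locale.ROOT);
            if (!name.isEmpty()) {
                exportColumns.add(name);
            }
        }
    }

    // Serializes a job to bytes and reads it into a fresh instance.
    static ReplaySketch roundTrip(ReplaySketch job) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            job.write(new DataOutputStream(buffer));
            ReplaySketch copy = new ReplaySketch();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because the write format is untouched, records persisted before the change still load, and they simply produce an empty column list.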






[GitHub] [incubator-doris] EmmyMiao87 commented on a change in pull request #5689: Data export function, add export to specify certain columns

Posted by GitBox <gi...@apache.org>.
EmmyMiao87 commented on a change in pull request #5689:
URL: https://github.com/apache/incubator-doris/pull/5689#discussion_r618320920



##########
File path: fe/fe-core/src/main/java/org/apache/doris/load/ExportJob.java
##########
@@ -255,15 +259,15 @@ private void genExecFragment(ExportStmt stmt) throws UserException {
         plan();
     }
 
-    private void registerToDesc(ExportStmt stmt) {
+    private void registerToDesc() {
         TableRef ref = new TableRef(tableName, null, partitions == null ? null : new PartitionNames(false, partitions));
         BaseTableRef tableRef = new BaseTableRef(ref, exportTable, tableName);
         exportTupleDesc = desc.createTupleDescriptor();
         exportTupleDesc.setTable(exportTable);
         exportTupleDesc.setRef(tableRef);
-        this.exportColumns = stmt.getColumns();
         for (Column col : exportTable.getBaseSchema()) {
-            if(!this.exportColumns.isEmpty() && this.exportColumns.contains(col.getName().toLowerCase())) {
+            String colName = col.getName().toLowerCase();
+            if (!this.exportColumns.isEmpty() && this.exportColumns.contains(colName)) {

Review comment:
       ```suggestion
               if (this.exportColumns != null && this.exportColumns.contains(colName)) {
   ```
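
The hunk above shows only the branch that matches a requested column. For the filter to agree with the PR description (no `columns` property means every column is exported), a null or empty list has to select everything. The sketch below shows one way to express that. It is an assumption about the intended semantics, not the merged code, and `ColumnFilterSketch` and `selectColumns` are hypothetical names operating on plain strings instead of `Column` objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical filter for illustration; the real code works on Column objects.
class ColumnFilterSketch {
    // Returns the schema columns to export, preserving schema order.
    // A null or empty request selects every column.
    static List<String> selectColumns(List<String> schema, List<String> exportColumns) {
        List<String> selected = new ArrayList<>();
        for (String col : schema) {
            String colName = col.toLowerCase(Locale.ROOT);
            boolean exportAll = exportColumns == null || exportColumns.isEmpty();
            if (exportAll || exportColumns.contains(colName)) {
                selected.add(col);
            }
        }
        return selected;
    }
}
```

The requested names are assumed to be lower-cased already, as in the parsing step, so only the schema name needs converting before the lookup.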
   
   



