Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/16 09:46:21 UTC

[GitHub] [incubator-doris] Jibing-Li opened a new pull request, #9593: Support create hive table without column info.

Jibing-Li opened a new pull request, #9593:
URL: https://github.com/apache/incubator-doris/pull/9593

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   Describe the overview of changes.
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] qidaye commented on a diff in pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
qidaye commented on code in PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#discussion_r874314680


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    public Column getColumn(String name) {

Review Comment:
   Unused function.



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    public Column getColumn(String name) {
+        if (isLocalSchema) {
+            return nameToColumn.get(name);
+        }
+        Column col = null;
+        if (fullSchema == null || fullSchema.size() == 0) {
+            getBaseSchema(true);
+        }
+        for (Column column : fullSchema) {
+            if (column.getName().equals(name)) {
+                return column;
+            }
+        }
+        return col;
+    }
+
+    private Type convertToDorisType(String hiveType) {

Review Comment:
   `boolean`, `timestamp` are missing.



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();

Review Comment:
   unused code.





[GitHub] [doris] Jibing-Li closed pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
Jibing-Li closed pull request #9593: [Feature]Support create hive table without column info.
URL: https://github.com/apache/doris/pull/9593




[GitHub] [incubator-doris] Jibing-Li commented on a diff in pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
Jibing-Li commented on code in PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#discussion_r874585063


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    public Column getColumn(String name) {
+        if (isLocalSchema) {
+            return nameToColumn.get(name);
+        }
+        Column col = null;
+        if (fullSchema == null || fullSchema.size() == 0) {
+            getBaseSchema(true);
+        }
+        for (Column column : fullSchema) {
+            if (column.getName().equals(name)) {
+                return column;
+            }
+        }
+        return col;
+    }
+
+    private Type convertToDorisType(String hiveType) {

Review Comment:
   added



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();

Review Comment:
   Removed



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    public Column getColumn(String name) {

Review Comment:
   This function is used in describe table and query statements. It overrides the function in Table.





[GitHub] [incubator-doris] qidaye commented on a diff in pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
qidaye commented on code in PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#discussion_r874609723


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,98 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    @Override
+    public Column getColumn(String name) {
+        if (isLocalSchema) {
+            return nameToColumn.get(name);
+        }
+        Column col = null;
+        if (fullSchema == null || fullSchema.size() == 0) {
+            getBaseSchema(true);
+        }
+        for (Column column : fullSchema) {
+            if (column.getName().equals(name)) {
+                return column;
+            }
+        }
+        return col;
+    }
+
+    private Type convertToDorisType(String hiveType) {
+        String lowerCaseType = hiveType.toLowerCase();
+        if (lowerCaseType.equals("boolean")) {
+            return Type.BOOLEAN;
+        }
+        if (lowerCaseType.equals("tinyint")) {
+            return Type.TINYINT;
+        }
+        if (lowerCaseType.equals("smallint")) {
+            return Type.SMALLINT;
+        }
+        if (lowerCaseType.equals("int")) {
+            return Type.INT;
+        }
+        if (lowerCaseType.equals("bigint")) {
+            return Type.BIGINT;
+        }
+        if (lowerCaseType.startsWith("char")) {
+            ScalarType type = ScalarType.createType(PrimitiveType.CHAR);
+            Matcher match = digitPattern.matcher(lowerCaseType);
+            if (match.find()) {
+                type.setLength(Integer.parseInt(match.group(1)));
+            }
+            return type;
+        }
+        if (lowerCaseType.startsWith("varchar")) {
+            ScalarType type = ScalarType.createType(PrimitiveType.VARCHAR);
+            Matcher match = digitPattern.matcher(lowerCaseType);
+            if (match.find()) {
+                type.setLength(Integer.parseInt(match.group(1)));
+            }
+            return type;
+        }
+        if (lowerCaseType.startsWith("decimal")) {
+            Matcher match = digitPattern.matcher(lowerCaseType);
+            int precision = ScalarType.DEFAULT_PRECISION;
+            int scale = ScalarType.DEFAULT_SCALE;
+            if (match.find()) {
+                precision = Integer.parseInt(match.group(1));
+            }
+            if (match.find()) {
+                scale = Integer.parseInt(match.group(1));
+            }
+            return ScalarType.createDecimalV2Type(precision, scale);
+        }
+        if (lowerCaseType.equals("date")) {
+            return Type.DATE;
+        }
+        if (lowerCaseType.equals("datetime")) {

Review Comment:
   ```suggestion
           if (lowerCaseType.equals("timestamp")) {
   ```
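
The mapping under review can be illustrated outside Doris. The following is a minimal, self-contained sketch, not the PR's code: `HiveTypeSketch` and `toDorisTypeName` are hypothetical names, and plain strings stand in for the Doris `Type` objects returned by `convertToDorisType`. It assumes that Hive's `timestamp` corresponds to a datetime type, as the suggestion above implies, and that a `decimal` with no explicit scale has scale 0.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical stand-in for convertToDorisType: returns a type name as a String
// so the sketch compiles without Doris on the classpath.
final class HiveTypeSketch {
    // Matches each run of digits, e.g. "10" and then "2" in "decimal(10,2)".
    private static final Pattern DIGITS = Pattern.compile("(\\d+)");

    static String toDorisTypeName(String hiveType) {
        String t = hiveType.trim().toLowerCase(Locale.ROOT);
        switch (t) {
            case "boolean":   return "BOOLEAN";
            case "tinyint":   return "TINYINT";
            case "smallint":  return "SMALLINT";
            case "int":       return "INT";
            case "bigint":    return "BIGINT";
            case "date":      return "DATE";
            case "timestamp": return "DATETIME"; // assumption, see lead-in
            default:          break;
        }
        Matcher m = DIGITS.matcher(t);
        if (t.startsWith("varchar") || t.startsWith("char")) {
            String base = t.startsWith("varchar") ? "VARCHAR" : "CHAR";
            // Keep the declared length when one is present.
            return m.find() ? base + "(" + m.group(1) + ")" : base;
        }
        if (t.startsWith("decimal")) {
            if (!m.find()) {
                return "DECIMAL"; // no parameters: defaults are left to the caller
            }
            String precision = m.group(1);
            String scale = m.find() ? m.group(1) : "0"; // assumed default scale
            return "DECIMAL(" + precision + "," + scale + ")";
        }
        // Anything not listed is reported explicitly instead of guessed.
        return "UNSUPPORTED(" + t + ")";
    }

    public static void main(String[] args) {
        for (String hiveType : new String[] {"BOOLEAN", "timestamp", "varchar(32)", "decimal(10,2)"}) {
            System.out.println(hiveType + " -> " + toDorisTypeName(hiveType));
        }
    }
}
```

Reporting unknown types explicitly makes a missing case, like the `boolean` and `timestamp` gaps raised in this review, visible at the call site.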



##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +89,92 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {
+            for (FieldSchema field : HiveMetaStoreClientHelper.getSchema(this)) {
+                field.getType();
+                hiveMetastoreSchema.add(new Column(field.getName(), convertToDorisType(field.getType()),
+                        true, null, true, null, field.getComment()));
+            }
+        } catch (DdlException e) {
+            LOG.warn("Failed to get schema of hive table. DB {}, Table {}. {}",
+                    this.hiveDb, this.hiveTable, e.getMessage());
+            return null;
+        }
+        fullSchema = hiveMetastoreSchema;
+        return fullSchema;
+    }
+
+    public Column getColumn(String name) {

Review Comment:
   OK





[GitHub] [incubator-doris] Jibing-Li commented on pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
Jibing-Li commented on PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#issuecomment-1128638599

   > How many steps are you planning to implement this in? I don't see the sync logic after schema change.
   
   This is the only PR to implement this. We don't need to sync the changed schema; the schema is read from HMS every time, so it is automatically synced.




[GitHub] [incubator-doris] Jibing-Li commented on pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
Jibing-Li commented on PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#issuecomment-1129804041

   > > > How many steps are you planning to implement this in? I don't see the sync logic after schema change.
   > > 
   > > 
   > > This is the only PR to implement this. We don't need to sync the changed schema; the schema is read from HMS every time, so it is automatically synced.
   > 
   > It only syncs once, when the table is created. When the schema of a Hive table that already exists in Doris is changed in Hive, the schema in Doris also needs to be updated. How is this case handled? By dropping and recreating the table in Doris?
   
   In the code, we don't keep the schema in the Catalog; it is fetched from HMS every time getBaseSchema is called, so it is always consistent with the Hive schema. getBaseSchema is called when a user executes a describe or query statement. This was tested and the behavior looks correct.
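
The fetch-on-every-call behavior described here can be sketched in isolation. This is an illustration under stated assumptions, not the PR's implementation: `RemoteSchemaSketch` is a hypothetical class, the `Supplier` stands in for the remote HMS lookup, and column names are plain strings rather than Doris `Column` objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: the Supplier stands in for a remote Hive Metastore lookup.
final class RemoteSchemaSketch {
    private final Supplier<List<String>> metastore;
    private int remoteCalls = 0;

    RemoteSchemaSketch(Supplier<List<String>> metastore) {
        this.metastore = metastore;
    }

    // No schema is stored locally: every call goes back to the metastore,
    // so the result always reflects the current remote schema.
    List<String> getBaseSchema() {
        remoteCalls++;
        return Collections.unmodifiableList(new ArrayList<>(metastore.get()));
    }

    int remoteCalls() {
        return remoteCalls;
    }

    public static void main(String[] args) {
        List<String> hiveColumns = new ArrayList<>(Arrays.asList("id", "name"));
        RemoteSchemaSketch table = new RemoteSchemaSketch(() -> hiveColumns);

        System.out.println(table.getBaseSchema());
        hiveColumns.add("created_at"); // simulate a schema change made in Hive
        System.out.println(table.getBaseSchema());
        System.out.println("remote calls: " + table.remoteCalls());
    }
}
```

The sketch shows both sides of the trade-off: the second lookup sees the new column without any sync step, and each lookup costs one remote call.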




[GitHub] [incubator-doris] qidaye commented on pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
qidaye commented on PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#issuecomment-1129858826

   > In the code, we don't keep the schema in the Catalog; it is fetched from HMS every time getBaseSchema is called, so it is always consistent with the Hive schema. getBaseSchema is called when a user executes a describe or query statement. This was tested and the behavior looks correct.
   
   Is going to HMS to fetch the schema for every query the right approach?
   Or could we retry and fetch the new schema when a query fails, or expose a refresh interface so users can update it manually, as with the Iceberg external table?
   
   @morningman @yangzhg can you take a look at this?
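
One way to picture the refresh alternative raised above is a cached schema with an expiry time and a manual invalidation hook. This is a hedged sketch only: `CachedSchemaSketch`, its `refresh()` method, and the TTL parameter are hypothetical and do not correspond to any existing Doris interface; the clock is injected so the expiry logic can be exercised without waiting.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical sketch: caches the schema and reloads it on expiry or on request.
final class CachedSchemaSketch {
    private final Supplier<List<String>> loader; // stands in for the HMS lookup
    private final LongSupplier clockMillis;      // injected clock, for testability
    private final long ttlMillis;
    private List<String> cached;                 // null means "must reload"
    private long loadedAtMillis;

    CachedSchemaSketch(Supplier<List<String>> loader, LongSupplier clockMillis, long ttlMillis) {
        this.loader = loader;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached schema, reloading it when missing or older than the TTL.
    synchronized List<String> get() {
        long now = clockMillis.getAsLong();
        if (cached == null || now - loadedAtMillis >= ttlMillis) {
            cached = Collections.unmodifiableList(new ArrayList<>(loader.get()));
            loadedAtMillis = now;
        }
        return cached;
    }

    // Manual refresh hook: the next get() reloads from the source.
    synchronized void refresh() {
        cached = null;
    }

    public static void main(String[] args) {
        CachedSchemaSketch schema = new CachedSchemaSketch(
                () -> Arrays.asList("id", "name"), System::currentTimeMillis, 60_000L);
        System.out.println(schema.get()); // loads from the source
        System.out.println(schema.get()); // served from the cache
        schema.refresh();
        System.out.println(schema.get()); // reloads after the manual refresh
    }
}
```

Compared with fetching on every query, this bounds the number of remote calls at the cost of a window in which the cached schema can be stale.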
   
   




[GitHub] [incubator-doris] qidaye commented on pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
qidaye commented on PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#issuecomment-1128663615

   > > How many steps are you planning to implement this in? I don't see the sync logic after schema change.
   > 
   > This is the only PR to implement this. We don't need to sync the changed schema; the schema is read from HMS every time, so it is automatically synced.
   
   It only syncs once, when the table is created. When the schema of a Hive table that already exists in Doris is changed in Hive, the schema in Doris also needs to be updated.
   How is this case handled? By dropping and recreating the table in Doris?




[GitHub] [incubator-doris] morningman commented on a diff in pull request #9593: [Feature]Support create hive table without column info.

Posted by GitBox <gi...@apache.org>.
morningman commented on code in PR #9593:
URL: https://github.com/apache/incubator-doris/pull/9593#discussion_r876014072


##########
fe/fe-core/src/main/java/org/apache/doris/catalog/HiveTable.java:
##########
@@ -74,6 +90,98 @@ public Map<String, String> getHiveProperties() {
         return hiveProperties;
     }
 
+    @Override
+    public List<Column> getBaseSchema(boolean full) {
+        if (isLocalSchema) {
+            return super.getBaseSchema(full);
+        }
+        List<Column> hiveMetastoreSchema = Lists.newArrayList();
+        try {

Review Comment:
   You need to consider the concurrency issue.
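
One possible reading of this concern is that the shared schema field is written by one query thread while another iterates it, and that a failed fetch returns null. The sketch below shows a common pattern for that situation and is not taken from the PR: `SchemaHolderSketch` is hypothetical, the `Supplier` stands in for the HMS lookup, and column names are plain strings. The list is built locally, published as an immutable snapshot through a `volatile` field, and never exposed as null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: publishes immutable schema snapshots through a volatile field.
final class SchemaHolderSketch {
    private final Supplier<List<String>> loader; // stands in for the HMS lookup
    // Readers see either the previous snapshot or the new one, never a half-built list.
    private volatile List<String> fullSchema = Collections.emptyList();

    SchemaHolderSketch(Supplier<List<String>> loader) {
        this.loader = loader;
    }

    List<String> getBaseSchema() {
        List<String> fresh;
        try {
            // Build the new list locally before making it visible to other threads.
            fresh = Collections.unmodifiableList(new ArrayList<>(loader.get()));
        } catch (RuntimeException e) {
            // On failure, return the last published snapshot (possibly empty), never null.
            return fullSchema;
        }
        fullSchema = fresh;
        return fresh;
    }

    String getColumn(String name) {
        // Read the field once so the loop works on a single consistent snapshot.
        List<String> snapshot = fullSchema;
        if (snapshot.isEmpty()) {
            snapshot = getBaseSchema();
        }
        for (String column : snapshot) {
            if (column.equals(name)) {
                return column;
            }
        }
        return null; // column not found
    }

    public static void main(String[] args) {
        SchemaHolderSketch holder = new SchemaHolderSketch(() -> Arrays.asList("id", "name"));
        System.out.println(holder.getColumn("name"));
        System.out.println(holder.getColumn("missing"));
    }
}
```

Whether a stale snapshot is acceptable after a failed fetch is a design decision; the alternative is to propagate the error to the caller.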


