Posted to commits@doris.apache.org by "kaka11chen (via GitHub)" <gi...@apache.org> on 2023/06/11 16:13:00 UTC

[GitHub] [doris] kaka11chen opened a new pull request, #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

kaka11chen opened a new pull request, #20679:
URL: https://github.com/apache/doris/pull/20679

   ## Proposed changes
   
   After insert-only transactional Hive tables were supported in https://github.com/apache/doris/pull/19518 and https://github.com/apache/doris/pull/19419, this PR adds support for transactional Hive full ACID tables.
   
   - Support Hive 3 transactional full ACID tables.
   - For Hive 2 transactional full ACID tables, major compactions need to be run.
   
   ## Further comments
   The regression test currently covers only Hive 2 transactional full ACID tables, because the docker image is Hive 2.x. A Hive 3.x docker image will be added later.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [doris] kaka11chen commented on a diff in pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on code in PR #20679:
URL: https://github.com/apache/doris/pull/20679#discussion_r1226267783


##########
fe/fe-core/src/main/java/org/apache/doris/planner/external/HiveScanNode.java:
##########
@@ -192,12 +195,50 @@ private void getFileSplitByPartitions(HiveMetaStoreCache cache, List<HivePartiti
                 for (HiveMetaStoreCache.HiveFileStatus status : fileCacheValue.getFiles()) {
                     allFiles.addAll(splitFile(status.getPath(), status.getBlockSize(),
                             status.getBlockLocations(), status.getLength(), status.getModificationTime(),
-                            isSplittable, fileCacheValue.getPartitionValues()));
+                            isSplittable, fileCacheValue.getPartitionValues(), fileCacheValue.getAcidInfo()));
                 }
             }
         }
     }
 
+    private List<Split> splitFile(Path path, long blockSize, BlockLocation[] blockLocations, long length,

Review Comment:
   ok



##########
fe/fe-core/src/main/java/org/apache/doris/common/util/BrokerUtil.java:
##########
@@ -114,17 +114,19 @@ public static String printBroker(String brokerName, TNetworkAddress address) {
 
     public static List<String> parseColumnsFromPath(String filePath, List<String> columnsFromPath)
             throws UserException {
-        return parseColumnsFromPath(filePath, columnsFromPath, true);
+        return parseColumnsFromPath(filePath, columnsFromPath, true, false);
     }
 
     public static List<String> parseColumnsFromPath(
             String filePath,
             List<String> columnsFromPath,
-            boolean caseSensitive)
+            boolean caseSensitive,
+            boolean isACID)
             throws UserException {
         if (columnsFromPath == null || columnsFromPath.isEmpty()) {
             return Collections.emptyList();
         }
+        int pathCount = isACID ? 3 : 2;

Review Comment:
   ok





[GitHub] [doris] github-actions[bot] commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586829720

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] github-actions[bot] commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586235437

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] github-actions[bot] commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586834153

   PR approved by at least one committer and no changes requested.




[GitHub] [doris] kaka11chen commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586234352

   run buildall




[GitHub] [doris] kaka11chen commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586825146

   run buildall




[GitHub] [doris] yiguolei merged pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "yiguolei (via GitHub)" <gi...@apache.org>.
yiguolei merged PR #20679:
URL: https://github.com/apache/doris/pull/20679




[GitHub] [doris] morningman commented on a diff in pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "morningman (via GitHub)" <gi...@apache.org>.
morningman commented on code in PR #20679:
URL: https://github.com/apache/doris/pull/20679#discussion_r1226168072


##########
fe/fe-core/src/main/java/org/apache/doris/planner/external/HiveScanNode.java:
##########
@@ -192,12 +195,50 @@ private void getFileSplitByPartitions(HiveMetaStoreCache cache, List<HivePartiti
                 for (HiveMetaStoreCache.HiveFileStatus status : fileCacheValue.getFiles()) {
                     allFiles.addAll(splitFile(status.getPath(), status.getBlockSize(),
                             status.getBlockLocations(), status.getLength(), status.getModificationTime(),
-                            isSplittable, fileCacheValue.getPartitionValues()));
+                            isSplittable, fileCacheValue.getPartitionValues(), fileCacheValue.getAcidInfo()));
                 }
             }
         }
     }
 
+    private List<Split> splitFile(Path path, long blockSize, BlockLocation[] blockLocations, long length,

Review Comment:
   It would be better to extract the common part of this `splitFile` and the method of the same name defined in `FileScanNode`; otherwise the two copies will be hard to maintain.



##########
fe/fe-core/src/main/java/org/apache/doris/common/util/BrokerUtil.java:
##########
@@ -114,17 +114,19 @@ public static String printBroker(String brokerName, TNetworkAddress address) {
 
     public static List<String> parseColumnsFromPath(String filePath, List<String> columnsFromPath)
             throws UserException {
-        return parseColumnsFromPath(filePath, columnsFromPath, true);
+        return parseColumnsFromPath(filePath, columnsFromPath, true, false);
     }
 
     public static List<String> parseColumnsFromPath(
             String filePath,
             List<String> columnsFromPath,
-            boolean caseSensitive)
+            boolean caseSensitive,
+            boolean isACID)
             throws UserException {
         if (columnsFromPath == null || columnsFromPath.isEmpty()) {
             return Collections.emptyList();
         }
+        int pathCount = isACID ? 3 : 2;

Review Comment:
   Please add a comment explaining `3` and `2`, preferably with an example path.
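
   To illustrate what such a comment could convey, here is a minimal, self-contained sketch. It is not the actual `BrokerUtil` code: the class name `PathColumnsSketch`, the simplified three-argument signature, the example paths, and the error handling are all hypothetical. It also assumes that `3` accounts for the extra `delta_*` or `base_*` directory that a full ACID table places between the partition directory and the data file, so the scan for `key=value` directories starts one level higher than for a non-ACID table.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified illustration of the pathCount offset.
// This is NOT the BrokerUtil implementation from the PR.
class PathColumnsSketch {

    // Returns the values of the requested partition columns, read from the
    // "key=value" directory names of filePath, in the order of columnsFromPath.
    static List<String> parseColumnsFromPath(
            String filePath, List<String> columnsFromPath, boolean isACID) {
        if (columnsFromPath == null || columnsFromPath.isEmpty()) {
            return Collections.emptyList();
        }
        // Assumed meaning of the two constants:
        // non-ACID: .../dt=2023-06-11/000000_0
        //           the last partition directory is 2nd from the end.
        // ACID:     .../dt=2023-06-11/delta_0000001_0000001_0000/bucket_00000
        //           the last partition directory is 3rd from the end.
        int pathCount = isACID ? 3 : 2;

        String[] parts = filePath.split("/");
        String[] values = new String[columnsFromPath.size()];
        int found = 0;
        // Walk upwards from the last partition directory.
        for (int i = parts.length - pathCount; i >= 0 && found < values.length; i--) {
            int eq = parts[i].indexOf('=');
            if (eq <= 0) {
                break; // left the run of key=value partition directories
            }
            int idx = columnsFromPath.indexOf(parts[i].substring(0, eq));
            if (idx >= 0 && values[idx] == null) {
                values[idx] = parts[i].substring(eq + 1);
                found++;
            }
        }
        if (found != values.length) {
            throw new IllegalArgumentException(
                    "Cannot find all partition columns in path: " + filePath);
        }
        return Arrays.asList(values);
    }

    public static void main(String[] args) {
        List<String> cols = Arrays.asList("dt", "region");
        // Both calls print [2023-06-11, cn]
        System.out.println(parseColumnsFromPath(
                "hdfs://nn/warehouse/t/dt=2023-06-11/region=cn/000000_0", cols, false));
        System.out.println(parseColumnsFromPath(
                "hdfs://nn/warehouse/t/dt=2023-06-11/region=cn/"
                        + "delta_0000001_0000001_0000/bucket_00000", cols, true));
    }
}
```

   In this sketch, passing the ACID path with `isACID` set to `false` fails, because the scan would start at the `delta_*` directory, which is not a `key=value` pair.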





[GitHub] [doris] kaka11chen commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586394016

   run buildall




[GitHub] [doris] github-actions[bot] commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586396128

   clang-tidy review says "All clean, LGTM! :+1:"




[GitHub] [doris] kaka11chen commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "kaka11chen (via GitHub)" <gi...@apache.org>.
kaka11chen commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586946977

   run buildall




[GitHub] [doris] github-actions[bot] commented on pull request #20679: [Feature][Fix](multi-catalog) Implements transactional hive full acid tables.

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #20679:
URL: https://github.com/apache/doris/pull/20679#issuecomment-1586834222

   PR approved by anyone and no changes requested.

