Posted to commits@doris.apache.org by "wsjz (via GitHub)" <gi...@apache.org> on 2023/06/29 06:35:33 UTC

[GitHub] [doris] wsjz commented on a diff in pull request #21238: [fix](multi-catalog)fix obj file cache and dlf iceberg catalog

wsjz commented on code in PR #21238:
URL: https://github.com/apache/doris/pull/21238#discussion_r1246169572


##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/dlf/DLFCatalog.java:
##########
@@ -38,4 +47,26 @@ protected TableOperations newTableOps(TableIdentifier tableIdentifier) {
         String tableName = tableIdentifier.name();
         return new DLFTableOperations(this.conf, this.clients, this.fileIO, this.uid, dbName, tableName);
     }
+
+    protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
+        // read from converted properties or default by old s3 aws properties
+        String endpoint = properties.getOrDefault(Constants.ENDPOINT_KEY, properties.get(S3Properties.Env.ENDPOINT));
+        CloudCredential credential = new CloudCredential();
+        credential.setAccessKey(properties.getOrDefault(OssProperties.ACCESS_KEY,
+                    properties.get(S3Properties.Env.ACCESS_KEY)));
+        credential.setSecretKey(properties.getOrDefault(OssProperties.SECRET_KEY,
+                    properties.get(S3Properties.Env.SECRET_KEY)));
+        if (properties.containsKey(OssProperties.SESSION_TOKEN)
+                || properties.containsKey(S3Properties.Env.TOKEN)) {
+            credential.setSessionToken(properties.getOrDefault(OssProperties.SESSION_TOKEN,
+                    properties.get(S3Properties.Env.TOKEN)));
+        }
+        String region = properties.getOrDefault(OssProperties.REGION, properties.get(S3Properties.Env.REGION));
+        // s3 file io just supports s3-like endpoint
+        String s3Endpoint = endpoint.replace(region, "s3." + region);
+        URI endpointUri = URI.create(s3Endpoint);
+        FileIO io = new S3FileIO(() -> S3Util.buildS3Client(endpointUri, region, credential));

Review Comment:
   I find that S3FileIO is faster than HadoopFileIO.
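As a small, self-contained illustration of the endpoint rewrite in the diff above: `S3FileIO` only accepts s3-style endpoints, so the code prefixes the region segment with `"s3."`. The concrete OSS endpoint and region values below are assumptions for the sake of the demo, not taken from the PR:

```java
public class EndpointRewriteDemo {
    // Mirrors the rewrite in the diff: prefix the region segment of the
    // endpoint with "s3." so S3FileIO can talk to the OSS-compatible gateway.
    static String toS3Endpoint(String endpoint, String region) {
        return endpoint.replace(region, "s3." + region);
    }

    public static void main(String[] args) {
        // Hypothetical OSS endpoint/region pair (illustrative values only).
        System.out.println(toS3Endpoint("oss-cn-hangzhou.aliyuncs.com", "oss-cn-hangzhou"));
    }
}
```

Note that, as written in the diff, a null `endpoint` or a `region` that does not occur in the endpoint would pass through unchanged (or throw), so callers presumably rely on the converted properties always supplying both.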



##########
fe/fe-core/src/main/java/org/apache/doris/datasource/iceberg/HiveCompatibleCatalog.java:
##########
@@ -57,7 +57,7 @@ public void initialize(String name, FileIO fileIO,
     protected FileIO initializeFileIO(Map<String, String> properties, Configuration hadoopConf) {
         String fileIOImpl = properties.get(CatalogProperties.FILE_IO_IMPL);
         if (fileIOImpl == null) {
-            FileIO io = new S3FileIO();
+            FileIO io = new HadoopFileIO(hadoopConf);

Review Comment:
   S3FileIO needs some custom configuration, so HadoopFileIO is used in the superclass by default. We can add better implementations in derived classes, just like the implementation in the DLF catalog.
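The design the comment describes (a dependency-free default in the base catalog, a specialized override in the derived catalog) can be sketched without the Iceberg dependencies. `FileIO`, `HadoopFileIO`, and `S3FileIO` below are stand-in stubs for illustration, not the real Iceberg classes:

```java
import java.util.Map;

// Stand-in stubs for the Iceberg FileIO hierarchy (hypothetical, for illustration).
interface FileIO {
    String name();
}

class HadoopFileIO implements FileIO {
    public String name() { return "hadoop"; }
}

class S3FileIO implements FileIO {
    public String name() { return "s3"; }
}

// Base catalog: falls back to HadoopFileIO when no FILE_IO_IMPL is configured,
// because HadoopFileIO needs no object-store credentials.
class HiveCompatibleCatalog {
    static final String FILE_IO_IMPL = "io-impl";

    protected FileIO initializeFileIO(Map<String, String> properties) {
        String fileIOImpl = properties.get(FILE_IO_IMPL);
        if (fileIOImpl == null) {
            return new HadoopFileIO();
        }
        return loadCustomFileIO(fileIOImpl);
    }

    protected FileIO loadCustomFileIO(String impl) {
        throw new UnsupportedOperationException("custom FileIO: " + impl);
    }
}

// Derived catalog (DLF-like): overrides the default with the faster S3FileIO.
class DLFCatalog extends HiveCompatibleCatalog {
    @Override
    protected FileIO initializeFileIO(Map<String, String> properties) {
        // Credential and endpoint resolution elided; see the DLFCatalog diff above.
        return new S3FileIO();
    }
}

public class FileIODemo {
    public static void main(String[] args) {
        System.out.println(new HiveCompatibleCatalog().initializeFileIO(Map.of()).name());
        System.out.println(new DLFCatalog().initializeFileIO(Map.of()).name());
    }
}
```

This keeps the safe default in one place while letting each catalog opt in to a faster implementation.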



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: commits-help@doris.apache.org