You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@paimon.apache.org by "wg1026688210 (via GitHub)" <gi...@apache.org> on 2023/04/12 14:59:57 UTC

[GitHub] [incubator-paimon] wg1026688210 opened a new pull request, #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

wg1026688210 opened a new pull request, #887:
URL: https://github.com/apache/incubator-paimon/pull/887

   Currently, HiveCatalog specify the `warehouse` and the `location` of database/table  to HMS when creating database/table
   ,causing that HMS access the `location`  by filesystem of hive.
   for resolve it, the pr make some changes 
   
   -  do not specify the `warehouse`  config  and the `location ` of database and table 
   -  and handoff the file access to fileio 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520739


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   if we pass warehouse to `HiveCatalog` at the above conversation , we can specify the warehouse directly .So that  we don't need pass `hiveConf`  to `LocationHelper`  at the both `LocationHelper`  separately . Shall we retain `hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse)`  outside the `LocationHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#issuecomment-1519207465

   > Thanks @wg1026688210 ! It looks better!
   > 
   > Can we also update documentation for hive catalog? In `create-catalog` page.
   
   done


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] JingsongLi commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1172061406


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -150,12 +153,16 @@ public boolean databaseExists(String databaseName) {
     public void createDatabase(String name, boolean ignoreIfExists)
             throws DatabaseAlreadyExistException {
         try {
+            Path databasePath = databasePath(name);

Review Comment:
   Move this after `client.createDatabase`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [PR] [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog [incubator-paimon]

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#issuecomment-1874148670

   @wg1026688210 hi, can you explain what is `causing that HMS access the location by filesystem of hive`, I fell very confued about the role of `location-in-properties`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] JingsongLi commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174490523


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   We can put these two lines to the constructor of `StorageLocationHelper`.
   ```
   String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
   hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
   ```



##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalogOptions.java:
##########
@@ -40,5 +40,12 @@ public final class HiveCatalogOptions {
                     .withDescription(
                             "File directory of the core-site.xml、hdfs-site.xml、yarn-site.xml、mapred-site.xml. Currently, only local file system paths are supported.");
 
+    public static final ConfigOption<Boolean> LOCATION_IN_PROPERTIES =
+            ConfigOptions.key("location-in-properties")
+                    .booleanType()
+                    .defaultValue(false)
+                    .withDescription(
+                            "setting the location in properties of hive table/database, you can try configing 'location-in-properties'= 'true'  if you don't want that accessing the location of hive table/database by the filesystem of hive,when using object store as the warehouse of hive catalog.");

Review Comment:
   We should avoid single line that are too long



##########
paimon-common/src/main/java/org/apache/paimon/utils/LocationKeyExtractor.java:
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.paimon.utils;
+
+import java.util.Properties;
+
+/**
+ * declaring the name of the key in the parameters of the Hive metastore table, which indicates
+ * where the Paimon table is stored.
+ */
+public class LocationKeyExtractor {

Review Comment:
   Can we merge this class into `TBPropertiesLocationHelper`?



##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalogFactory.java:
##########
@@ -69,7 +69,7 @@ public Catalog create(FileIO fileIO, Path warehouse, CatalogContext context) {
         // always using user-set parameters overwrite hive-site.xml parameters
         context.options().toMap().forEach(hiveConf::set);
         hiveConf.set(HiveConf.ConfVars.METASTOREURIS.varname, uri);
-        hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse.toUri().toString());
+        hiveConf.set(CatalogOptions.WAREHOUSE.key(), warehouse.toUri().toString());

Review Comment:
   Can we delete this line?
   We can just pass warehouse to `HiveCatalog`.
   And just pass warehouse to `LoctionHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] JingsongLi commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174722850


##########
docs/content/how-to/creating-catalogs.md:
##########
@@ -127,6 +132,11 @@ After `spark-sql` is started, you can switch to the `default` database of the `p
 USE paimon.default;
 ```
 
+If you are using a object storage , and you don't want that the location of paimon table/database is accessed by the filesystem of hive,

Review Comment:
   Maybe we can just copy these sentence to after `tab`? So both Flink and Spark can contain it.



##########
paimon-hive/paimon-hive-common/src/main/java/org/apache/paimon/hive/LocationKeyExtractor.java:
##########
@@ -0,0 +1,37 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.paimon.hive;
+
+import java.util.Properties;
+
+/**
+ * declaring the name of the key in the parameters of the Hive metastore table, which indicates
+ * where the Paimon table is stored.
+ */
+public class LocationKeyExtractor {
+    // special at the tbproperties with the name paimon_meta_location.
+    public static final String TBPROPERTIES_LOCATION_KEY = "paimon_meta_location";
+    public static final String STORAGE_LOCATION_KEY = "location";

Review Comment:
   Can we remove this `STORAGE_LOCATION_KEY` and `getLocation`?
   They should be introduced in next PR instead of this PR.



##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/TBPropertiesLocationHelper.java:
##########
@@ -0,0 +1,60 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.paimon.hive;
+
+import org.apache.paimon.fs.FileIO;
+import org.apache.paimon.fs.Path;
+
+import org.apache.hadoop.hive.metastore.api.Database;
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import java.io.IOException;
+import java.util.HashMap;
+
+/** Helper for Setting Location in Hive Table Properties. */
+public final class TBPropertiesLocationHelper implements LocationHelper {
+
+    public TBPropertiesLocationHelper() {}
+
+    @Override
+    public void createPathIfRequired(Path path, FileIO fileIO) throws IOException {
+        if (!fileIO.exists(path)) {
+            fileIO.mkdirs(path);
+        }
+    }
+
+    @Override
+    public void dropPathIfRequired(Path path, FileIO fileIO) throws IOException {
+        if (fileIO.exists(path)) {
+            fileIO.delete(path, true);
+        }
+    }
+
+    @Override
+    public void specifyTableLocation(Table table, String location) {
+        table.getParameters().put(LocationKeyExtractor.TBPROPERTIES_LOCATION_KEY, location);
+    }
+
+    @Override
+    public void specifyDatabaseLocation(Path path, Database database) {
+        HashMap<String, String> properties = new HashMap<>();

Review Comment:
   check if `database.getParameters != null`, `properties,putAll(database.getParameters)`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520739


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   if we pass warehouse to `HiveCatalog` at the above conversation , we can specify the warehouse directly .So that  we don't need pass `hiveConf`  to `LocationHelper`,  for  specifying the warehouse at the both `LocationHelper` separately . Shall we retain these two lines  outside the `LocationHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520543


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalogFactory.java:
##########
@@ -69,7 +69,7 @@ public Catalog create(FileIO fileIO, Path warehouse, CatalogContext context) {
         // always using user-set parameters overwrite hive-site.xml parameters
         context.options().toMap().forEach(hiveConf::set);
         hiveConf.set(HiveConf.ConfVars.METASTOREURIS.varname, uri);
-        hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse.toUri().toString());
+        hiveConf.set(CatalogOptions.WAREHOUSE.key(), warehouse.toUri().toString());

Review Comment:
   It's better for passing warehouse to HiveCatalog for a better readable indeed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520739


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   if we pass warehouse to `HiveCatalog` at the above conversation , we can specify the warehouse directly .So that  we don't need pass `hiveConf`  to `LocationHelper`  at the both `LocationHelper`  separately . Shall we retain `hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);`  outside the `LocationHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520543


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalogFactory.java:
##########
@@ -69,7 +69,7 @@ public Catalog create(FileIO fileIO, Path warehouse, CatalogContext context) {
         // always using user-set parameters overwrite hive-site.xml parameters
         context.options().toMap().forEach(hiveConf::set);
         hiveConf.set(HiveConf.ConfVars.METASTOREURIS.varname, uri);
-        hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse.toUri().toString());
+        hiveConf.set(CatalogOptions.WAREHOUSE.key(), warehouse.toUri().toString());

Review Comment:
   It's indeed better for passing warehouse to `HiveCatalog` directly  for a better readability. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#issuecomment-1505464789

   a draft for discussions


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174527902


##########
paimon-common/src/main/java/org/apache/paimon/utils/LocationKeyExtractor.java:
##########
@@ -0,0 +1,38 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.paimon.utils;
+
+import java.util.Properties;
+
+/**
+ * declaring the name of the key in the parameters of the Hive metastore table, which indicates
+ * where the Paimon table is stored.
+ */
+public class LocationKeyExtractor {

Review Comment:
   Another pr #886  will use this  class  for  reading  hive extern table. Shall we  migrate it to `paimon-hive-common` for a smaller scope.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520739


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   if we pass warehouse to `HiveCatalog` at the above conversation , we can specify the warehouse directly .So that  we don't need pass `hiveConf` for  specifying the warehouse at the both `LocationHelper` separately . Shall we retain these two lines  outside the `LocationHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] wg1026688210 commented on a diff in pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "wg1026688210 (via GitHub)" <gi...@apache.org>.
wg1026688210 commented on code in PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887#discussion_r1174520739


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -97,18 +100,31 @@ public class HiveCatalog extends AbstractCatalog {
     private final String clientClassName;
     private final IMetaStoreClient client;
 
+    private LocationHelper locationHelper;
+
     public HiveCatalog(FileIO fileIO, HiveConf hiveConf, String clientClassName) {
-        super(fileIO);
-        this.hiveConf = hiveConf;
-        this.clientClassName = clientClassName;
-        this.client = createClient(hiveConf, clientClassName);
+        this(fileIO, hiveConf, clientClassName, Collections.emptyMap());
     }
 
     public HiveCatalog(
             FileIO fileIO, HiveConf hiveConf, String clientClassName, Map<String, String> options) {
         super(fileIO, options);
         this.hiveConf = hiveConf;
         this.clientClassName = clientClassName;
+
+        boolean needLocationInProperties =
+                hiveConf.getBoolean(
+                        LOCATION_IN_PROPERTIES.key(), LOCATION_IN_PROPERTIES.defaultValue());
+        if (needLocationInProperties) {
+            locationHelper = new TBPropertiesLocationHelper(hiveConf);
+        } else {
+            // set the warehouse location to the hiveConf
+            String warehouse = hiveConf.get(CatalogOptions.WAREHOUSE.key());
+            hiveConf.set(HiveConf.ConfVars.METASTOREWAREHOUSE.varname, warehouse);
+
+            locationHelper = new StorageLocationHelper(hiveConf);

Review Comment:
   if we pass warehouse to `HiveCatalog` at the above conversation , we can specify the warehouse directly .So that  we don't need pass `hiveConf`  to `LocationHelper`  at the both `LocationHelper`  separately . Shall we retain these two lines  outside the `LocationHelper`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] JingsongLi merged pull request #887: [hive] Avoid using hive own FileSystem to access the location of paimon table when using HiveCatalog

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi merged PR #887:
URL: https://github.com/apache/incubator-paimon/pull/887


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org