You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@paimon.apache.org by "SteNicholas (via GitHub)" <gi...@apache.org> on 2023/03/30 08:39:51 UTC

[GitHub] [incubator-paimon] SteNicholas opened a new pull request, #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

SteNicholas opened a new pull request, #772:
URL: https://github.com/apache/incubator-paimon/pull/772

   ### Purpose
   
   `HiveCatalog#dropTable` interface should delete table directory to avoid schema in filesystem exists, especially for external table which doesn't delete directory in filesystem so that recreating the same table cause exception that schema in filesystem exists.
   
   Fix #694.
   
   ### Tests
   
   `HiveCatalog#dropTable` forces to delete table directory to avoid schema in filesystem exists.
   
   ### API and Format 
   
   *(Does this change affect API or storage format)*
   
   ### Documentation
   
   *(Does this change introduce a new feature)*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists
URL: https://github.com/apache/incubator-paimon/pull/772


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists
URL: https://github.com/apache/incubator-paimon/pull/772


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] JingsongLi merged pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "JingsongLi (via GitHub)" <gi...@apache.org>.
JingsongLi merged PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas commented on a diff in pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#discussion_r1158036232


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -208,6 +208,16 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists)
                     identifier.getDatabaseName(), identifier.getObjectName(), true, false, true);
         } catch (TException e) {
             throw new RuntimeException("Failed to drop table " + identifier.getFullName(), e);
+        } finally {
+            // Deletes table directory to avoid schema in filesystem exists.
+            Path path = getDataTableLocation(identifier);
+            try {
+                if (fileIO.exists(path)) {
+                    fileIO.deleteDirectoryQuietly(path);

Review Comment:
   @FangYongs, `paimonTableExists` depends on whether the table in hive exist, not the table on filesystem. Therefore, when the table in hive isn't delete, `paimonTableExists` also return true and the table in hive could also be dropped.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] FangYongs commented on a diff in pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "FangYongs (via GitHub)" <gi...@apache.org>.
FangYongs commented on code in PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#discussion_r1153940926


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -208,6 +208,16 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists)
                     identifier.getDatabaseName(), identifier.getObjectName(), true, false, true);
         } catch (TException e) {
             throw new RuntimeException("Failed to drop table " + identifier.getFullName(), e);
+        } finally {
+            // Deletes table directory to avoid schema in filesystem exists.
+            Path path = getDataTableLocation(identifier);
+            try {
+                if (fileIO.exists(path)) {
+                    fileIO.deleteDirectoryQuietly(path);

Review Comment:
   If hive client fails to `delete table` and schema in filesystem is deleted, paimon cannot delete the table in hive anymore due to the validation in line 198, will it happen?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] FangYongs commented on a diff in pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "FangYongs (via GitHub)" <gi...@apache.org>.
FangYongs commented on code in PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#discussion_r1159302221


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -208,6 +208,16 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists)
                     identifier.getDatabaseName(), identifier.getObjectName(), true, false, true);
         } catch (TException e) {
             throw new RuntimeException("Failed to drop table " + identifier.getFullName(), e);
+        } finally {
+            // Deletes table directory to avoid schema in filesystem exists.
+            Path path = getDataTableLocation(identifier);
+            try {
+                if (fileIO.exists(path)) {
+                    fileIO.deleteDirectoryQuietly(path);

Review Comment:
   Can we give more information in line 191? Users can still list tables if we delete files here. But when users query data from the table, there will be exception with message "There is no paimond in ...". Should we need to suggest users to drop the table again?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas commented on pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#issuecomment-1498440926

   @FangYongs, could you take a look at above comment?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas closed pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists
URL: https://github.com/apache/incubator-paimon/pull/772


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] FangYongs commented on pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "FangYongs (via GitHub)" <gi...@apache.org>.
FangYongs commented on PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#issuecomment-1499863415

   +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] Zouxxyy commented on pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "Zouxxyy (via GitHub)" <gi...@apache.org>.
Zouxxyy commented on PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#issuecomment-1535694750

   @SteNicholas Why do we delete data when drop external tables?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [incubator-paimon] SteNicholas commented on a diff in pull request #772: [hive] Dropping table deletes table directory to avoid schema in filesystem exists

Posted by "SteNicholas (via GitHub)" <gi...@apache.org>.
SteNicholas commented on code in PR #772:
URL: https://github.com/apache/incubator-paimon/pull/772#discussion_r1159484604


##########
paimon-hive/paimon-hive-catalog/src/main/java/org/apache/paimon/hive/HiveCatalog.java:
##########
@@ -208,6 +208,16 @@ public void dropTable(Identifier identifier, boolean ignoreIfNotExists)
                     identifier.getDatabaseName(), identifier.getObjectName(), true, false, true);
         } catch (TException e) {
             throw new RuntimeException("Failed to drop table " + identifier.getFullName(), e);
+        } finally {
+            // Deletes table directory to avoid schema in filesystem exists.
+            Path path = getDataTableLocation(identifier);
+            try {
+                if (fileIO.exists(path)) {
+                    fileIO.deleteDirectoryQuietly(path);

Review Comment:
   @FangYongs, I think this again. It's better to delete table directory in filesystem after dropping hive table successfully. Because we should keep the table consistency between which in filesystem and which in Hive metastore, otherwise the `listTables` behavior is inconsistent.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@paimon.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org