Posted to commits@hudi.apache.org by "Leoyzen (via GitHub)" <gi...@apache.org> on 2023/02/14 03:03:40 UTC

[GitHub] [hudi] Leoyzen opened a new issue, #7936: [SUPPORT]Flink HiveCatalog should respect 'managed_table' options to avoid deleting data unexpectable.

Leoyzen opened a new issue, #7936:
URL: https://github.com/apache/hudi/issues/7936

   
   **Describe the problem you faced**
   
   Currently, running a DROP TABLE statement on a table managed by the Hive catalog unexpectedly deletes the table's data, which is unacceptable.
   
   There is an option named "hoodie.datasource.hive_sync.create_managed_table", but the Hive catalog does not honor it.
   The catalog currently always deletes the data; whether the data is deleted should instead be governed by the "managed_table" option.
   
   ```JAVA HoodieHiveCatalog.java
     @Override
     public void dropTable(ObjectPath tablePath, boolean ignoreIfNotExists)
         throws TableNotExistException, CatalogException {
       checkNotNull(tablePath, "Table path cannot be null");
   
       try {
         client.dropTable(
             tablePath.getDatabaseName(),
             tablePath.getObjectName(),
             // Indicate whether associated data should be deleted.
             // Set to 'true' for now because Flink tables shouldn't have data in Hive. Can
             // be changed later if necessary
             true,
             ignoreIfNotExists);
       } catch (NoSuchObjectException e) {
         if (!ignoreIfNotExists) {
           throw new TableNotExistException(getName(), tablePath);
         }
       } catch (TException e) {
         throw new HoodieCatalogException(
             String.format("Failed to drop table %s", tablePath.getFullName()), e);
       }
     }
   ```
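
A minimal sketch of the decision the catalog could make instead of hard-coding `true`. This is not the change in any actual patch; `DropTablePolicy` and `shouldDeleteData` are hypothetical names introduced only for illustration. It assumes the Hive convention that an external table is marked either by the table type `EXTERNAL_TABLE` or by the table parameter `EXTERNAL=TRUE`.

```java
import java.util.Map;

/** Hypothetical helper (not part of Hudi): decides whether DROP TABLE should also delete data. */
final class DropTablePolicy {
  private DropTablePolicy() {}

  /**
   * Returns true only for managed tables.
   *
   * @param tableType  table type string as stored in the metastore, e.g. "MANAGED_TABLE"
   * @param parameters table parameters as stored in the metastore; may be null
   */
  static boolean shouldDeleteData(String tableType, Map<String, String> parameters) {
    // Assumption: a table is external if its type says so ...
    boolean externalByType = "EXTERNAL_TABLE".equalsIgnoreCase(tableType);
    // ... or if it carries the EXTERNAL=TRUE table parameter.
    boolean externalByParam =
        parameters != null && "TRUE".equalsIgnoreCase(parameters.get("EXTERNAL"));
    // Data is deleted only when neither marker is present.
    return !(externalByType || externalByParam);
  }
}
```

Under these assumptions, `dropTable` could first look up the table in the metastore and pass `DropTablePolicy.shouldDeleteData(table.getTableType(), table.getParameters())` in place of the literal `true`, so that dropping an external table leaves its files untouched.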
   
   
   **To Reproduce**
   
   Steps to reproduce the behavior:
   
   1. Create a Hive catalog in Flink.
   2. Drop a table in the Hive catalog with "DROP TABLE hive_catalog.testing_table;"
   
   **Expected behavior**
   
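   The Hive catalog should honor the "managed_table" option: DROP TABLE should delete the underlying data only when the table is managed, and should leave the data in place otherwise.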
   
   **Environment Description**
   
   * Hudi version : 0.13.0
   
   * Spark version : N/A
   
   * Hive version : 3.1.3
   
   * Hadoop version : 3.1.2
   
   * Storage (HDFS/S3/GCS..) : OSS
   
   * Running on Docker? (yes/no) : no
   
   
   **Additional context**
   
   
   **Stacktrace**
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [hudi] danny0405 commented on issue #7936: [SUPPORT]Flink HiveCatalog should respect 'managed_table' options to avoid deleting data unexpectable.

Posted by "danny0405 (via GitHub)" <gi...@apache.org>.
danny0405 commented on issue #7936:
URL: https://github.com/apache/hudi/issues/7936#issuecomment-1429487379

   I see a fix patch: https://github.com/apache/hudi/pull/7940

