Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2021/06/22 17:50:18 UTC

[GitHub] [iceberg] hankfanchiu opened a new pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

hankfanchiu opened a new pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722


   The `HiveCatalog#listTables` method could be optimized by filtering for Hive tables of the `"ICEBERG"` type directly in the Hive MetaStore instead of on the client side.
   
   Concretely:
   
   1. Replace the use of `HiveMetaStoreClient#getAllTables` with [`HiveMetaStoreClient#listTableNamesByFilter`](https://hive.apache.org/javadocs/r2.2.0/api/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.html#listTableNamesByFilter-java.lang.String-java.lang.String-short-), passing the filter statement `hive_filter_field_params__table_type = "ICEBERG"`;
   
   2. Remove the now redundant client-side filtering of Hive `Table`s.
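
The two steps above can be sketched as follows. This is a minimal, self-contained illustration rather than the actual patch: the three constants are assumed values standing in for `hive_metastoreConstants.HIVE_FILTER_FIELD_PARAMS`, `BaseMetastoreTableOperations.TABLE_TYPE_PROP`, and `BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE`, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Sketch only: contrasts the pushed-down filter with client-side filtering. */
final class ListTablesSketch {
  // Assumed values, chosen to reproduce the filter statement quoted above.
  static final String HIVE_FILTER_FIELD_PARAMS = "hive_filter_field_params__";
  static final String TABLE_TYPE_PROP = "table_type";
  static final String ICEBERG_TABLE_TYPE_VALUE = "iceberg";

  /** Builds the filter string that would be passed to listTableNamesByFilter. */
  static String icebergTableFilter() {
    return String.format("%s%s = \"%s\"",
        HIVE_FILTER_FIELD_PARAMS,
        TABLE_TYPE_PROP,
        ICEBERG_TABLE_TYPE_VALUE.toUpperCase(Locale.ENGLISH));
  }

  /**
   * Hypothetical client-side equivalent: given each table's parameters,
   * keep only the names whose table_type parameter marks an Iceberg table.
   */
  static List<String> filterClientSide(Map<String, Map<String, String>> paramsByTable) {
    List<String> names = new ArrayList<>();
    for (Map.Entry<String, Map<String, String>> entry : paramsByTable.entrySet()) {
      String type = entry.getValue().get(TABLE_TYPE_PROP);
      // equalsIgnoreCase(null) is false, so tables without the parameter are skipped.
      if (ICEBERG_TABLE_TYPE_VALUE.equalsIgnoreCase(type)) {
        names.add(entry.getKey());
      }
    }
    return names;
  }
}
```

With the filter pushed down, the client would hand `icebergTableFilter()` to `listTableNamesByFilter` and use the returned names directly, so a loop like `filterClientSide` would no longer be needed.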


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] pvary commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-866648578


   @hankfanchiu: Thanks for the patch!
   Could you please check the failures in the CI? I think you could reproduce them locally by running
   ```
   ./gradlew build -x test
   ```
   
   Thanks,
   Peter




[GitHub] [iceberg] pvary commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-868520714


   Will be offline for 2 weeks, please be patient with me.




[GitHub] [iceberg] hankfanchiu commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
hankfanchiu commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-867363019


   The `HiveTableTest > testListTables` and `TestFlinkCatalogDatabase > testDefaultDatabase` test cases both failed with the following snippet:
   
   ```
   Caused by:
   MetaException(message:Exception thrown when executing query : SELECT A0.TBL_NAME FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID INNER JOIN TABLE_PARAMS C0 ON A0.TBL_ID = C0.TBL_ID WHERE C0.PARAM_KEY = 'table_type' AND B0."NAME" = ? AND C0.PARAM_VALUE = ?)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_names_by_filter_result$get_table_names_by_filter_resultStandardScheme.read(ThriftHiveMetastore.java:57499)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_names_by_filter_result$get_table_names_by_filter_resultStandardScheme.read(ThriftHiveMetastore.java:57467)
   ...
   ```
   
   (ex: https://github.com/apache/iceberg/pull/2722/checks?check_run_id=2897257275#step:6:309)
   
   I believe that this exception is due to the currently unresolved bug: [HIVE-21614](https://issues.apache.org/jira/browse/HIVE-21614).
   
   Specifically, the comparison against the `TABLE_PARAMS.PARAM_VALUE` column -- a `CLOB` type -- is not supported by the Derby database used in tests.
   
   Like what was done in https://issues.apache.org/jira/browse/HIVE-12274?focusedCommentId=15931690&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15931690, perhaps we could replace the `=` operator with the `LIKE` operator, i.e.:
   
   ```java
   String filter = String.format("%s%s LIKE \"%s\"",
       hive_metastoreConstants.HIVE_FILTER_FIELD_PARAMS,
       BaseMetastoreTableOperations.TABLE_TYPE_PROP,
       BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE.toUpperCase(Locale.ENGLISH));
   ```
   
   Doesn't seem ideal, though.




[GitHub] [iceberg] kbendick commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
kbendick commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-866429994


   Hi @hankfanchiu, thank you for your contribution.
   
   Do you know if this will be supported by older versions of HMS?
   
   That's a vague statement, but if you know when this feature was added (if not always supported) to HMS, that would be good to add here. I'm not sure how far back in terms of supporting HMS versions we go but probably farther than developers would like. 😉 





[GitHub] [iceberg] hankfanchiu commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
hankfanchiu commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-881176764


   > Would you mind fixing this in Hive code
   
   apache/hive#2484




[GitHub] [iceberg] hankfanchiu edited a comment on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
hankfanchiu edited a comment on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-866503004


   > Do you know if this will be supported by older versions of HMS?
   
   Yup, support for filtering by table parameters (via `hive_filter_field_params__`) in the `HiveMetaStoreClient#listTableNamesByFilter` API was introduced in 0.8.0. See apache/hive@8d143c6c5b43e4cc3afa5985bd31903f31252d7f ([HIVE-2226](https://issues.apache.org/jira/browse/HIVE-2226)).
   
   > if you know when this feature was added (if not always supported) to HMS, that would be good to add here. I'm not sure how far back in terms of supporting HMS versions we go but probably farther than developers would like.
   
   Thanks for pointing this out! I've updated the PR description.




[GitHub] [iceberg] pvary commented on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
pvary commented on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-867420525


   @hankfanchiu: Would you mind fixing this in Hive code and backporting it to branch-2 and branch-2.3 there? I think it would be good to fix this, but we cannot do it in Iceberg code alone.
   
   You can count on my review in Hive as well. 😄
   
   Thanks,
   Peter




[GitHub] [iceberg] hankfanchiu edited a comment on pull request #2722: Hive: Push filtering for Iceberg table type to Hive MetaStore when listing tables

Posted by GitBox <gi...@apache.org>.
hankfanchiu edited a comment on pull request #2722:
URL: https://github.com/apache/iceberg/pull/2722#issuecomment-867363019


   The `HiveTableTest > testListTables` and `TestFlinkCatalogDatabase > testDefaultDatabase` test cases both failed with the following snippet:
   
   ```
   Caused by:
   MetaException(message:Exception thrown when executing query : SELECT A0.TBL_NAME FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID INNER JOIN TABLE_PARAMS C0 ON A0.TBL_ID = C0.TBL_ID WHERE C0.PARAM_KEY = 'table_type' AND B0."NAME" = ? AND C0.PARAM_VALUE = ?)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_names_by_filter_result$get_table_names_by_filter_resultStandardScheme.read(ThriftHiveMetastore.java:57499)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_names_by_filter_result$get_table_names_by_filter_resultStandardScheme.read(ThriftHiveMetastore.java:57467)
   ...
   ```
   
   (ex: https://github.com/apache/iceberg/pull/2722/checks?check_run_id=2897257275#step:6:309)
   
   From `build/testlogs/iceberg-hive-metastore.log` ([test logs](https://github.com/apache/iceberg/suites/3068491470/artifacts/69902232)):
   
   ```
   StdErr [pool-36-thread-2] INFO org.apache.hadoop.hive.metastore.HiveMetaStore - 11: source:127.0.0.1 get_table_names_by_filter: db = hivedb, filter = hive_filter_field_params__table_type = "ICEBERG"
   StdErr [pool-36-thread-2] INFO org.apache.hadoop.hive.metastore.HiveMetaStore.audit - ugi=runner	ip=127.0.0.1	cmd=source:127.0.0.1 get_table_names_by_filter: db = hivedb, filter = hive_filter_field_params__table_type = "ICEBERG"	
   StdErr [pool-36-thread-2] ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler - Retrying HMSHandler after 2000 ms (attempt 2 of 10) with error: javax.jdo.JDOException: Exception thrown when executing query : SELECT A0.TBL_NAME FROM TBLS A0 LEFT OUTER JOIN DBS B0 ON A0.DB_ID = B0.DB_ID INNER JOIN TABLE_PARAMS C0 ON A0.TBL_ID = C0.TBL_ID WHERE C0.PARAM_KEY = 'table_type' AND B0."NAME" = ? AND C0.PARAM_VALUE = ?
   StdErr 	at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:677)
   StdErr 	at org.datanucleus.api.jdo.JDOQuery.executeInternal(JDOQuery.java:391)
   StdErr 	at org.datanucleus.api.jdo.JDOQuery.executeWithMap(JDOQuery.java:279)
   StdErr 	at org.apache.hadoop.hive.metastore.ObjectStore.listTableNamesByFilter(ObjectStore.java:3374)
   StdErr 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   StdErr 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   StdErr 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   StdErr 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   StdErr 	at org.apache.hadoop.hive.metastore.RawStoreProxy.invoke(RawStoreProxy.java:101)
   StdErr 	at com.sun.proxy.$Proxy14.listTableNamesByFilter(Unknown Source)
   StdErr 	at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table_names_by_filter(HiveMetaStore.java:2167)
   StdErr 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   StdErr 	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
   StdErr 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   StdErr 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
   StdErr 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
   StdErr 	at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
   StdErr 	at com.sun.proxy.$Proxy20.get_table_names_by_filter(Unknown Source)
   StdErr 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_names_by_filter.getResult(ThriftHiveMetastore.java:11577)
   StdErr 	at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_names_by_filter.getResult(ThriftHiveMetastore.java:11561)
   StdErr 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
   StdErr 	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
   StdErr 	at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48)
   StdErr 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286)
   StdErr 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
   StdErr 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
   StdErr 	at java.base/java.lang.Thread.run(Thread.java:829)
   StdErr NestedThrowablesStackTrace:
   StdErr java.sql.SQLSyntaxErrorException: Comparisons between 'CLOB (UCS_BASIC)' and 'CLOB (UCS_BASIC)' are not supported. Types must be comparable. String types must also have matching collation. If collation does not match, a possible solution is to cast operands to force them to the default collation (e.g. SELECT tablename FROM sys.systables WHERE CAST(tablename AS VARCHAR(128)) = 'T1')
   ```
   
   These failures are due to a currently unresolved bug: [HIVE-21614](https://issues.apache.org/jira/browse/HIVE-21614).
   
   Specifically, the comparison against the `TABLE_PARAMS.PARAM_VALUE` column -- a `CLOB` type -- is not supported by the Derby database used in tests.
   
   Like what was done in https://issues.apache.org/jira/browse/HIVE-12274?focusedCommentId=15931690&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-15931690, perhaps we could replace the `=` operator with the `LIKE` operator, i.e.:
   
   ```diff
   - String filter = String.format("%s%s = \"%s\"",
   + String filter = String.format("%s%s LIKE \"%s\"",
         hive_metastoreConstants.HIVE_FILTER_FIELD_PARAMS,
         BaseMetastoreTableOperations.TABLE_TYPE_PROP,
         BaseMetastoreTableOperations.ICEBERG_TABLE_TYPE_VALUE.toUpperCase(Locale.ENGLISH));
   ```
   
   Doesn't seem ideal, though.
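
One reason the `LIKE` substitution should return the same tables: the value `ICEBERG` contains no wildcard characters, so under standard SQL `LIKE` semantics the pattern only matches that exact string. The sketch below illustrates this with a small pattern matcher. `LikeSketch.likeMatches` is a hypothetical helper, not part of Hive, and how the metastore backend actually evaluates the filter may differ.

```java
import java.util.regex.Pattern;

/** Sketch only: standard SQL LIKE semantics without escape characters. */
final class LikeSketch {
  /** Translates a LIKE pattern to a regex: % becomes .*, _ becomes ., the rest is literal. */
  static boolean likeMatches(String value, String pattern) {
    StringBuilder regex = new StringBuilder();
    for (char c : pattern.toCharArray()) {
      if (c == '%') {
        regex.append(".*");
      } else if (c == '_') {
        regex.append('.');
      } else {
        regex.append(Pattern.quote(String.valueOf(c)));
      }
    }
    // matches() requires the whole value to match, as LIKE does.
    return Pattern.compile(regex.toString(), Pattern.DOTALL).matcher(value).matches();
  }
}
```

Here `likeMatches("ICEBERG", "ICEBERG")` is true while `likeMatches("ICEBERG_V2", "ICEBERG")` is false, so a wildcard-free pattern behaves like an equality check on the parameter value.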

