Posted to issues@spark.apache.org by "Hyukjin Kwon (Jira)" <ji...@apache.org> on 2020/02/04 04:25:00 UTC
[jira] [Resolved] (SPARK-30709) Spark 2.3 to Spark 2.4 Upgrade. Problems reading HIVE partitioned tables.

     [ https://issues.apache.org/jira/browse/SPARK-30709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-30709.
----------------------------------
    Resolution: Invalid

> Spark 2.3 to Spark 2.4 Upgrade. Problems reading HIVE partitioned tables.
> -------------------------------------------------------------------------
>
>                 Key: SPARK-30709
>                 URL: https://issues.apache.org/jira/browse/SPARK-30709
>             Project: Spark
>          Issue Type: Question
>          Components: SQL
>    Affects Versions: 2.4.0
>         Environment: Pre-production
>            Reporter: Carlos Mario
>            Priority: Major
>              Labels: SQL, Spark
>
> Hello
> We recently upgraded our pre-production environment from Spark 2.3 to Spark 2.4.0.
> Over time we have created a large number of tables in the Hive Metastore, partitioned by two fields, one of type STRING and the other of type BIGINT.
> We read these tables with Spark 2.3 without any problem, but after upgrading to Spark 2.4 we get the following log every time we run our software:
> <log>
> log_filterBIGINT.out:
>  Caused by: MetaException(message:Filtering is supported only on partition keys of type string) Caused by: MetaException(message:Filtering is supported only on partition keys of type string) Caused by: MetaException(message:Filtering is supported only on partition keys of type string)
>  
> hadoop-cmf-hive-HIVEMETASTORE-isblcsmsttc0001.scisb.isban.corp.log.out.1:
>  
> 2020-01-10 09:36:05,781 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-138]: MetaException(message:Filtering is supported only on partition keys of type string)
> 2020-01-10 11:19:19,208 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-187]: MetaException(message:Filtering is supported only on partition keys of type string)
> 2020-01-10 11:19:54,780 ERROR org.apache.hadoop.hive.metastore.RetryingHMSHandler: [pool-5-thread-167]: MetaException(message:Filtering is supported only on partition keys of type string)
>  </log>
>  
> We know that the best practice from Spark's point of view is to use the STRING type for partition columns, but we need a solution that we can deploy easily, because a large number of our tables were created with a BIGINT partition column.
>  
> As a first attempt we set the spark.sql.hive.manageFilesourcePartitions parameter to false in spark-submit, but after rerunning the software the error persisted.
>  
> Is there anyone in the community who experienced the same problem? What was the solution for it? 
>  
> Kind Regards and thanks in advance.
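
Editor's sketch (not part of the original report): the configuration attempt described above can be reproduced by passing settings to spark-submit as --conf pairs. The Python helper below only assembles such a command line. The script name my_job.py and the helper spark_submit_command are hypothetical, and the second setting, spark.sql.hive.metastorePartitionPruning, is an assumption about what might be worth testing next, not something the reporter tried or a confirmed fix.

```python
import shlex

# Candidate settings to pass to spark-submit.
CANDIDATE_CONF = {
    # Tried by the reporter; according to the report the error persisted.
    "spark.sql.hive.manageFilesourcePartitions": "false",
    # Assumption: disabling predicate pushdown to the Hive Metastore may avoid
    # the filter call that raises the MetaException. Untested here.
    "spark.sql.hive.metastorePartitionPruning": "false",
}


def spark_submit_command(app, conf):
    """Build a spark-submit argument list with one --conf pair per setting."""
    args = ["spark-submit"]
    for key in sorted(conf):  # sorted for a stable, reproducible command line
        args += ["--conf", f"{key}={conf[key]}"]
    args.append(app)  # the application script goes last
    return args


# my_job.py is a placeholder application name.
command = spark_submit_command("my_job.py", CANDIDATE_CONF)
print(shlex.join(command))
```

A metastore-side option that may also be relevant is hive.metastore.integral.jdo.pushdown in hive-site.xml, which controls pushdown for integral partition columns. Whether enabling it is appropriate for this deployment is not established by the report.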



