Posted to issues@spark.apache.org by "Reynold Xin (JIRA)" <ji...@apache.org> on 2016/11/05 07:58:58 UTC
[jira] [Resolved] (SPARK-17983) Can't filter over mixed case parquet columns of converted Hive tables
[ https://issues.apache.org/jira/browse/SPARK-17983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Reynold Xin resolved SPARK-17983.
---------------------------------
Resolution: Fixed
Assignee: Wenchen Fan
Fix Version/s: 2.1.0
> Can't filter over mixed case parquet columns of converted Hive tables
> ---------------------------------------------------------------------
>
> Key: SPARK-17983
> URL: https://issues.apache.org/jira/browse/SPARK-17983
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: Eric Liang
> Assignee: Wenchen Fan
> Priority: Critical
> Fix For: 2.1.0
>
>
> We should probably revive https://github.com/apache/spark/pull/14750 in order to fix this issue and related classes of issues.
> The only other alternatives are (1) reconciling the on-disk schemas with the metastore schema at planning time, which seems pretty messy, and (2) fixing all the data sources to support case-insensitive matching, which also has issues.
> Reproduction:
> {code}
> private def setupPartitionedTable(tableName: String, dir: File): Unit = {
>   // Write Parquet files whose data column has a mixed-case name (normalCol).
>   spark.range(5).selectExpr("id as normalCol", "id as partCol1", "id as partCol2").write
>     .partitionBy("partCol1", "partCol2")
>     .mode("overwrite")
>     .parquet(dir.getAbsolutePath)
>   // Register a Hive external table over the same directory and discover its partitions.
>   spark.sql(s"""
>     |create external table $tableName (normalCol long)
>     |partitioned by (partCol1 int, partCol2 int)
>     |stored as parquet
>     |location "${dir.getAbsolutePath}"""".stripMargin)
>   spark.sql(s"msck repair table $tableName")
> }
>
> test("filter by mixed case col") {
>   withTable("test") {
>     withTempDir { dir =>
>       setupPartitionedTable("test", dir)
>       val df = spark.sql("select * from test where normalCol = 3")
>       // spark.range(5) yields ids 0 to 4, so exactly one row should match.
>       assert(df.count() == 1)
>     }
>   }
> }
> {code}
> cc [~cloud_fan]
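Alternative (1) in the description, reconciling on-disk schemas with the metastore schema, might look roughly like the sketch below. It is only an illustration under stated assumptions: Field and SchemaReconciliation.reconcile are hypothetical names (in Spark the field type would be StructField), and the code is not taken from the referenced pull request or from the eventual fix. It assumes the metastore reports lowercased column names while the Parquet files keep the original mixed case.

```scala
import java.util.Locale

// Hypothetical stand-in for a schema field; in Spark this role is played by StructField.
final case class Field(name: String, dataType: String)

object SchemaReconciliation {
  private def lower(s: String): String = s.toLowerCase(Locale.ROOT)

  // Returns the metastore fields, renamed to the casing found on disk where a
  // unique case-insensitive match exists.
  def reconcile(metastore: Seq[Field], onDisk: Seq[Field]): Seq[Field] = {
    // Group the on-disk names by their lowercased form so that collisions
    // such as "a" and "A" can be detected.
    val onDiskNames: Map[String, Seq[String]] =
      onDisk.map(_.name).groupBy(n => lower(n))

    metastore.map { field =>
      onDiskNames.get(lower(field.name)) match {
        // Exactly one on-disk column matches: adopt its casing.
        case Some(Seq(name)) => field.copy(name = name)
        // Several on-disk columns differ only by case: refuse to guess.
        case Some(names) =>
          throw new IllegalArgumentException(
            s"Ambiguous on-disk columns for '${field.name}': ${names.mkString(", ")}")
        // No match, e.g. a partition column that is not stored in the data files.
        case None => field
      }
    }
  }
}
```

With the table from the reproduction, a metastore schema of (normalcol, partcol1) and an on-disk schema of (normalCol) would reconcile to (normalCol, partcol1), so a filter on the data column would use the name the Parquet files actually contain.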
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)