Posted to issues@phoenix.apache.org by "Attila Zsolt Piros (Jira)" <ji...@apache.org> on 2022/09/19 08:36:00 UTC
[jira] [Comment Edited] (PHOENIX-6668) Spark3 connector cannot distinguish column name cases
[ https://issues.apache.org/jira/browse/PHOENIX-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606479#comment-17606479 ]
Attila Zsolt Piros edited comment on PHOENIX-6668 at 9/19/22 8:35 AM:
----------------------------------------------------------------------
[~stoty] Thanks for explaining the purpose of the quotes here.
I have run some more tests, and I think "spark.sql.caseSensitive=true" does work. In that case, however, every column name in the given schema must match the actual column name exactly, without quotes, so please try running it this way:
{noformat}
...
val schema = StructType(Seq(
  StructField("ID", LongType, nullable = false),
  StructField("TABLE1_ID", LongType, nullable = true),
  StructField("t2col1", StringType, nullable = true)
))
spark.sqlContext.sql("set spark.sql.caseSensitive=true")
val df = spark.sqlContext.createDataFrame(rdd1, schema)
...
{noformat}
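For anyone who wants to check the Spark side of this in isolation, here is a minimal local sketch with no Phoenix involved. The object name, the method name, and the sample rows behind rdd1 are made up for illustration (the snippet above does not show how rdd1 is built). It only checks whether Spark resolves a column by its exact name and by an upper-cased variant once spark.sql.caseSensitive is enabled.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}
import scala.util.Try

object CaseSensitiveSchemaSketch {

  // Returns (exact-case lookup succeeded, upper-case lookup succeeded).
  def lookups(): (Boolean, Boolean) = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-sensitive-schema-sketch")
      .getOrCreate()
    try {
      // Hypothetical sample data; stands in for the elided rdd1.
      val rdd1 = spark.sparkContext.parallelize(
        Seq(Row(1L, 10L, "a"), Row(2L, 20L, "b")))

      // Column names are given without quotes, in the exact case wanted.
      val schema = StructType(Seq(
        StructField("ID", LongType, nullable = false),
        StructField("TABLE1_ID", LongType, nullable = true),
        StructField("t2col1", StringType, nullable = true)
      ))

      spark.conf.set("spark.sql.caseSensitive", "true")
      val df = spark.createDataFrame(rdd1, schema)

      // With case-sensitive resolution only the exact name should resolve;
      // a failed lookup surfaces as an exception, captured here by Try.
      val exact = Try(df.select("t2col1").count()).isSuccess
      val upper = Try(df.select("T2COL1").count()).isSuccess
      (exact, upper)
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit = {
    val (exact, upper) = lookups()
    println(s"exact-case lookup succeeded: $exact")
    println(s"upper-case lookup succeeded: $upper")
  }
}
```

This says nothing about how the connector maps the names to Phoenix columns; it only shows that the lower-case field name survives in the DataFrame schema when the setting is on.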
> Spark3 connector cannot distinguish column name cases
> -----------------------------------------------------
>
> Key: PHOENIX-6668
> URL: https://issues.apache.org/jira/browse/PHOENIX-6668
> Project: Phoenix
> Issue Type: Bug
> Reporter: Istvan Toth
> Priority: Major
>
> The Spark2 connector handled lowercase and mixed case column names correctly in DataFrame definitions.
> Spark3 only resolves column names case-insensitively, and neither _spark.sql.caseSensitive_ nor backquoting seems to have any effect.
> Again, this likely cannot be fixed from the Phoenix side without changes in Spark, and this ticket is mainly for documenting this regression.