Posted to issues@phoenix.apache.org by "Attila Zsolt Piros (Jira)" <ji...@apache.org> on 2022/09/19 08:36:00 UTC

[jira] [Comment Edited] (PHOENIX-6668) Spark3 connector cannot distinguish column name cases

    [ https://issues.apache.org/jira/browse/PHOENIX-6668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17606479#comment-17606479 ] 

Attila Zsolt Piros edited comment on PHOENIX-6668 at 9/19/22 8:35 AM:
----------------------------------------------------------------------

[~stoty] Thanks for explaining the purpose of the quotes here. 
I have run some more tests, and I think "spark.sql.caseSensitive=true" does work. In that case, however, every column name in the given schema must match exactly, without quotes. Please try running it this way:

{noformat}
    ...
    val schema = StructType(Seq(
      StructField("ID", LongType, nullable = false),
      StructField("TABLE1_ID", LongType, nullable = true),
      StructField("t2col1", StringType, nullable = true)
    ))
    spark.sqlContext.sql("set spark.sql.caseSensitive=true")
    val df = spark.sqlContext.createDataFrame(rdd1, schema)
    ...
{noformat}
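
For anyone who wants to check the Spark side of this without a Phoenix cluster, here is a minimal, self-contained sketch. It only exercises Spark's own column resolution and does not touch the connector. The local master, the app name, the object name and the sample rows standing in for rdd1 are all assumptions made for illustration; only the schema and the caseSensitive setting come from the snippet above.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}
import scala.util.Try

// Hypothetical object name; not part of the Phoenix connector or its tests.
object CaseSensitiveSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed master and app name).
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("case-sensitive-sketch")
      .getOrCreate()

    // Same schema as above: names spelled exactly like the columns, no quotes.
    val schema = StructType(Seq(
      StructField("ID", LongType, nullable = false),
      StructField("TABLE1_ID", LongType, nullable = true),
      StructField("t2col1", StringType, nullable = true)
    ))

    // Made-up sample rows standing in for rdd1.
    val rdd1 = spark.sparkContext.parallelize(Seq(
      Row(1L, 10L, "a"),
      Row(2L, 20L, "b")
    ))

    // Equivalent to the "set spark.sql.caseSensitive=true" statement above.
    spark.conf.set("spark.sql.caseSensitive", "true")
    val df = spark.createDataFrame(rdd1, schema)

    // The exact-case name resolves.
    df.select("t2col1").show()

    // A differently cased name is not expected to resolve while
    // caseSensitive is true; Try captures the analysis failure, if any.
    val wrongCase = Try(df.select("T2COL1"))
    println(s"select(T2COL1) resolved: ${wrongCase.isSuccess}")

    spark.stop()
  }
}
```

If the second lookup also resolves in your environment, the setting is probably not reaching the session that builds the DataFrame, which would be worth reporting here.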
 



> Spark3 connector cannot distinguish column name cases
> -----------------------------------------------------
>
>                 Key: PHOENIX-6668
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-6668
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: Istvan Toth
>            Priority: Major
>
> The Spark2 connector handled lowercase and mixed-case column names correctly in DataFrame definitions.
> Spark3 only does case-insensitive column resolution, and neither _spark.sql.caseSensitive_ nor backquoting seems to have any effect.
> Again, this is not something that can likely be fixed from the Phoenix side without changes in Spark, and this ticket is mainly for documenting this regression.


