Posted to issues@spark.apache.org by "Max Gekk (Jira)" <ji...@apache.org> on 2023/03/30 13:27:00 UTC

[jira] [Updated] (SPARK-42979) Define literal constructors as keywords

     [ https://issues.apache.org/jira/browse/SPARK-42979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Gekk updated SPARK-42979:
-----------------------------
    Description: 
Currently, Spark SQL defines literal constructors as:

{code}
| identifier stringLit #typeConstructor
{code}
where the identifier is interpreted later by visitTypeConstructor():

{code:scala}
    valueType match {
      case "DATE" =>
        val zoneId = getZoneId(conf.sessionLocalTimeZone)
        val specialDate = convertSpecialDate(value, zoneId).map(Literal(_, DateType))
        specialDate.getOrElse(toLiteral(stringToDate, DateType))
      case "TIMESTAMP_NTZ" =>
...
{code}

So, the literal constructors are not Spark SQL keywords, which causes inconvenience when analyzing or transforming the parse tree, for example when forming stable column aliases.
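For context, these are the kinds of typed literals built through this rule (standard Spark SQL syntax):

{code:sql}
SELECT DATE '2023-03-30';
SELECT TIMESTAMP '2023-03-30 13:27:00';
SELECT TIMESTAMP_NTZ '2023-03-30 13:27:00';
SELECT TIMESTAMP_LTZ '2023-03-30 13:27:00';
SELECT INTERVAL '1 day';
SELECT X'1A2B';
{code}

Because the leading word is parsed as a generic identifier, the parse tree for each of these looks the same as for any unknown constructor, which is what makes keyword-based matching on the tree impossible today.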

The literal constructors need to be defined as keywords in SqlBaseParser.g4.

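A keyword-based rule could look roughly like this (a sketch only; the token and rule names are illustrative, not the actual SqlBaseParser.g4 definitions):

{code}
| literalType stringLit                                   #typeConstructor

literalType
    : DATE
    | TIMESTAMP
    | TIMESTAMP_NTZ
    | TIMESTAMP_LTZ
    | INTERVAL
    ;
{code}

A fallback alternative matching an arbitrary identifier may still be needed so that unsupported constructors keep producing the current error messages rather than a generic parse error.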
  was:
Currently, Spark SQL defines primitive types as:

{code}
| identifier (LEFT_PAREN INTEGER_VALUE
  (COMMA INTEGER_VALUE)* RIGHT_PAREN)?                      #primitiveDataType
{code}
where the identifier is interpreted later by visitPrimitiveDataType():

{code:scala}
  override def visitPrimitiveDataType(ctx: PrimitiveDataTypeContext): DataType = withOrigin(ctx) {
    val dataType = ctx.identifier.getText.toLowerCase(Locale.ROOT)
    (dataType, ctx.INTEGER_VALUE().asScala.toList) match {
      case ("boolean", Nil) => BooleanType
      case ("tinyint" | "byte", Nil) => ByteType
      case ("smallint" | "short", Nil) => ShortType
      case ("int" | "integer", Nil) => IntegerType
      case ("bigint" | "long", Nil) => LongType
      case ("float" | "real", Nil) => FloatType
...
{code}

So, the types are not Spark SQL keywords, which causes inconvenience when analyzing or transforming the parse tree, for example when forming stable column aliases.

The Spark SQL type names need to be defined as tokens in SqlBaseLexer.g4.

Also, typed literals have the same issue: the constructor prefixes "DATE", "TIMESTAMP", "TIMESTAMP_NTZ", "TIMESTAMP_LTZ", "INTERVAL", and "X" should be defined as base lexer tokens.

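As a rough illustration (token names are assumptions, and some of these keywords may already exist in SqlBaseLexer.g4 for other syntax), the tokens could be declared as:

{code}
DATE: 'DATE';
TIMESTAMP: 'TIMESTAMP';
TIMESTAMP_LTZ: 'TIMESTAMP_LTZ';
TIMESTAMP_NTZ: 'TIMESTAMP_NTZ';
INTERVAL: 'INTERVAL';
{code}

Any new keyword tokens would also need to be added to the non-reserved keyword rules in SqlBaseParser.g4 so they remain usable as ordinary identifiers. The "X" prefix of hex literals may need special care, since X'...' is typically recognized by the lexer as a single token.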
> Define literal constructors as keywords
> ---------------------------------------
>
>                 Key: SPARK-42979
>                 URL: https://issues.apache.org/jira/browse/SPARK-42979
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.5.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Major
>             Fix For: 3.5.0
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org