Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2023/03/09 15:19:10 UTC

[GitHub] [spark] beliefer opened a new pull request, #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

beliefer opened a new pull request, #40355:
URL: https://github.com/apache/spark/pull/40355

   ### What changes were proposed in this pull request?
   Spark Connect already supports `functions.lit`, but not `functions.typedlit`.
   This PR adds new messages to the Connect protocol and implements `functions.typedlit`.
   
   
   ### Why are the changes needed?
   Spark Connect needs to support `functions.typedlit`.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   New feature.
   
   
   ### How was this patch tested?
   New test cases.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135757405


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   However, this idea will make the existing cases fail; it needs more consideration.
   
   





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136459542


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      mb
+    }
+
+    def structBuilder(scalaValue: Any, structType: StructType) = {
+      val sb = builder.getStructBuilder.setStructType(toConnectProtoType(structType))

Review Comment:
   I see



##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      mb
+    }
+
+    def structBuilder(scalaValue: Any, structType: StructType) = {
+      val sb = builder.getStructBuilder.setStructType(toConnectProtoType(structType))

Review Comment:
   Thanks ~ I see





[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136447424


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   Thank you for your detailed review.
   AFAIK, `StringConverter` converts a `String` with `UTF8String.fromString`. It seems we don't need to change the code under `dataType.hasString` to add `UTF8String.fromString(v.getString)`.
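   
   For context, a minimal sketch (assuming a Spark Catalyst dependency on the classpath; not part of the PR) showing that `expressions.Literal.create` already performs this conversion for string values:
   
   ```
   import org.apache.spark.sql.catalyst.expressions.Literal
   import org.apache.spark.sql.types.StringType
   import org.apache.spark.unsafe.types.UTF8String
   
   object StringLiteralCheck {
     def main(args: Array[String]): Unit = {
       // Literal.create runs the value through CatalystTypeConverters,
       // so a plain Java String is stored internally as a UTF8String.
       val lit = Literal.create("connect!", StringType)
       assert(lit.value == UTF8String.fromString("connect!"))
       println(lit.value.getClass.getName) // org.apache.spark.unsafe.types.UTF8String
     }
   }
   ```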





[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1139568392


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -134,46 +192,46 @@ object LiteralExpressionProtoConverter {
       builder.result()
     }
 
-    val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toArrayData(v.getArray))
-    } else {
-      throw InvalidPlanInput(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(array.getElementType))
+  }
+
+  private def toMapData(map: proto.Expression.Literal.Map): Any = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
+    }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
+  }
+
+  private def toStructData(struct: proto.Expression.Literal.Struct): Any = {
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements

Review Comment:
   Although it uses Java reflection, it seems good to me. @LuciferYang Thank you.





[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135470030


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala:
##########
@@ -154,46 +215,46 @@ object LiteralValueProtoConverter {
     }
 
     val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toArrayData(v.getArray))
-    } else {
-      throw InvalidPlanInput(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(elementType))
+  }
+
+  private def toMapData(map: proto.Expression.Literal.Map): Any = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
     }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
   }
 
+  private def toStructData(struct: proto.Expression.Literal.Struct): Any = {
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements
+      .zip(dataTypes)
+      .map { case (element, dataType) =>
+        getConverter(dataType)(element)
+      }
+      .toList
+
+    structData match {
+      case List(a) => (a)
+      case List(a, b) => (a, b)
+      case List(a, b, c) => (a, b, c)
+      case List(a, b, c, d) => (a, b, c, d)
+      case List(a, b, c, d, e) => (a, b, c, d, e)
+      case List(a, b, c, d, e, f) => (a, b, c, d, e, f)

Review Comment:
   ping @hvanhovell @zhengruifeng Do you have any other ideas?





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1139690510


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -134,46 +192,46 @@ object LiteralExpressionProtoConverter {
       builder.result()
     }
 
-    val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toArrayData(v.getArray))
-    } else {
-      throw InvalidPlanInput(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(array.getElementType))
+  }
+
+  private def toMapData(map: proto.Expression.Literal.Map): Any = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
+    }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
+  }
+
+  private def toStructData(struct: proto.Expression.Literal.Struct): Any = {
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements

Review Comment:
   When we upgrade to Scala 3, I think we should be able to avoid using Java reflection; see the sketch below.
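   
   For reference, a minimal Scala 3 sketch of that idea (assumption: `scala.Tuple.fromArray`, which builds a tuple of any arity without going through `java.lang.reflect`):
   
   ```
   // Scala 3 only: Tuple.fromArray constructs the TupleN instance directly,
   // so no reflective constructor lookup is needed.
   object StructToTuple:
     def toTuple(data: Array[Object]): Product =
       Tuple.fromArray(data)
   
   @main def demo(): Unit =
     println(StructToTuple.toTuple(Array[Object]("a", Int.box(2), Double.box(1.0)))) // (a,2,1.0)
   ```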
   
   





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133912368


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   ```
   functions.typedlit[mutable.Map[Int, Option[Int]]](mutable.Map(1 -> None))
   ```
   
   in the sql module this returns `MAP(1, CAST(NULL AS INT))`
   
   in the connect client module it throws an `UnsupportedOperationException`:
   
   ```
   literal Map(1 -> None) not supported (yet).
   java.lang.UnsupportedOperationException: literal Map(1 -> None) not supported (yet).
   	at org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.toLiteralProtoBuilder(LiteralValueProtoConverter.scala:100)
   	at org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.$anonfun$create$2(LiteralValueProtoConverter.scala:151)
   	at scala.util.Failure.getOrElse(Try.scala:222)
   	at org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.create(LiteralValueProtoConverter.scala:151)
   	at org.apache.spark.sql.functions$.typedlit(functions.scala:138)
   	at org.apache.spark.sql.PlanGenerationTestSuite.$anonfun$new$441(PlanGenerationTestSuite.scala:2112)
   	at org.apache.spark.sql.PlanGenerationTestSuite.$anonfun$test$1(PlanGenerationTestSuite.scala:115)
   ```





[GitHub] [spark] amaliujia commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.
amaliujia commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1134333517


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -106,6 +106,37 @@ object functions {
       case _ => createLiteral(toLiteralProtoBuilder(literal))
     }
   }
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * An alias of `typedlit`, and it is encouraged to use `typedlit` directly.
+   *
+   * @group normal_funcs
+   * @since 3.4.0
+   */
+  def typedLit[T: TypeTag](literal: T): Column = typedlit(literal)
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * The passed in object is returned directly if it is already a [[Column]]. If the object is a
+   * Scala Symbol, it is converted into a [[Column]] also. Otherwise, a new [[Column]] is created
+   * to represent the literal value. The difference between this function and [[lit]] is that this
+   * function can handle parameterized scala types e.g.: List, Seq and Map.
+   *
+   * @note
+   *   `typedlit` will call expensive Scala reflection APIs. `lit` is preferred if parameterized
+   *   Scala types are not used.

Review Comment:
   Actually yes. I see `ScalaReflection` is introduced below.
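   
   As a usage sketch of the distinction (a minimal example; `lit` and `typedLit` exist in the sql module's `functions`, and per this PR in the connect client as well):
   
   ```
   import org.apache.spark.sql.functions.{lit, typedLit}
   
   object TypedLitUsage {
     def main(args: Array[String]): Unit = {
       // lit handles plain JVM values; typedLit additionally handles
       // parameterized Scala types such as Seq and Map via a TypeTag,
       // which is what makes it the more expensive of the two.
       val simple = lit(1)
       val seqCol = typedLit(Seq(1, 2, 3))
       val mapCol = typedLit(Map("a" -> 1, "b" -> 2))
       println(Seq(simple, seqCol, mapCol).mkString(", "))
     }
   }
   ```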





[GitHub] [spark] LuciferYang commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1539453237

   friendly ping @HyukjinKwon @hvanhovell @zhengruifeng 




[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1484986064

   ping @hvanhovell 




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131890754


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,17 @@ message Expression {
       DataType elementType = 1;
       repeated Literal element = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;
+      map<string, Literal> map_data = 3;

Review Comment:
   Yeah. This way we can support any literal as the map key; a sketch follows below.
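   
   To illustrate, a small sketch (assuming the generated `proto.Expression.Literal` classes, using the same builder calls as the PR's `mapBuilder`) of how the parallel `keys`/`values` lists admit a non-string key:
   
   ```
   import org.apache.spark.connect.proto
   
   object MapKeySketch {
     def main(args: Array[String]): Unit = {
       // Parallel repeated keys/values let the key be any Literal, e.g. an
       // integer key, which a proto `map<string, Literal>` could not express.
       val mb = proto.Expression.Literal.newBuilder().getMapBuilder
       mb.addKeys(proto.Expression.Literal.newBuilder().setInteger(1))
       mb.addValues(proto.Expression.Literal.newBuilder().setString("a"))
       println(mb.getKeysCount) // 1
     }
   }
   ```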





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133926026


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   Another case:
   
   ```
   functions.typedlit[Array[Option[Int]]](Array(Some(1)))
   ```
   
   in the sql module this returns `ARRAY(1)`
   
   in the connect client module it throws an `UnsupportedOperationException`:
   
   ```
   Unsupported component type class scala.Option in arrays.
   java.lang.UnsupportedOperationException: Unsupported component type class scala.Option in arrays.
   ```





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1139170837


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -134,46 +192,46 @@ object LiteralExpressionProtoConverter {
       builder.result()
     }
 
-    val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toArrayData(v.getArray))
-    } else {
-      throw InvalidPlanInput(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(array.getElementType))
+  }
+
+  private def toMapData(map: proto.Expression.Literal.Map): Any = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
+    }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
+  }
+
+  private def toStructData(struct: proto.Expression.Literal.Struct): Any = {
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements

Review Comment:
   Maybe we can define a method similar to the following:
   
   ```
       def toTuple[A <: Object](data: Seq[A]): Product = {
         try {
           val tupleClass = Utils.classForName("scala.Tuple" + data.length)
           tupleClass.getConstructors.head.newInstance(data: _*).asInstanceOf[Product]
         } catch {
           case _: Exception =>
             throw InvalidPlanInput(s"Unsupported Literal: " +
               s"${data.mkString("Array(", ", ", ")")})")
         }
       }
   ```
   
   then 
   
   ```
       val structData = elements
         .zip(dataTypes)
         .map { case (element, dataType) =>
           getConverter(dataType)(element)
         }.toSeq.asInstanceOf[Seq[Object]]
       toTuple(structData)
   ```
   
   Some optimizations can be made to the constructor lookup in `toTuple`, such as caching the constructor per arity; a sketch follows below.
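   
   A minimal sketch of the constructor-caching idea (names are illustrative, not from the PR; `TrieMap.getOrElseUpdate` keeps the cache thread-safe enough for this purpose):
   
   ```
   import java.lang.reflect.Constructor
   import scala.collection.concurrent.TrieMap
   
   object TupleReflection {
     // Cache one Constructor per tuple arity so the reflective lookup
     // happens at most once per arity.
     private val ctorCache = TrieMap.empty[Int, Constructor[_]]
   
     def toTuple(data: Seq[Object]): Product = {
       val ctor = ctorCache.getOrElseUpdate(
         data.length,
         Class.forName("scala.Tuple" + data.length).getConstructors.head)
       ctor.newInstance(data: _*).asInstanceOf[Product]
     }
   
     def main(args: Array[String]): Unit = {
       println(toTuple(Seq("a", Int.box(2)))) // (a,2)
     }
   }
   ```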
   





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136482549


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   ```
   gh pr checkout 40355
   dev/change-scala-version.sh 2.13
   build/sbt "connect-client-jvm/testOnly org.apache.spark.sql.PlanGenerationTestSuite" -Pscala-2.13
   
   ```





[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1464821519

   ping @hvanhovell @HyukjinKwon @zhengruifeng @amaliujia 




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136462174


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   On reconsideration, I think `LiteralExpressionProtoConverter` should only deal with Scala types; `CatalystTypeConverters` can handle converting them to the Catalyst ones.
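   
   For reference, a minimal sketch (assuming Spark's `CatalystTypeConverters`, which `expressions.Literal.create` uses internally) of letting the converter do the Scala-to-Catalyst mapping:
   
   ```
   import org.apache.spark.sql.catalyst.CatalystTypeConverters
   import org.apache.spark.sql.types.{IntegerType, MapType, StringType}
   
   object MapLiteralSketch {
     def main(args: Array[String]): Unit = {
       // createToCatalystConverter turns a plain Scala Map into Catalyst
       // MapData, converting keys/values (e.g. String -> UTF8String) as it goes.
       val toCatalyst = CatalystTypeConverters.createToCatalystConverter(
         MapType(StringType, IntegerType))
       println(toCatalyst(Map("a" -> 1, "b" -> 2)))
     }
   }
   ```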





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135696525


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   For the `STRUCT` case, my thoughts are as follows, for reference only (see the sketch after this list):
   1. the return type of `toStructData` should be `Array[Any]`
   2. `dataType.hasString` in `getConverter` may need to return `UTF8String.fromString(v.getString)`
   3. the result could be `expressions.Literal.create(new GenericInternalRow(structData), dataType)`
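   
   A rough sketch of that direction (hedged: the surrounding `getConverter` changes are assumed, and the struct type here is hand-built for illustration):
   
   ```
   import org.apache.spark.sql.catalyst.expressions
   import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
   import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}
   import org.apache.spark.unsafe.types.UTF8String
   
   object StructLiteralSketch {
     def main(args: Array[String]): Unit = {
       val dataType = StructType(
         StructField("_1", StringType) :: StructField("_2", IntegerType) :: Nil)
       // Per point 2: string fields must already be UTF8String, because an
       // InternalRow handed to Literal gets no further conversion.
       val structData: Array[Any] = Array(UTF8String.fromString("a"), 2)
       println(expressions.Literal(new GenericInternalRow(structData), dataType))
     }
   }
   ```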
   
   
   





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135112879


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   Yeah ~ the test cases are already fairly comprehensive.
   
   





[GitHub] [spark] HyukjinKwon commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1540082810

   Merged to master.




[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1518995122

   cc @ueshin 




[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1477836670

   ping @hvanhovell @HyukjinKwon Do you have time to take a look?




[GitHub] [spark] amaliujia commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "amaliujia (via GitHub)" <gi...@apache.org>.
amaliujia commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1134331640


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -106,6 +106,37 @@ object functions {
       case _ => createLiteral(toLiteralProtoBuilder(literal))
     }
   }
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * An alias of `typedlit`, and it is encouraged to use `typedlit` directly.
+   *
+   * @group normal_funcs
+   * @since 3.4.0
+   */
+  def typedLit[T: TypeTag](literal: T): Column = typedlit(literal)
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * The passed in object is returned directly if it is already a [[Column]]. If the object is a
+   * Scala Symbol, it is converted into a [[Column]] also. Otherwise, a new [[Column]] is created
+   * to represent the literal value. The difference between this function and [[lit]] is that this
+   * function can handle parameterized scala types e.g.: List, Seq and Map.
+   *
+   * @note
+   *   `typedlit` will call expensive Scala reflection APIs. `lit` is preferred if parameterized
+   *   Scala types are not used.

Review Comment:
   Just for my own education:
   
   is this still true in the case of Spark Connect?





[GitHub] [spark] zhengruifeng commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133565363


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,18 @@ message Expression {
       DataType element_type = 1;
       repeated Literal elements = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;

Review Comment:
   ```suggestion
         DataType key_type = 1;
         DataType value_type = 2;
   ```



##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,18 @@ message Expression {
       DataType element_type = 1;
       repeated Literal elements = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;
+      repeated Literal keys = 3;
+      repeated Literal values = 4;
+    }
+
+    message Struct {
+      DataType structType = 1;

Review Comment:
   ```suggestion
         DataType struct_type = 1;
   ```





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135946467


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   @beliefer I wanted to open a pull request against your fork (to make the struct handling more general), but I couldn't find it in the candidate list, so I paste the diff [link](https://github.com/apache/spark/commit/4728d82011a57f92664a3bcfd6191e6a3a9dd97f#diff-0dc55081de1d8f70192f5e9dce5d8a3f00f1f9b1441cad6bce642a68dc39d5d2) here for reference only (all tests passed with Scala 2.12).





[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135728185


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   I made a simple change locally according to my idea and tested it. The results are as follows:
   
   ```
   simple.select(fn.typedlit(("1", 2, true, 3.toDouble, 4.toFloat, 5.toByte, 6.toShort)))
   ```
   
   the `explain` file contents as follows:
   
   ```
   Project [[1,2,true,3.0,4.0,5,6] AS NAMED_STRUCT('_1', '1', '_2', 2, '_3', true, '_4', 3.0D, '_5', CAST('4.0' AS FLOAT), '_6', 5Y, '_7', 6S)#0]
   +- LocalRelation <empty>, [id#0L, a#0, b#0]
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135940847


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   Checked locally: the new test case does not pass with Scala 2.13



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131296289


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,17 @@ message Expression {
       DataType elementType = 1;
       repeated Literal element = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;
+      map<string, Literal> map_data = 3;

Review Comment:
   hmm... can the key of a map be Array/Map/Struct?
   
   My previous thought was to define keys and values as two arrays.
   
   like 
   
   ```
   message Map {
         DataType keyType = 1;
         DataType valueType = 2;
         repeated Literal key = 3;
         repeated Literal value = 4;
   }
   ```
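   
   If we go that way, a client-side serializer could look roughly like this (sketch only, not the final implementation; it assumes the repeated fields are named `keys`/`values` in the generated builder and reuses the existing `toConnectProtoType`/`toLiteralProto` helpers from `LiteralValueProtoConverter`):
   
   ```
   import org.apache.spark.connect.proto
   import org.apache.spark.sql.types.DataType

   // Keys are full Literals here, so Array/Map/Struct keys stay
   // representable, unlike a proto map<string, Literal> which forces
   // string keys.
   def mapToProto(
       map: scala.collection.Map[_, _],
       keyType: DataType,
       valueType: DataType): proto.Expression.Literal.Map = {
     val mb = proto.Expression.Literal.Map.newBuilder()
       .setKeyType(toConnectProtoType(keyType))
       .setValueType(toConnectProtoType(valueType))
     map.foreach { case (k, v) =>
       // Parallel arrays: the i-th key pairs with the i-th value.
       mb.addKeys(toLiteralProto(k, keyType))
       mb.addValues(toLiteralProto(v, valueType))
     }
     mb.build()
   }
   ```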
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1189992713


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   Yeah that is fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1162267083


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -106,6 +106,37 @@ object functions {
       case _ => createLiteral(toLiteralProtoBuilder(literal))
     }
   }
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * An alias of `typedlit`, and it is encouraged to use `typedlit` directly.
+   *
+   * @group normal_funcs
+   * @since 3.4.0

Review Comment:
   should change to 3.5.0



##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/functions.scala:
##########
@@ -106,6 +106,37 @@ object functions {
       case _ => createLiteral(toLiteralProtoBuilder(literal))
     }
   }
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * An alias of `typedlit`, and it is encouraged to use `typedlit` directly.
+   *
+   * @group normal_funcs
+   * @since 3.4.0
+   */
+  def typedLit[T: TypeTag](literal: T): Column = typedlit(literal)
+
+  /**
+   * Creates a [[Column]] of literal value.
+   *
+   * The passed in object is returned directly if it is already a [[Column]]. If the object is a
+   * Scala Symbol, it is converted into a [[Column]] also. Otherwise, a new [[Column]] is created
+   * to represent the literal value. The difference between this function and [[lit]] is that this
+   * function can handle parameterized scala types e.g.: List, Seq and Map.
+   *
+   * @note
+   *   `typedlit` will call expensive Scala reflection APIs. `lit` is preferred if parameterized
+   *   Scala types are not used.
+   *
+   * @group normal_funcs
+   * @since 3.4.0

Review Comment:
   should change to 3.5.0



##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -211,46 +349,48 @@ object LiteralValueProtoConverter {
       builder.result()
     }
 
-    val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toCatalystArray(v.getArray))
-    } else {
-      throw new UnsupportedOperationException(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(array.getElementType))
+  }
+
+  def toCatalystMap(map: proto.Expression.Literal.Map): mutable.Map[_, _] = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
     }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
+  }
+
+  def toCatalystStruct(struct: proto.Expression.Literal.Struct): Any = {
+    def toTuple[A <: Object](data: Seq[A]): Product = {
+      try {
+        val tupleClass = Utils.classForName(s"scala.Tuple${data.length}")
+        tupleClass.getConstructors.head.newInstance(data: _*).asInstanceOf[Product]
+      } catch {
+        case _: Exception =>
+          throw InvalidPlanInput(s"Unsupported Literal: ${data.mkString("Array(", ", ", ")")})")
+      }
+    }
+
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements
+      .zip(dataTypes)
+      .map { case (element, dataType) =>
+        getConverter(dataType)(element)
+      }
+      .asInstanceOf[Seq[Object]]

Review Comment:
   Otherwise, the Scala 2.13 test will fail
   
   ```
   [info] - function_typedLit *** FAILED *** (14 milliseconds)
   [info]   java.lang.ClassCastException: scala.collection.mutable.ArrayBuffer cannot be cast to scala.collection.immutable.Seq
   [info]   at org.apache.spark.sql.connect.common.LiteralValueProtoConverter$.toCatalystStruct(LiteralValueProtoConverter.scala:389)
   [info]   at org.apache.spark.sql.connect.planner.LiteralExpressionProtoConverter$.toCatalystExpression(LiteralExpressionProtoConverter.scala:114)
   [info]   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformLiteral(SparkConnectPlanner.scala:1250)
   [info]   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformExpression(SparkConnectPlanner.scala:1183)
   [info]   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.$anonfun$transformProject$1(SparkConnectPlanner.scala:1165)
   [info]   at scala.collection.immutable.List.map(List.scala:250)
   [info]   at scala.collection.immutable.List.map(List.scala:79)
   [info]   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformProject(SparkConnectPlanner.scala:1165)
   [info]   at org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformRelation(SparkConnectPlanner.scala:96)
   [info]   at org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite.$anonfun$createTest$2(ProtoToParsedPlanTestSuite.scala:167)
   
   ```



##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -211,46 +349,48 @@ object LiteralValueProtoConverter {
       builder.result()
     }
 
-    val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toCatalystArray(v.getArray))
-    } else {
-      throw new UnsupportedOperationException(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(array.getElementType))
+  }
+
+  def toCatalystMap(map: proto.Expression.Literal.Map): mutable.Map[_, _] = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
     }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
+  }
+
+  def toCatalystStruct(struct: proto.Expression.Literal.Struct): Any = {
+    def toTuple[A <: Object](data: Seq[A]): Product = {
+      try {
+        val tupleClass = Utils.classForName(s"scala.Tuple${data.length}")
+        tupleClass.getConstructors.head.newInstance(data: _*).asInstanceOf[Product]
+      } catch {
+        case _: Exception =>
+          throw InvalidPlanInput(s"Unsupported Literal: ${data.mkString("Array(", ", ", ")")})")
+      }
+    }
+
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements
+      .zip(dataTypes)
+      .map { case (element, dataType) =>
+        getConverter(dataType)(element)
+      }
+      .asInstanceOf[Seq[Object]]

Review Comment:
   ```suggestion
         .asInstanceOf[scala.collection.Seq[Object]].toSeq
   ```
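   
   For context, a tiny illustration of why `.toSeq` is needed here: `asScala` on a protobuf `java.util.List` yields a `mutable.Buffer`, `map` keeps it mutable, and on Scala 2.13 the default `Seq` alias is `scala.collection.immutable.Seq`, so the cast throws.
   
   ```
   import scala.jdk.CollectionConverters._

   val javaList = java.util.Arrays.asList("a", "b")
   val mapped = javaList.asScala.map(_.toUpperCase) // mutable.Buffer on 2.13
   // mapped.asInstanceOf[Seq[String]]              // ClassCastException on 2.13
   val fixed = mapped.toSeq                         // an immutable Seq copy
   ```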



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1489622921

   @hvanhovell Could you take a review?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131296289


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,17 @@ message Expression {
       DataType elementType = 1;
       repeated Literal element = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;
+      map<string, Literal> map_data = 3;

Review Comment:
   hmm... can the key of a map be Array/Map/Struct?
   
   My previous thought was to define keys and values as two arrays
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135696525


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   For the `STRUCT`, my thoughts are as follows, for reference only (a rough sketch follows the list):
   1. the return type of `toStructData` should be `Array[Any]`
   2. `dataType.hasString` in `getConverter` may need to change to return `UTF8String.fromString(v.getString)`; otherwise, an error may be reported in step 3
   3. the result may be `expressions.Literal.create(new GenericInternalRow(structData), dataType)`
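   
   A minimal sketch of those three steps together (illustrative field values and types only):
   
   ```
   import org.apache.spark.sql.catalyst.expressions
   import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
   import org.apache.spark.sql.types._
   import org.apache.spark.unsafe.types.UTF8String

   // Steps 1/2: field values already converted to the internal
   // representation, in particular strings as UTF8String.
   val structData: Array[Any] = Array(UTF8String.fromString("connect!"), 42L)
   val dataType = StructType(
     Seq(StructField("s", StringType), StructField("l", LongType)))
   // Step 3: wrap the internal values in a GenericInternalRow.
   val literal = expressions.Literal.create(new GenericInternalRow(structData), dataType)
   ```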
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhenlineo commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "zhenlineo (via GitHub)" <gi...@apache.org>.
zhenlineo commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1147007659


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2069,6 +2069,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Array(1, 2, 3)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),

Review Comment:
   Do we plan to support named struct too? e.g. `{"a":1,"b":2,"c":3}`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133568797


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,44 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)))

Review Comment:
   what about adding a few nested cases?
   e.g.  
   `array<array<...>>`
   `array<map<key, value>>`
   `map<string, map<string, string>>`
   `struct<array<...>, map<k1, v1>, struct<string, map<k2, v2>>>`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133898291


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,56 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(list: List[_], elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+      list.foreach(x => ab.addElements(toLiteralProto(x, elementType)))
+      ab
+    }
+
+    def mapBuilder(map: Map[_, _], keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+      map.foreach { case (k, v) =>
+        mb.addKeys(toLiteralProto(k, keyType))
+        mb.addValues(toLiteralProto(v, valueType))
+      }
+      mb
+    }
+
+    def structBuilder(values: Seq[Any], structType: StructType) = {
+      val sb = builder.getStructBuilder.setStructType(toConnectProtoType(structType))
+      values.zip(structType.fields.map(_.dataType)).foreach { case (v, dataType) =>
+        sb.addElements(toLiteralProto(v, dataType))
+      }
+      sb
+    }
+
+    (literal, dataType) match {
+      case (v: List[_], ArrayType(elementType, _)) =>

Review Comment:
   To support both `mutable` and `immutable` collections, I think we should match `scala.collection.Seq` and `scala.collection.Map`.
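   
   A rough sketch of what I mean (assuming `mapBuilder` is loosened to accept `scala.collection.Map[_, _]`):
   
   ```
   // Widen the matches from the immutable List/Map to the common
   // supertypes so mutable collections take the same path.
   (literal, dataType) match {
     case (v: scala.collection.Seq[_], ArrayType(elementType, _)) =>
       builder.setArray(arrayBuilder(v.toList, elementType))
     case (v: scala.collection.Map[_, _], MapType(keyType, valueType, _)) =>
       // mapBuilder only iterates entries, so it could take
       // scala.collection.Map[_, _] directly.
       builder.setMap(mapBuilder(v, keyType, valueType))
     case _ => // the existing scalar/struct cases stay unchanged
   }
   ```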



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1467866018

   ping @hvanhovell @zhengruifeng @LuciferYang 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133925796


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   Ah, I didn't know about this usage. Thank you for the reminder.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131296289


##########
connector/connect/common/src/main/protobuf/spark/connect/expressions.proto:
##########
@@ -195,6 +197,17 @@ message Expression {
       DataType elementType = 1;
       repeated Literal element = 2;
     }
+
+    message Map {
+      DataType keyType = 1;
+      DataType valueType = 2;
+      map<string, Literal> map_data = 3;

Review Comment:
   hmm... can the key of a map be Array/Map/Struct?
   
   My previous thought was to define keys and values as two arrays
   
   
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1462380542

   Thanks for your work. I'll take a closer look tomorrow
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133568797


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,44 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)))

Review Comment:
   what about adding a few nested cases?
   e.g.  array<array>; array<map>; map<string, map<string, string>>



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136481658


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   How do I reproduce it?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136459267


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   Yeah, my idea is to use `GenericArrayData`, `ArrayBasedMapData` and `GenericInternalRow` to create the `Array`, `Map` and `Struct` literals directly; then the `Struct` type will not be limited by the number of fields. In that scenario, maybe we should return `UTF8String` directly.
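   
   For example, a rough sketch (illustrative values and types only):
   
   ```
   import org.apache.spark.sql.catalyst.expressions.Literal
   import org.apache.spark.sql.catalyst.util.{ArrayBasedMapData, GenericArrayData}
   import org.apache.spark.sql.types._
   import org.apache.spark.unsafe.types.UTF8String

   // Array literal backed directly by GenericArrayData.
   val arrayLit = Literal(
     new GenericArrayData(Array[Any](1, 2, 3)), ArrayType(IntegerType))

   // Map literal backed by ArrayBasedMapData; strings stay as UTF8String.
   val mapLit = Literal(
     new ArrayBasedMapData(
       new GenericArrayData(Array[Any](UTF8String.fromString("a"))),
       new GenericArrayData(Array[Any](1))),
     MapType(StringType, IntegerType))
   ```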
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136486450


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   Run the test with `SPARK_GENERATE_GOLDEN_FILES=1`; the diff is as follows:
   
   (screenshot: https://user-images.githubusercontent.com/1475305/225195564-4d4907da-8e03-4971-bcb3-9175c4433878.png)
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135696525


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   For the `STRUCT`, my thoughts are as follows, for reference only:
   1. the return type of `toStructData` should be `Array[Any]`
   2. `dataType.hasString` in `getConverter` may need to return `UTF8String.fromString(v.getString)`; otherwise, an error may be reported in step 3
   3. the result may be `expressions.Literal.create(new GenericInternalRow(structData), dataType)`
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135468221


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   @LuciferYang Could you take a look again?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1480570911

   ping @hvanhovell @zhenlineo 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135946467


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   @beliefer I wanted to open a pull request against your fork (to make the struct handling more general), but I couldn't find it in the candidate list, so I pasted the diff [link](https://github.com/apache/spark/commit/4728d82011a57f92664a3bcfd6191e6a3a9dd97f#diff-0dc55081de1d8f70192f5e9dce5d8a3f00f1f9b1441cad6bce642a68dc39d5d2) here for reference only (all tests passed with Scala 2.12); I hope it helps



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1133932100


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   Maybe we should refer to `CatalystTypeConverters.ArrayConverter`, `CatalystTypeConverters.MapConverter` and the other `CatalystTypeConverter`s?



##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,48 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),
+      fn.typedLit(Seq(Seq(1, 2, 3), Seq(4, 5, 6), Seq(7, 8, 9))),
+      fn.typedLit(Seq(Map("a" -> 1, "b" -> 2), Map("a" -> 3, "b" -> 4), Map("a" -> 5, "b" -> 6))),
+      fn.typedLit(Map(1 -> Map("a" -> 1, "b" -> 2), 2 -> Map("a" -> 3, "b" -> 4))),
+      fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))

Review Comment:
   Maybe we should refer to `CatalystTypeConverters.ArrayConverter`,  `CatalystTypeConverters.MapConverter` and  other `CatalystTypeConverter`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1189929920


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   @beliefer how complete is this test? In other words, do you know of anything we do not support?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1189940284


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   `UserDefinedType` is not supported, as far as I know.
   
   EDIT: need to check whether we really need to support it.
   
   




[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1541163674

   @HyukjinKwon @LuciferYang @zhengruifeng @amaliujia @zhenlineo Thank you !



[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1147073305


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2069,6 +2069,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),
+      fn.typedLit[String](null),
+      fn.typedLit(true),
+      fn.typedLit(68.toByte),
+      fn.typedLit(9872.toShort),
+      fn.typedLit(-8726532),
+      fn.typedLit(7834609328726532L),
+      fn.typedLit(Math.E),
+      fn.typedLit(-0.8f),
+      fn.typedLit(BigDecimal(8997620, 5)),
+      fn.typedLit(BigDecimal(898897667231L, 7).bigDecimal),
+      fn.typedLit("connect!"),
+      fn.typedLit('T'),
+      fn.typedLit(Array.tabulate(10)(i => ('A' + i).toChar)),
+      fn.typedLit(Array.tabulate(23)(i => (i + 120).toByte)),
+      fn.typedLit(mutable.WrappedArray.make(Array[Byte](8.toByte, 6.toByte))),
+      fn.typedLit(null),
+      fn.typedLit(java.time.LocalDate.of(2020, 10, 10)),
+      fn.typedLit(Decimal.apply(BigDecimal(8997620, 6))),
+      fn.typedLit(java.time.Instant.ofEpochMilli(1677155519808L)),
+      fn.typedLit(new java.sql.Timestamp(12345L)),
+      fn.typedLit(java.time.LocalDateTime.of(2023, 2, 23, 20, 36)),
+      fn.typedLit(java.sql.Date.valueOf("2023-02-23")),
+      fn.typedLit(java.time.Duration.ofSeconds(200L)),
+      fn.typedLit(java.time.Period.ofDays(100)),
+      fn.typedLit(new CalendarInterval(2, 20, 100L)),
+
+      // Handle parameterized scala types e.g.: List, Seq and Map.
+      fn.typedLit(Some(1)),
+      fn.typedLit(Array(1, 2, 3)),
+      fn.typedLit(Seq(1, 2, 3)),
+      fn.typedLit(Map("a" -> 1, "b" -> 2)),
+      fn.typedLit(("a", 2, 1.0)),

Review Comment:
   It seems `typedLit` doesn't support this usage.




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136606050


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,55 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {

Review Comment:
   I tested this case on my local machine. It should be fixed.




[GitHub] [spark] beliefer commented on pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on PR #40355:
URL: https://github.com/apache/spark/pull/40355#issuecomment-1469539349

   @hvanhovell Could you have time to take a look ? Thanks!



[GitHub] [spark] HyukjinKwon closed pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "HyukjinKwon (via GitHub)" <gi...@apache.org>.
HyukjinKwon closed pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit
URL: https://github.com/apache/spark/pull/40355




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131922264


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralValueProtoConverter.scala:
##########
@@ -154,46 +215,46 @@ object LiteralValueProtoConverter {
     }
 
     val elementType = array.getElementType
-    if (elementType.hasShort) {
-      makeArrayData(v => v.getShort.toShort)
-    } else if (elementType.hasInteger) {
-      makeArrayData(v => v.getInteger)
-    } else if (elementType.hasLong) {
-      makeArrayData(v => v.getLong)
-    } else if (elementType.hasDouble) {
-      makeArrayData(v => v.getDouble)
-    } else if (elementType.hasByte) {
-      makeArrayData(v => v.getByte.toByte)
-    } else if (elementType.hasFloat) {
-      makeArrayData(v => v.getFloat)
-    } else if (elementType.hasBoolean) {
-      makeArrayData(v => v.getBoolean)
-    } else if (elementType.hasString) {
-      makeArrayData(v => v.getString)
-    } else if (elementType.hasBinary) {
-      makeArrayData(v => v.getBinary.toByteArray)
-    } else if (elementType.hasDate) {
-      makeArrayData(v => DateTimeUtils.toJavaDate(v.getDate))
-    } else if (elementType.hasTimestamp) {
-      makeArrayData(v => DateTimeUtils.toJavaTimestamp(v.getTimestamp))
-    } else if (elementType.hasTimestampNtz) {
-      makeArrayData(v => DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz))
-    } else if (elementType.hasDayTimeInterval) {
-      makeArrayData(v => IntervalUtils.microsToDuration(v.getDayTimeInterval))
-    } else if (elementType.hasYearMonthInterval) {
-      makeArrayData(v => IntervalUtils.monthsToPeriod(v.getYearMonthInterval))
-    } else if (elementType.hasDecimal) {
-      makeArrayData(v => Decimal(v.getDecimal.getValue))
-    } else if (elementType.hasCalendarInterval) {
-      makeArrayData(v => {
-        val interval = v.getCalendarInterval
-        new CalendarInterval(interval.getMonths, interval.getDays, interval.getMicroseconds)
-      })
-    } else if (elementType.hasArray) {
-      makeArrayData(v => toArrayData(v.getArray))
-    } else {
-      throw InvalidPlanInput(s"Unsupported Literal Type: $elementType)")
+    makeArrayData(getConverter(elementType))
+  }
+
+  private def toMapData(map: proto.Expression.Literal.Map): Any = {
+    def makeMapData[K, V](
+        keyConverter: proto.Expression.Literal => K,
+        valueConverter: proto.Expression.Literal => V)(implicit
+        tagK: ClassTag[K],
+        tagV: ClassTag[V]): mutable.Map[K, V] = {
+      val builder = mutable.HashMap.empty[K, V]
+      val keys = map.getKeysList.asScala
+      val values = map.getValuesList.asScala
+      builder.sizeHint(keys.size)
+      keys.zip(values).foreach { case (key, value) =>
+        builder += ((keyConverter(key), valueConverter(value)))
+      }
+      builder
     }
+
+    makeMapData(getConverter(map.getKeyType), getConverter(map.getValueType))
   }
 
+  private def toStructData(struct: proto.Expression.Literal.Struct): Any = {
+    val elements = struct.getElementsList.asScala
+    val dataTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
+    val structData = elements
+      .zip(dataTypes)
+      .map { case (element, dataType) =>
+        getConverter(dataType)(element)
+      }
+      .toList
+
+    structData match {
+      case List(a) => (a)
+      case List(a, b) => (a, b)
+      case List(a, b, c) => (a, b, c)
+      case List(a, b, c, d) => (a, b, c, d)
+      case List(a, b, c, d, e) => (a, b, c, d, e)
+      case List(a, b, c, d, e, f) => (a, b, c, d, e, f)

Review Comment:
   This code looks bloated, but I have no idea how else to convert a List into a Tuple[N].
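   
   One generic alternative, sketched under the assumption that reflective construction is acceptable (`listToTuple` is a hypothetical helper, not code from this PR):
   
   ```
   // Builds a scala.TupleN reflectively for 1 <= n <= 22, avoiding the
   // hand-written cases. Generic tuple parameters erase to Object, so the
   // single public constructor accepts the boxed values directly.
   def listToTuple(values: List[Any]): Product = {
     val n = values.size
     require(n >= 1 && n <= 22, s"cannot build a Tuple$n")
     Class
       .forName(s"scala.Tuple$n")
       .getConstructors
       .head
       .newInstance(values.map(_.asInstanceOf[AnyRef]): _*)
       .asInstanceOf[Product]
   }
   ```
   
   Reflection trades compile-time safety for brevity, which may be a reason to keep the explicit cases.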




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131919854


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,43 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),

Review Comment:
   Thank you for the reminder.




[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1131290738


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/PlanGenerationTestSuite.scala:
##########
@@ -2065,6 +2065,43 @@ class PlanGenerationTestSuite
       fn.lit(Array(new CalendarInterval(2, 20, 100L), new CalendarInterval(2, 21, 200L))))
   }
 
+  test("function typedLit") {
+    simple.select(
+      fn.typedLit(fn.col("id")),
+      fn.typedLit('id),
+      fn.typedLit(1),

Review Comment:
   If I remember correctly, `typedLit` in the sql module supports `Option` values, like `fn.typedLit(Some(1))`. Will `typedLit` in the connect client support it as well?
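   
   For reference, a quick sketch of the sql-module behavior I mean (my understanding, worth double-checking):
   
   ```
   import org.apache.spark.sql.functions.typedLit
   
   // typedLit resolves the schema from the TypeTag, so Option is unwrapped:
   val some = typedLit(Some(1))           // integer literal 1
   val none = typedLit(Option.empty[Int]) // null literal typed as int
   ```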
   
   
   




[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135946467


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   I wanted to open a pull request against your fork (to make the struct handling more general), but I couldn't find it in the candidate list, so I paste the diff [link](https://github.com/apache/spark/commit/4728d82011a57f92664a3bcfd6191e6a3a9dd97f#diff-0dc55081de1d8f70192f5e9dce5d8a3f00f1f9b1441cad6bce642a68dc39d5d2) here for reference only (all tests passed with Scala 2.12).




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136439396


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      mb
+    }
+
+    def structBuilder(scalaValue: Any, structType: StructType) = {
+      val sb = builder.getStructBuilder.setStructType(toConnectProtoType(structType))

Review Comment:
   `fn.typedLit(("a", 2, 1.0))` and `fn.typedLit((Seq(1, 2, 3), Map("a" -> 1, "b" -> 2), ("a", Map(1 -> "a", 2 -> "b")))))`




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136437829


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>

Review Comment:
   For `java.util.Map`, the same as `java.lang.Iterable`: `ScalaReflection.schemaFor[T]` will throw an error.
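   
   A small sketch of that failure mode, as I understand it:
   
   ```
   import org.apache.spark.sql.catalyst.ScalaReflection
   
   // Scala collection types resolve fine:
   ScalaReflection.schemaFor[Map[String, Int]]
   // => Schema(MapType(StringType, IntegerType, ...), nullable = true)
   
   // Java collection types do not; this throws an UnsupportedOperationException
   // ("Schema for type ... is not supported").
   ScalaReflection.schemaFor[java.util.Map[String, Int]]
   ```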




[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135728185


##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   I made a simple change locally according to my idea and tested:
   
   ```
   simple.select(fn.typedlit(("1", 2, true, 3.toDouble, 4.toFloat, 5.toByte, 6.toShort)))
   ```
   
   The `explain` file content is as follows:
   
   ```
   Project [[1,2,true,3.0,4.0,5,6] AS NAMED_STRUCT('_1', '1', '_2', 2, '_3', true, '_4', 3.0D, '_5', CAST('4.0' AS FLOAT), '_6', 5Y, '_7', 6S)#0]
   +- LocalRelation <empty>, [id#0L, a#0, b#0]
   ```
   




[GitHub] [spark] LuciferYang commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "LuciferYang (via GitHub)" <gi...@apache.org>.
LuciferYang commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1135515854


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>

Review Comment:
   So it's impossible to match `java.lang.Iterable` here, right? I tried the API in the sql module, which also fails with `The feature is not supported`.


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>

Review Comment:
   Similarly, it can't be `JMap` here, right?
   
   



##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -121,6 +135,53 @@ object LiteralExpressionProtoConverter {
     }
   }
 
+  private def getConverter(dataType: proto.DataType): proto.Expression.Literal => Any = {
+    if (dataType.hasShort) { v =>
+      v.getShort.toShort
+    } else if (dataType.hasInteger) { v =>
+      v.getInteger
+    } else if (dataType.hasLong) { v =>
+      v.getLong
+    } else if (dataType.hasDouble) { v =>
+      v.getDouble
+    } else if (dataType.hasByte) { v =>
+      v.getByte.toByte
+    } else if (dataType.hasFloat) { v =>
+      v.getFloat
+    } else if (dataType.hasBoolean) { v =>
+      v.getBoolean
+    } else if (dataType.hasString) { v =>
+      v.getString
+    } else if (dataType.hasBinary) { v =>
+      v.getBinary.toByteArray
+    } else if (dataType.hasDate) { v =>
+      DateTimeUtils.toJavaDate(v.getDate)
+    } else if (dataType.hasTimestamp) { v =>
+      DateTimeUtils.toJavaTimestamp(v.getTimestamp)
+    } else if (dataType.hasTimestampNtz) { v =>
+      DateTimeUtils.microsToLocalDateTime(v.getTimestampNtz)
+    } else if (dataType.hasDayTimeInterval) { v =>
+      IntervalUtils.microsToDuration(v.getDayTimeInterval)
+    } else if (dataType.hasYearMonthInterval) { v =>
+      IntervalUtils.monthsToPeriod(v.getYearMonthInterval)
+    } else if (dataType.hasDecimal) { v =>
+      Decimal(v.getDecimal.getValue)
+    } else if (dataType.hasCalendarInterval) { v =>
+      {

Review Comment:
   Do we need this bracket?
   
   



##########
connector/connect/server/src/main/scala/org/apache/spark/sql/connect/planner/LiteralExpressionProtoConverter.scala:
##########
@@ -106,6 +107,19 @@ object LiteralExpressionProtoConverter {
           toArrayData(lit.getArray),
           ArrayType(DataTypeProtoConverter.toCatalystType(lit.getArray.getElementType)))
 
+      case proto.Expression.Literal.LiteralTypeCase.MAP =>
+        expressions.Literal.create(
+          toMapData(lit.getMap),
+          MapType(
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getKeyType),
+            DataTypeProtoConverter.toCatalystType(lit.getMap.getValueType)))
+
+      case proto.Expression.Literal.LiteralTypeCase.STRUCT =>

Review Comment:
   For the `STRUCT` case, my thoughts are as follows, for reference only (a rough sketch follows the list):
   1. the return type of `structData` should be `Array[Any]`
   2. `dataType.hasString` in `getConverter` may need to return `UTF8String.fromString(v.getString)`
   3. the result could be `expressions.Literal.create(new GenericInternalRow(structData), dataType)`
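   
   Putting those three points together, a rough sketch (`toStructLiteral` is a hypothetical helper reusing the PR's `getConverter` and `DataTypeProtoConverter`; untested):
   
   ```
   import scala.collection.JavaConverters._
   
   import org.apache.spark.connect.proto
   import org.apache.spark.sql.catalyst.expressions
   import org.apache.spark.sql.catalyst.expressions.GenericInternalRow
   
   // Relies on getConverter returning catalyst-internal values here,
   // e.g. UTF8String for string fields (point 2 above).
   private def toStructLiteral(struct: proto.Expression.Literal.Struct): expressions.Literal = {
     val fieldTypes = struct.getStructType.getStruct.getFieldsList.asScala.map(_.getDataType)
     val structData: Array[Any] = struct.getElementsList.asScala
       .zip(fieldTypes)
       .map { case (element, fieldType) => getConverter(fieldType)(element) }
       .toArray
     // Wrapping in an InternalRow means Literal.create needs no further conversion.
     expressions.Literal.create(
       new GenericInternalRow(structData),
       DataTypeProtoConverter.toCatalystType(struct.getStructType))
   }
   ```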
   
   
   



##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>
+          s.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      ab
+    }
+
+    def mapBuilder(scalaValue: Any, keyType: DataType, valueType: DataType) = {
+      val mb = builder.getMapBuilder
+        .setKeyType(toConnectProtoType(keyType))
+        .setValueType(toConnectProtoType(valueType))
+
+      scalaValue match {
+        case map: scala.collection.Map[_, _] =>
+          map.foreach { case (k, v) =>
+            mb.addKeys(toLiteralProto(k, keyType))
+            mb.addValues(toLiteralProto(v, valueType))
+          }
+        case other =>
+          throw new IllegalArgumentException(s"literal $other not supported (yet).")
+      }
+
+      mb
+    }
+
+    def structBuilder(scalaValue: Any, structType: StructType) = {
+      val sb = builder.getStructBuilder.setStructType(toConnectProtoType(structType))

Review Comment:
   hmm... it seems no test case covers `structBuilder`?




[GitHub] [spark] beliefer commented on a diff in pull request #40355: [SPARK-42604][CONNECT] Implement functions.typedlit

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #40355:
URL: https://github.com/apache/spark/pull/40355#discussion_r1136437153


##########
connector/connect/common/src/main/scala/org/apache/spark/sql/connect/common/LiteralValueProtoConverter.scala:
##########
@@ -97,6 +101,87 @@ object LiteralValueProtoConverter {
     }
   }
 
+  def toLiteralProtoBuilder(
+      literal: Any,
+      dataType: DataType): proto.Expression.Literal.Builder = {
+    val builder = proto.Expression.Literal.newBuilder()
+
+    def arrayBuilder(scalaValue: Any, elementType: DataType) = {
+      val ab = builder.getArrayBuilder.setElementType(toConnectProtoType(elementType))
+
+      scalaValue match {
+        case a: Array[_] =>
+          a.foreach(item => ab.addElements(toLiteralProto(item, elementType)))
+        case s: scala.collection.Seq[_] =>

Review Comment:
   Yes. I tried this API too. In fact, `ScalaReflection.schemaFor[T]` will throw an error.


