Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/30 07:43:19 UTC

[GitHub] [spark] itholic opened a new pull request, #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

itholic opened a new pull request, #37722:
URL: https://github.com/apache/spark/pull/37722

   ### What changes were proposed in this pull request?
   
   This PR proposes to support `list` type for `pyspark.sql.functions.lit`.
   
   ### Why are the changes needed?
   
   To improve the API usability.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes, the `list` type is now supported by `pyspark.sql.functions.lit`, as shown below:
   
   - Before
   ```python
   >>> spark.range(3).withColumn("c", lit([1,2,3])).show()
   Traceback (most recent call last):
   ...
   : org.apache.spark.SparkRuntimeException: [UNSUPPORTED_FEATURE.LITERAL_TYPE] The feature is not supported: Literal for '[1, 2, 3]' of class java.util.ArrayList.
   	at org.apache.spark.sql.errors.QueryExecutionErrors$.literalTypeUnsupportedError(QueryExecutionErrors.scala:302)
   	at org.apache.spark.sql.catalyst.expressions.Literal$.apply(literals.scala:100)
   	at org.apache.spark.sql.functions$.lit(functions.scala:125)
   	at org.apache.spark.sql.functions.lit(functions.scala)
   	at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
   	at java.base/java.lang.reflect.Method.invoke(Method.java:577)
   	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
   	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:374)
   	at py4j.Gateway.invoke(Gateway.java:282)
   	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
   	at py4j.commands.CallCommand.execute(CallCommand.java:79)
   	at py4j.ClientServerConnection.waitForCommands(ClientServerConnection.java:182)
   	at py4j.ClientServerConnection.run(ClientServerConnection.java:106)
   	at java.base/java.lang.Thread.run(Thread.java:833)
   ```
   
   - After
   ```python
   >>> spark.range(3).withColumn("c", lit([1,2,3])).show()
   +---+---------+
   | id|        c|
   +---+---------+
   |  0|[1, 2, 3]|
   |  1|[1, 2, 3]|
   |  2|[1, 2, 3]|
   +---+---------+
   ```
   
   
   ### How was this patch tested?
   
   Added doctest & unit test.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] itholic commented on pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37722:
URL: https://github.com/apache/spark/pull/37722#issuecomment-1231393927

   Thanks, @HyukjinKwon and @zhengruifeng for the review!




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959044498


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   `df.id` isn't a way to access a column in Scala. Try either `$"id"` or `df.col("id")`.





[GitHub] [spark] itholic commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959043895


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   qq: It seems the Scala side doesn't even allow the single `Column` case?
   
   ```scala
   scala> df.select(lit(df.id)).show()
   <console>:24: error: value id in class Dataset cannot be accessed in org.apache.spark.sql.Dataset[Long]
          df.select(lit(df.id)).show()
   ```
   
   which is allowed on the Python side:
   
   ```python
   >>> df.select(F.lit(df.id)).show()
   +---+
   | id|
   +---+
   |  0|
   |  1|
   |  2|
   |  3|
   |  4|
   |  5|
   |  6|
   |  7|
   |  8|
   |  9|
   +---+
   ```
   
   Do we also want to block this case if we block the `[Column, Column]` case?





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958291678


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   Actually, we should probably block the `[Column, Column]` case because the Scala side does not allow it.







[GitHub] [spark] itholic commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959034285


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   Let me address it. Thanks!





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959082184


##########
python/pyspark/sql/tests/test_functions.py:
##########
@@ -984,6 +984,10 @@ def test_lit_list(self):
         actual = self.spark.range(1).select(lit(test_list)).first()[0]
         self.assertEqual(actual, expected)
 
+        df = self.spark.range(10)
+        with self.assertRaisesRegex(ValueError, "lit does not allow for list of Columns"):

Review Comment:
   ```suggestion
           with self.assertRaisesRegex(ValueError, "lit does not allow a column in a list"):
   ```
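The reason the test pattern has to change together with the error message can be sketched as follows. `assertRaisesRegex` searches the exception text with the given pattern, so a stale pattern makes the test fail. This is a minimal sketch that runs without Spark: `FakeColumn` and `lit_stub` are hypothetical stand-ins, not PySpark APIs, and only the error path is modelled.

```python
import unittest


class FakeColumn:
    """Hypothetical placeholder for a Column object (not the PySpark class)."""


def lit_stub(values):
    """Hypothetical stand-in for lit() that reproduces only the error path."""
    if any(isinstance(v, FakeColumn) for v in values):
        raise ValueError("lit does not allow a column in a list")
    return values


class LitListErrorTest(unittest.TestCase):
    def test_reworded_message_matches(self):
        # The pattern is searched inside str(exception), so it must agree
        # with the wording used in the raise statement.
        with self.assertRaisesRegex(ValueError, "lit does not allow a column in a list"):
            lit_stub([FakeColumn()])

    def test_stale_pattern_fails(self):
        # With the old wording as the pattern, assertRaisesRegex itself
        # fails with an AssertionError because the text no longer matches.
        with self.assertRaises(AssertionError):
            with self.assertRaisesRegex(ValueError, "lit does not allow for list of Columns"):
                lit_stub([FakeColumn()])
```

Running the case with `python -m unittest` should report both tests as passing, which illustrates why the message in `functions.py` and the pattern in `test_functions.py` need to be updated in the same change.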





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958124468


##########
python/pyspark/sql/tests/test_functions.py:
##########
@@ -962,6 +962,11 @@ def test_lit_day_time_interval(self):
         actual = self.spark.range(1).select(lit(td)).first()[0]
         self.assertEqual(actual, td)
 
+    def test_lit_list(self):
+        test_list = [1, 2, 3]

Review Comment:
   We should probably have one more test with mixed types, e.g. `["a", 1, None, 1.0]`





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958289551


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()

Review Comment:
   ```suggestion
       >>> spark.range(1).select(lit([1, 2, 3])).show()
   ```





[GitHub] [spark] zhengruifeng commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958127853


##########
python/pyspark/sql/tests/test_functions.py:
##########
@@ -962,6 +962,11 @@ def test_lit_day_time_interval(self):
         actual = self.spark.range(1).select(lit(td)).first()[0]
         self.assertEqual(actual, td)
 
+    def test_lit_list(self):
+        test_list = [1, 2, 3]

Review Comment:
   I think we can also add a test case like `[[1,2,3], [3,4]]`





[GitHub] [spark] zhengruifeng commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959027515


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   So it may look like this:
   ```
       if isinstance(col, Column):
           return col
       elif isinstance(col, list):
           if any(isinstance(c, Column) for c in col):
               raise ValueError(...)
           return array(*[lit(item) for item in col])
       else:
           return _invoke_function("lit", col)
   ```
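The control flow proposed above can be tried without a Spark session using a small model. This is a sketch under stated assumptions: `Column` and `array` below are hypothetical stand-ins that only build a printable expression string, the final branch replaces `_invoke_function("lit", col)` with `repr`, and the error text is borrowed from the diff quoted elsewhere in this thread.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Column:
    """Stand-in for pyspark.sql.Column; holds only a printable expression."""
    expr: str


def array(*cols: Column) -> Column:
    """Stand-in for pyspark.sql.functions.array."""
    return Column("array(" + ", ".join(c.expr for c in cols) + ")")


def lit(col: Any) -> Column:
    if isinstance(col, Column):
        # An existing Column passes through unchanged.
        return col
    elif isinstance(col, list):
        # A Column placed directly inside a list is rejected.
        if any(isinstance(c, Column) for c in col):
            raise ValueError("lit does not allow for list of Columns")
        # Each element becomes its own literal; nested lists recurse.
        return array(*[lit(item) for item in col])
    else:
        # Stand-in for _invoke_function("lit", col).
        return Column(repr(col))


print(lit([1, 2, 3]).expr)             # array(1, 2, 3)
print(lit([[1, 2, 3], [3, 4]]).expr)   # array(array(1, 2, 3), array(3, 4))
```

Because the list branch calls `lit` on every element, a nested list such as `[[1, 2, 3], [3, 4]]` is handled by the same code path, and the `Column` check is applied at each nesting level.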





[GitHub] [spark] zhengruifeng commented on pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37722:
URL: https://github.com/apache/spark/pull/37722#issuecomment-1232425222

   Oops, sorry that @HyukjinKwon's suggestions didn't get committed. @itholic, would you mind updating them in a follow-up? Or I can do that too.




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958124746


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +149,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Support for list

Review Comment:
   Let's also add `versionchanged` directive for the list support.
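For reference, a `versionchanged` note in a docstring could look roughly like the sketch below. The version number `X.Y.Z` is a placeholder, not the release this change shipped in, and the wording of the note and the function body are illustrative assumptions rather than the text that was committed.

```python
def lit(col):
    """
    Creates a Column of literal value.

    .. versionchanged:: X.Y.Z
        Accepts a ``list`` and converts it to an array literal.
    """
    # Documentation sketch only; the real implementation lives in
    # python/pyspark/sql/functions.py.
    raise NotImplementedError("documentation sketch only")
```

Sphinx renders the directive as a "Changed in version ..." note in the generated API page, so readers can see when list support was introduced.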





[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958120567


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +149,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Support for list

Review Comment:
   ```suggestion
       Create a literal from a list.
   ```



##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +149,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Support for list

Review Comment:
   Can you fix "Python primitive type." in the docstring above?





[GitHub] [spark] itholic commented on pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37722:
URL: https://github.com/apache/spark/pull/37722#issuecomment-1231275463

   cc @zhengruifeng 




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959081857


##########
python/pyspark/sql/functions.py:
##########
@@ -155,15 +155,21 @@ def lit(col: Any) -> Column:
 
     Create a literal from a list.
 
-    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    >>> spark.range(1).select(lit([1, 2, 3])).show()
     +--------------+
     |array(1, 2, 3)|
     +--------------+
     |     [1, 2, 3]|
     +--------------+
     """
-    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col
-    return col if isinstance(col, Column) else _invoke_function("lit", col)
+    if isinstance(col, Column):
+        return col
+    elif isinstance(col, list):
+        if any(isinstance(c, Column) for c in col):
+            raise ValueError("lit does not allow for list of Columns")

Review Comment:
   ```suggestion
               raise ValueError("lit does not allow a column in a list")
   ```





[GitHub] [spark] zhengruifeng commented on pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37722:
URL: https://github.com/apache/spark/pull/37722#issuecomment-1232421895

   Merged to master, thank you!




[GitHub] [spark] itholic commented on pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on PR #37722:
URL: https://github.com/apache/spark/pull/37722#issuecomment-1232486225

   @zhengruifeng Let me address it. Thanks! :-)




[GitHub] [spark] zhengruifeng closed pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.
URL: https://github.com/apache/spark/pull/37722




[GitHub] [spark] HyukjinKwon commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r958123331


##########
python/pyspark/sql/tests/test_functions.py:
##########
@@ -962,6 +962,11 @@ def test_lit_day_time_interval(self):
         actual = self.spark.range(1).select(lit(td)).first()[0]
         self.assertEqual(actual, td)
 
+    def test_lit_list(self):
+        test_list = [1, 2, 3]

Review Comment:
   Let's add a comment like
   
   ```
   # SPARK-12345: a short description of the test
   ```
   
   See https://spark.apache.org/contributing.html





[GitHub] [spark] itholic commented on a diff in pull request #37722: [SPARK-40271][PYTHON] Support list type for `pyspark.sql.functions.lit`.

Posted by GitBox <gi...@apache.org>.
itholic commented on code in PR #37722:
URL: https://github.com/apache/spark/pull/37722#discussion_r959047939


##########
python/pyspark/sql/functions.py:
##########
@@ -149,7 +152,17 @@ def lit(col: Any) -> Column:
     +------+---+
     |     5|  0|
     +------+---+
+
+    Create a literal from a list.
+
+    >>> spark.range(1).select(F.lit([1, 2, 3])).show()
+    +--------------+
+    |array(1, 2, 3)|
+    +--------------+
+    |     [1, 2, 3]|
+    +--------------+
     """
+    col = array(*[lit(item) for item in col]) if isinstance(col, list) else col

Review Comment:
   Thanks!


