Posted to reviews@spark.apache.org by "MaxGekk (via GitHub)" <gi...@apache.org> on 2024/03/12 01:38:26 UTC

Re: [PR] [SPARK-46654][SQL][PYTHON] Make `to_csv` explicitly indicate that it does not support some types of data [spark]

MaxGekk commented on code in PR #44665:
URL: https://github.com/apache/spark/pull/44665#discussion_r1520694548


##########
python/pyspark/sql/functions/builtin.py:
##########
@@ -15534,19 +15532,7 @@ def to_csv(col: "ColumnOrName", options: Optional[Dict[str, str]] = None) -> Col
     |      2,Alice|
     +-------------+
 
-    Example 2: Converting a complex StructType to a CSV string
-
-    >>> from pyspark.sql import Row, functions as sf
-    >>> data = [(1, Row(age=2, name='Alice', scores=[100, 200, 300]))]
-    >>> df = spark.createDataFrame(data, ("key", "value"))
-    >>> df.select(sf.to_csv(df.value)).show(truncate=False) # doctest: +SKIP
-    +-----------------------+
-    |to_csv(value)          |
-    +-----------------------+
-    |2,Alice,"[100,200,300]"|

Review Comment:
   > I think we should simply disallow this. We could have a legacy conf, but it doesn't make much sense. What does it return for ArrayType and MapType?
   
   @HyukjinKwon See what it returns for `ArrayType` in the removed example: the array is rendered as the quoted field `"[100,200,300]"`. I doubt that we should simply disallow it without at least a legacy config.
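
   To see why the array field in the removed doctest appears in double quotes, here is a minimal sketch in plain Python. It does not use Spark and is not Spark's implementation. It assumes the array is first rendered as a bracketed, comma-joined string, and it relies only on the standard `csv` module's default minimal quoting. The helper name `to_csv_line` is hypothetical.

```python
import csv
import io


def to_csv_line(fields):
    """Serialize one row with Python's csv module (not Spark's CSV writer)."""
    buf = io.StringIO()
    # Default quoting is QUOTE_MINIMAL: only fields that contain the
    # delimiter, the quote character, or a line break get quoted.
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")


# Assumption: the array value is stringified as "[100,200,300]" before writing.
scores = [100, 200, 300]
scores_text = "[" + ",".join(str(s) for s in scores) + "]"

line = to_csv_line([2, "Alice", scores_text])
print(line)
```

   The array text contains commas, so the writer quotes it, and the printed line has the same shape as the output in the removed example. Reading that field back yields a string, not an array.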



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

