Posted to reviews@spark.apache.org by "Don-Burns (via GitHub)" <gi...@apache.org> on 2023/07/28 10:12:23 UTC

[GitHub] [spark] Don-Burns commented on pull request #33428: [SPARK-36220][PYTHON] Fix pyspark.sql.types.Row type annotation

Don-Burns commented on PR #33428:
URL: https://github.com/apache/spark/pull/33428#issuecomment-1655435663

   I am jumping in very late on this, but I am hoping to learn from it.
   If you are creating a DataFrame from scratch, what is the suggested way to create rows with null values, given that passing non-strings as positional args is discouraged (a code smell)?
   Some column names are valid in Spark but cannot be written as Python keyword arguments, e.g. names that contain a dash.
   
   
   For example, I define the schema separately and build my row data to create the DataFrame:
   
   ```python
   from pyspark.sql import SparkSession
   from pyspark.sql.types import Row, StringType, StructField, StructType
   
   spark = SparkSession.builder.getOrCreate()
   schema = StructType(
       [
           StructField("some-col", StringType(), True),
       ]
   )
   
   data = [Row("a value"), Row(None)]
   
   df = spark.createDataFrame(data=data, schema=schema)
   df.show()
   ```
   ![image](https://github.com/apache/spark/assets/56016914/c4eaec66-484c-42d2-bc31-229e657fb58d)
   ![image](https://github.com/apache/spark/assets/56016914/bb9368fb-78a6-4db9-803a-3097eb91ea4d)
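
The keyword limitation mentioned above can be illustrated in plain Python, without Spark. This is only a sketch: `make_record` is a hypothetical stand-in for any constructor that accepts keyword arguments, not part of PySpark, and whether the same pattern is the recommended approach for `Row` is not confirmed in this thread.

```python
# Hypothetical stand-in for a keyword-accepting constructor;
# it only records the names and values it receives.
def make_record(**fields):
    return dict(fields)


# A dashed name is not a valid Python identifier, so it cannot be written
# as a literal keyword argument: the call below does not even compile.
try:
    compile("make_record(some-col=None)", "<example>", "eval")
    literal_keyword_ok = True
except SyntaxError:
    literal_keyword_ok = False

# Dictionary unpacking is not subject to the identifier rule, so the same
# name can still be passed by keyword, including with a None value.
record = make_record(**{"some-col": None})
```

Here `literal_keyword_ok` ends up `False`, while `record` holds the dashed name mapped to `None`.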
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org