Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/04/24 10:22:01 UTC

[GitHub] [spark] sadhen opened a new pull request #32320: [SPARK-35211][PYSPARK] _create_dataframe: infer schema earlier and do type check

sadhen opened a new pull request #32320:
URL: https://github.com/apache/spark/pull/32320


   
   ### What changes were proposed in this pull request?
   Infer the schema earlier and perform a type check.
   
   This PR fixes SPARK-35211 when schema verification is turned on. With schema verification turned off, the bug described in SPARK-35211 still exists; I will open a separate PR for that case.
   
   
   ### Why are the changes needed?
   ``` python
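   # Run in the pyspark shell, where `spark` is the active SparkSession.
   # Disable Arrow so that createDataFrame takes the non-Arrow conversion path.
   spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
   from pyspark.testing.sqlutils import ExamplePoint
   import pandas as pd
   # A pandas column of user-defined-type (UDT) objects with integer coordinates.
   pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), ExamplePoint(2, 2)])})
   df = spark.createDataFrame(pdf)
   df.show()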
   ```
   The result shown is incorrect because of an incorrect type conversion.
   
   With this PR, a type check is performed and the mismatch is reported as an error:
   ```
   (spark) ➜  spark git:(sadhen/SPARK-35211) ✗ bin/pyspark
   Python 3.8.8 (default, Feb 24 2021, 13:46:16)
   [Clang 10.0.0 ] :: Anaconda, Inc. on darwin
   Type "help", "copyright", "credits" or "license" for more information.
   Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
   Setting default log level to "WARN".
   To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
   21/04/24 17:42:23 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /__ / .__/\_,_/_/ /_/\_\   version 3.2.0-SNAPSHOT
         /_/
   
   Using Python version 3.8.8 (default, Feb 24 2021 13:46:16)
   Spark context Web UI available at http://172.30.0.12:4040
   Spark context available as 'sc' (master = local[*], app id = local-1619257343692).
   SparkSession available as 'spark'.
   >>> spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "false")
   >>> from pyspark.testing.sqlutils  import ExamplePoint
   >>> import pandas as pd
   >>> pdf = pd.DataFrame({'point': pd.Series([ExamplePoint(1, 1), ExamplePoint(2, 2)])})
   >>> df = spark.createDataFrame(pdf)
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/Users/da/github/apache/spark/python/pyspark/sql/session.py", line 653, in createDataFrame
       return super(SparkSession, self).createDataFrame(
     File "/Users/da/github/apache/spark/python/pyspark/sql/pandas/conversion.py", line 340, in createDataFrame
       return self._create_dataframe(data, schema, samplingRatio, verifySchema)
     File "/Users/da/github/apache/spark/python/pyspark/sql/session.py", line 699, in _create_dataframe
       rdd, schema = self._createFromLocal(map(prepare, data), schema)
     File "/Users/da/github/apache/spark/python/pyspark/sql/session.py", line 499, in _createFromLocal
       data = list(data)
     File "/Users/da/github/apache/spark/python/pyspark/sql/session.py", line 688, in prepare
       verify_func(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1409, in verify
       verify_value(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1390, in verify_struct
       verifier(v)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1409, in verify
       verify_value(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1304, in verify_udf
       verifier(dataType.toInternal(obj))
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1409, in verify
       verify_value(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1354, in verify_array
       element_verifier(i)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1409, in verify
       verify_value(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1403, in verify_default
       verify_acceptable_types(obj)
     File "/Users/da/github/apache/spark/python/pyspark/sql/types.py", line 1291, in verify_acceptable_types
       raise TypeError(new_msg("%s can not accept object %r in type %s"
   TypeError: element in array field point: DoubleType can not accept object 1 in type <class 'int'>
   ```
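
The check that fails in the traceback can be illustrated with a minimal stand-alone sketch that needs no Spark. It assumes, as the traceback suggests, that the point is serialized to an array whose elements must be doubles. `verify_double_array` and `serialize_point` are hypothetical names used only for illustration; they are not PySpark functions.

```python
# Stand-alone sketch (no Spark required) of a strict element type check.
# verify_double_array and serialize_point are hypothetical helpers.

def verify_double_array(values, field="point"):
    """Raise TypeError unless every element is a Python float."""
    for v in values:
        # int and bool are rejected on purpose: only float is acceptable here.
        if type(v) is not float:
            raise TypeError(
                "element in array field %s: DoubleType can not accept object %r in type %s"
                % (field, v, type(v))
            )
    return True


def serialize_point(x, y):
    """Mimic a UDT that stores a point as a two-element array."""
    return [x, y]


# Integer coordinates fail the check, as in the traceback above.
try:
    verify_double_array(serialize_point(1, 1))
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Float coordinates pass.
ok = verify_double_array(serialize_point(1.0, 1.0))
```

Under the same assumption, building the points from floats, for example `ExamplePoint(1.0, 1.0)`, would be expected to pass the check.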
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   
   ### How was this patch tested?
   Unit tests.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826274103


   @sadhen, can we separate refactoring and the UDT inferred type verification? It would make the change much easier to review.


[GitHub] [spark] SparkQA removed a comment on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826108029


   **[Test build #137889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137889/testReport)** for PR 32320 at commit [`4dc085c`](https://github.com/apache/spark/commit/4dc085cb7fc4f01948b3450efdd0713bf971cc0c).


[GitHub] [spark] SparkQA commented on pull request #32320: [SPARK-35211][PYSPARK] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826070963


   **[Test build #137882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137882/testReport)** for PR 32320 at commit [`bea87a5`](https://github.com/apache/spark/commit/bea87a5eb4ea368ff03718093b21a4974a373ab8).


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826116156


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137889/
   


[GitHub] [spark] AmplabJenkins commented on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826101750


   Can one of the admins verify this patch?


[GitHub] [spark] sadhen commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
sadhen commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826280920


   This PR will be rebased on master when https://github.com/apache/spark/pull/32332 is merged.


[GitHub] [spark] SparkQA commented on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826108029


   **[Test build #137889 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137889/testReport)** for PR 32320 at commit [`4dc085c`](https://github.com/apache/spark/commit/4dc085cb7fc4f01948b3450efdd0713bf971cc0c).


[GitHub] [spark] darcy-shen closed pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
darcy-shen closed pull request #32320:
URL: https://github.com/apache/spark/pull/32320


   


[GitHub] [spark] AmplabJenkins commented on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826084707


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137882/
   


[GitHub] [spark] SparkQA removed a comment on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826070963


   **[Test build #137882 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137882/testReport)** for PR 32320 at commit [`bea87a5`](https://github.com/apache/spark/commit/bea87a5eb4ea368ff03718093b21a4974a373ab8).


[GitHub] [spark] SparkQA commented on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826084531


   **[Test build #137882 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137882/testReport)** for PR 32320 at commit [`bea87a5`](https://github.com/apache/spark/commit/bea87a5eb4ea368ff03718093b21a4974a373ab8).
    * This patch **fails PySpark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] sadhen commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
sadhen commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826278011


   @HyukjinKwon There is little difference between `_createFromRDD` and `_createFromLocal`. If I do the inferred type verification in a separate PR, I need to insert the following code snippet twice, once in each function:
   
   ``` python
                   verify_func = _make_type_verifier(struct) if verifySchema else lambda _: True
   
                   def verified_converter(obj):
                       verify_func(obj)
                       return converter(obj)
                   data = inner_map(verified_converter, data)
   ```
   
   That's why I did the refactoring.
   
   Let me create another PR for inferred type verification.
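
One way the duplication described above could be avoided is to build the wrapper in a single helper and pass the result to whichever map each code path uses. The following is only a sketch under that assumption: `make_verified_converter` and `require_float` are hypothetical names, not PySpark functions, and the built-in `map` stands in for the `inner_map` of the snippet.

```python
# Sketch only: make_verified_converter and require_float are hypothetical names.

def make_verified_converter(verify_func, converter):
    """Return a function that verifies obj first, then converts it."""
    def verified_converter(obj):
        verify_func(obj)        # raises on a bad value; its return value is ignored
        return converter(obj)
    return verified_converter


def require_float(obj):
    """Toy verifier: accept floats only."""
    if not isinstance(obj, float):
        raise TypeError("expected float, got %r" % (obj,))


verify_schema = True
# Same shape as the snippet above: a no-op verifier when verification is off.
verify_func = require_float if verify_schema else (lambda _: True)
convert = make_verified_converter(verify_func, lambda obj: (obj,))

# The built-in map stands in for the map used by either code path.
rows = list(map(convert, [1.0, 2.0]))
```

Both call sites would then share one definition of the verify-then-convert step instead of repeating it.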


[GitHub] [spark] SparkQA commented on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826110859


   **[Test build #137889 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/137889/testReport)** for PR 32320 at commit [`4dc085c`](https://github.com/apache/spark/commit/4dc085cb7fc4f01948b3450efdd0713bf971cc0c).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826101750


   Can one of the admins verify this patch?


[GitHub] [spark] AmplabJenkins commented on pull request #32320: [SPARK-35211][PYSPARK] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826071081


   Can one of the admins verify this patch?


[GitHub] [spark] AmplabJenkins commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826116156


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137889/
   


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826071081


   Can one of the admins verify this patch?


[GitHub] [spark] AmplabJenkins removed a comment on pull request #32320: [SPARK-35211][PYTHON] _create_dataframe: infer schema earlier and do type check

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826084707


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/137882/
   


[GitHub] [spark] sadhen commented on pull request #32320: [SPARK-35211][PYTHON] verify inferred schema for _create_dataframe

Posted by GitBox <gi...@apache.org>.
sadhen commented on pull request #32320:
URL: https://github.com/apache/spark/pull/32320#issuecomment-826280702


   A PR without the refactoring is ready: https://github.com/apache/spark/pull/32332

