Posted to reviews@spark.apache.org by "itholic (via GitHub)" <gi...@apache.org> on 2023/09/12 03:01:07 UTC

[GitHub] [spark] itholic opened a new pull request, #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

itholic opened a new pull request, #42878:
URL: https://github.com/apache/spark/pull/42878

   ### What changes were proposed in this pull request?
   
   This PR proposes to raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.
   
   
   ### Why are the changes needed?
   
   To match the behavior of Pandas:
   ```python
   >>> pd.DataFrame({"A": ['a', 'b', 'c'], "B": ['a', 'b', 'c']}).interpolate()
   ...
   TypeError: Cannot interpolate with all object-dtype columns in the DataFrame. Try setting at least one column to a numeric dtype.
   ```
   We currently return an empty DataFrame instead of raising `TypeError`:
   ```python
   >>> ps.DataFrame({"A": ['a', 'b', 'c'], "B": ['a', 'b', 'c']}).interpolate()
   Empty DataFrame
   Columns: []
   Index: [0, 1, 2]
   ```
   
   
   ### Does this PR introduce _any_ user-facing change?
   
   Calling `DataFrame.interpolate` on a DataFrame whose columns are all object-dtype will raise `TypeError` instead of returning an empty DataFrame.
   
   ### How was this patch tested?
   Added unit tests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
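
The behavior change described in this pull request can be illustrated with a minimal, library-free sketch. This is not the code from the patch: `is_numeric_column` and `check_interpolatable` are hypothetical helpers, and columns are modeled as plain Python lists rather than pandas-on-Spark columns. Only the error message is taken from the pandas output quoted above.

```python
from typing import Dict, List, Union

Value = Union[int, float, str, None]


def is_numeric_column(values: List[Value]) -> bool:
    """Hypothetical stand-in for a numeric-dtype check.

    A column counts as numeric when it has at least one non-missing value
    and every non-missing value is an int or float (bool is excluded).
    """
    present = [v for v in values if v is not None]
    return bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )


def check_interpolatable(columns: Dict[str, List[Value]]) -> List[str]:
    """Return the names of numeric columns, or raise if there are none.

    Mirrors the proposed behavior: fail loudly with TypeError instead of
    silently producing a result with no columns.
    """
    numeric = [name for name, values in columns.items() if is_numeric_column(values)]
    if columns and not numeric:
        raise TypeError(
            "Cannot interpolate with all object-dtype columns in the DataFrame. "
            "Try setting at least one column to a numeric dtype."
        )
    return numeric
```

With this sketch, `check_interpolatable({"A": ["a", "b", "c"], "B": ["a", "b", "c"]})` raises `TypeError`, while a frame that mixes one numeric column with string columns returns only the numeric column's name.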
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] itholic commented on pull request #42878: [SPARK-45123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #42878:
URL: https://github.com/apache/spark/pull/42878#issuecomment-1716759522

   Oops.. Updated. Thanks, @ueshin !




[GitHub] [spark] zhengruifeng commented on pull request #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng commented on PR #42878:
URL: https://github.com/apache/spark/pull/42878#issuecomment-1715081286

   merged to master




[GitHub] [spark] zhengruifeng closed pull request #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

Posted by "zhengruifeng (via GitHub)" <gi...@apache.org>.
zhengruifeng closed pull request #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.
URL: https://github.com/apache/spark/pull/42878




[GitHub] [spark] ueshin commented on pull request #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

Posted by "ueshin (via GitHub)" <gi...@apache.org>.
ueshin commented on PR #42878:
URL: https://github.com/apache/spark/pull/42878#issuecomment-1716237887

   @itholic The JIRA ID in the title seems to be wrong?




[GitHub] [spark] itholic commented on pull request #42878: [SPARK-43123][PS] Raise `TypeError` for `DataFrame.interpolate` when all columns are object-dtype.

Posted by "itholic (via GitHub)" <gi...@apache.org>.
itholic commented on PR #42878:
URL: https://github.com/apache/spark/pull/42878#issuecomment-1714891903

   cc @zhengruifeng 

