Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/21 22:19:29 UTC

[GitHub] [spark] xinrong-databricks opened a new pull request #34063: [3.2][SPARK-36771][PYTHON] Fix pop of Categorical Series

xinrong-databricks opened a new pull request #34063:
URL: https://github.com/apache/spark/pull/34063


   ### What changes were proposed in this pull request?
   Fix `pop` of Categorical Series to be consistent with the latest pandas (1.3.2) behavior.
   
   ### Why are the changes needed?
   As reported in https://github.com/databricks/koalas/issues/2198, pandas API on Spark behaves differently from pandas when `pop` is called on a Categorical Series.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. `pop` on a Categorical Series now returns the popped value itself (e.g. `'a'`) rather than an integer (e.g. `0`), as shown below.
   
   #### From
   ```py
   >>> psser = ps.Series(["a", "b", "c", "a"], dtype="category")
   >>> psser
   0    a
   1    b
   2    c
   3    a
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   >>> psser.pop(0)
   0
   >>> psser
   1    b
   2    c
   3    a
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   >>> psser.pop(3)
   0
   >>> psser
   1    b
   2    c
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   ```
   
   #### To
   ```py
   >>> psser = ps.Series(["a", "b", "c", "a"], dtype="category")
   >>> psser
   0    a
   1    b
   2    c
   3    a
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   >>> psser.pop(0)
   'a'
   >>> psser
   1    b
   2    c
   3    a
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   >>> psser.pop(3)
   'a'
   >>> psser
   1    b
   2    c
   dtype: category
   Categories (3, object): ['a', 'b', 'c']
   ```
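
The `0` returned in the "From" output is consistent with `pop` handing back an integer code for `'a'` instead of the label, although the thread itself does not state the root cause. The following is a minimal pure-Python sketch of that distinction, assuming a codes-plus-categories layout. `ToyCategorical`, `pop_code`, and `pop` are hypothetical names used only for illustration; this is not the pandas-on-Spark implementation.

```python
# Illustrative only: a toy categorical container, NOT the pandas-on-Spark code.
class ToyCategorical:
    def __init__(self, values):
        # Categories are the sorted unique labels; each stored code is an
        # index into that list, keyed by the element's position label.
        self.categories = sorted(set(values))
        self.codes = {i: self.categories.index(v) for i, v in enumerate(values)}

    def pop_code(self, label):
        """Resembles the 'From' output: return the raw integer code."""
        return self.codes.pop(label)

    def pop(self, label):
        """Resembles the 'To' output: map the code back to its category."""
        return self.categories[self.codes.pop(label)]


buggy = ToyCategorical(["a", "b", "c", "a"])
fixed = ToyCategorical(["a", "b", "c", "a"])

print(buggy.pop_code(0), buggy.pop_code(3))  # 0 0
print(fixed.pop(0), fixed.pop(3))            # a a
print(sorted(fixed.codes))                   # [1, 2]
print(fixed.categories)                      # ['a', 'b', 'c']
```

As in the examples above, popping removes the element but leaves the list of categories unchanged.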
   
   ### How was this patch tested?
   Unit tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix `pop` of Categorical Series

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924473074


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47998/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix `pop` of Categorical Series

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924473074


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47998/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924449373


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143487/
   




[GitHub] [spark] ueshin commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix `pop` of Categorical Series

Posted by GitBox <gi...@apache.org>.
ueshin commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924526600


   Thanks! Merging to branch-3.2.




[GitHub] [spark] SparkQA commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924434137


   **[Test build #143487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143487/testReport)** for PR 34063 at commit [`e9d11ff`](https://github.com/apache/spark/commit/e9d11ff4bd76efe838c83b29fc2ec64b46478e25).




[GitHub] [spark] ueshin closed pull request #34063: [SPARK-36771][PYTHON][3.2] Fix `pop` of Categorical Series

Posted by GitBox <gi...@apache.org>.
ueshin closed pull request #34063:
URL: https://github.com/apache/spark/pull/34063


   




[GitHub] [spark] SparkQA commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924446738


   **[Test build #143487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143487/testReport)** for PR 34063 at commit [`e9d11ff`](https://github.com/apache/spark/commit/e9d11ff4bd76efe838c83b29fc2ec64b46478e25).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] xinrong-databricks commented on pull request #34063: [3.2][SPARK-36771][PYTHON] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
xinrong-databricks commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924430388


   CC @ueshin @HyukjinKwon @itholic 




[GitHub] [spark] SparkQA commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix `pop` of Categorical Series

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924465845


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47998/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924449373


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143487/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924434137


   **[Test build #143487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143487/testReport)** for PR 34063 at commit [`e9d11ff`](https://github.com/apache/spark/commit/e9d11ff4bd76efe838c83b29fc2ec64b46478e25).




[GitHub] [spark] SparkQA commented on pull request #34063: [SPARK-36771][PYTHON][3.2] Fix pop of Categorical Series

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34063:
URL: https://github.com/apache/spark/pull/34063#issuecomment-924451143


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47998/
   

