Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/10 10:30:04 UTC

[GitHub] [spark] dchvn opened a new pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

dchvn opened a new pull request #34235:
URL: https://github.com/apache/spark/pull/34235


   ### What changes were proposed in this pull request?
   Make `ps.Series.dot` raise `ValueError("matrices are not aligned")` when the indexes of the two operands are not the same.
   
   ### Why are the changes needed?
   To follow the behavior of pandas.
   
   ### Does this PR introduce _any_ user-facing change?
   Before this PR
   ```python
   >>> psdf1 = ps.Series([1, 2, 3], index=[0, 1, 2])
   >>> psdf2 = ps.Series([1, 2, 3], index=[0, 1, 3])
   >>> psdf1.dot(psdf2)
   5 
   ```
   
   After this PR
   ```python
   >>> psdf1 = ps.Series([1, 2, 3], index=[0, 1, 2])
   >>> psdf2 = ps.Series([1, 2, 3], index=[0, 1, 3])
   >>> psdf1.dot(psdf2)
   Traceback (most recent call last):                                              
     File "<stdin>", line 1, in <module>
     File "/u02/spark/python/pyspark/pandas/series.py", line 5003, in dot
       raise ValueError("matrices are not aligned")
   ValueError: matrices are not aligned
   ```
   ### How was this patch tested?
   unit test
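
   The behavior change described above can be modelled without a Spark session. This is a minimal plain-Python sketch, not the pyspark implementation: a Series is represented as a dict from index label to value, and `dot_before` and `dot_after` are hypothetical names for the old and new behavior.

```python
# Plain-Python model of the change; a "series" here is a dict {label: value}.
# dot_before and dot_after are illustrative names, not pyspark functions.


def dot_before(left, right):
    # Old behavior: labels present on only one side are silently ignored.
    return sum(left[k] * right[k] for k in left.keys() & right.keys())


def dot_after(left, right):
    # New behavior: the sorted indexes must be equal, otherwise raise.
    if sorted(left) != sorted(right):
        raise ValueError("matrices are not aligned")
    return sum(left[k] * right[k] for k in left)


s1 = {0: 1, 1: 2, 2: 3}
s2 = {0: 1, 1: 2, 3: 3}

print(dot_before(s1, s2))  # 5, labels 2 and 3 contribute nothing

try:
    dot_after(s1, s2)
except ValueError as exc:
    print(exc)  # matrices are not aligned
```

   Comparing the sorted labels mirrors the `sort_values().equals(...)` check in the diff, so two operands with the same labels in a different order still pass.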
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] dchvn commented on a change in pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34235:
URL: https://github.com/apache/spark/pull/34235#discussion_r727802685



##########
File path: python/pyspark/pandas/series.py
##########
@@ -5014,11 +4998,11 @@ def dot(self, other: Union["Series", DataFrame]) -> Union[Scalar, "Series"]:
         y   -14
         dtype: int64
         """
-        if isinstance(other, DataFrame):
-            if not same_anchor(self, other):
-                if not self.index.sort_values().equals(other.index.sort_values()):
-                    raise ValueError("matrices are not aligned")
+        if not same_anchor(self, other):
+            if not self.index.sort_values().equals(other.index.sort_values()):

Review comment:
       I would like to add another parameter, such as `check_index`:
   ```python
   def dot(self, other: Union["Series", DataFrame], check_index: bool = True) -> Union[Scalar, "Series"]:
   ```
   `True`: behave like pandas and raise on mismatched indexes
   `False`: skip the check, so mismatched labels are treated as NaN and ignored
   What do you think?
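
   The proposed flag could look roughly like the following. This is a hypothetical plain-Python sketch under the assumption that a Series can be modelled as a dict from label to value; the `check_index` name comes from the comment above, while the function body is illustrative and not the code in `series.py`.

```python
# Hypothetical sketch of the proposed parameter; series are modelled as dicts.


def dot(left, right, check_index=True):
    """Dot product of two {label: value} mappings.

    check_index=True  -> raise if the label sets differ (pandas-like).
    check_index=False -> skip the check; unmatched labels are ignored.
    """
    if check_index and sorted(left) != sorted(right):
        raise ValueError("matrices are not aligned")
    return sum(v * right[k] for k, v in left.items() if k in right)


a = {0: 1, 1: 2, 2: 3}
b = {0: 1, 1: 2, 3: 3}

print(dot(a, b, check_index=False))  # 5, the check is skipped
```

   Defaulting the flag to `True` keeps the pandas-compatible behavior unless the caller explicitly opts out of the extra comparison.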






[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947411551


   **[Test build #144451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144451/testReport)** for PR 34235 at commit [`d8eb812`](https://github.com/apache/spark/commit/d8eb812e845171f6dbe9742e5c96367aa761b9f4).




[GitHub] [spark] dchvn commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-950515078


   Thank you!




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34235:
URL: https://github.com/apache/spark/pull/34235#discussion_r727881814



##########
File path: python/pyspark/pandas/series.py
##########
@@ -5014,11 +4998,11 @@ def dot(self, other: Union["Series", DataFrame]) -> Union[Scalar, "Series"]:
         y   -14
         dtype: int64
         """
-        if isinstance(other, DataFrame):
-            if not same_anchor(self, other):
-                if not self.index.sort_values().equals(other.index.sort_values()):
-                    raise ValueError("matrices are not aligned")
+        if not same_anchor(self, other):
+            if not self.index.sort_values().equals(other.index.sort_values()):

Review comment:
       Yeah, something like that. Or, we can have a configuration at https://github.com/apache/spark/blob/master/python/pyspark/pandas/config.py#L112-L140, and fix all similar places together.






[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947434491


   **[Test build #144451 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144451/testReport)** for PR 34235 at commit [`d8eb812`](https://github.com/apache/spark/commit/d8eb812e845171f6dbe9742e5c96367aa761b9f4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947411551


   **[Test build #144451 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144451/testReport)** for PR 34235 at commit [`d8eb812`](https://github.com/apache/spark/commit/d8eb812e845171f6dbe9742e5c96367aa761b9f4).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947452148


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144451/
   




[GitHub] [spark] dchvn commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947483404


   CC @HyukjinKwon @itholic FYI




[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947455778


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48924/
   




[GitHub] [spark] HyukjinKwon commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-950496167


   Merged to master.






[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939461380


   **[Test build #144062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144062/testReport)** for PR 34235 at commit [`5cb8918`](https://github.com/apache/spark/commit/5cb89184d5adb7439869d0f454057411898c2093).




[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947505083


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48924/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947452148


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144451/
   




[GitHub] [spark] AmplabJenkins commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939469831


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144062/
   




[GitHub] [spark] dchvn commented on a change in pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
dchvn commented on a change in pull request #34235:
URL: https://github.com/apache/spark/pull/34235#discussion_r728631512



##########
File path: python/pyspark/pandas/series.py
##########
@@ -5014,11 +4998,11 @@ def dot(self, other: Union["Series", DataFrame]) -> Union[Scalar, "Series"]:
         y   -14
         dtype: int64
         """
-        if isinstance(other, DataFrame):
-            if not same_anchor(self, other):
-                if not self.index.sort_values().equals(other.index.sort_values()):
-                    raise ValueError("matrices are not aligned")
+        if not same_anchor(self, other):
+            if not self.index.sort_values().equals(other.index.sort_values()):

Review comment:
       @HyukjinKwon Could you take a look at https://github.com/apache/spark/pull/34281 when you have time? Thanks!






[GitHub] [spark] dchvn commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
dchvn commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-950494670


   Ping @ueshin @itholic @xinrong-databricks, could you take a look? Many thanks!




[GitHub] [spark] HyukjinKwon commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939592768


   cc @itholic 




[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939477154


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48540/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939469831


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144062/
   




[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939468712


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48540/
   




[GitHub] [spark] SparkQA removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939461380


   **[Test build #144062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144062/testReport)** for PR 34235 at commit [`5cb8918`](https://github.com/apache/spark/commit/5cb89184d5adb7439869d0f454057411898c2093).




[GitHub] [spark] HyukjinKwon closed pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #34235:
URL: https://github.com/apache/spark/pull/34235


   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939478420


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48540/
   






[GitHub] [spark] AmplabJenkins commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939478420


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48540/
   




[GitHub] [spark] SparkQA commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-939465140


   **[Test build #144062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144062/testReport)** for PR 34235 at commit [`5cb8918`](https://github.com/apache/spark/commit/5cb89184d5adb7439869d0f454057411898c2093).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] HyukjinKwon commented on a change in pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34235:
URL: https://github.com/apache/spark/pull/34235#discussion_r726734654



##########
File path: python/pyspark/pandas/series.py
##########
@@ -5014,11 +4998,11 @@ def dot(self, other: Union["Series", DataFrame]) -> Union[Scalar, "Series"]:
         y   -14
         dtype: int64
         """
-        if isinstance(other, DataFrame):
-            if not same_anchor(self, other):
-                if not self.index.sort_values().equals(other.index.sort_values()):
-                    raise ValueError("matrices are not aligned")
+        if not same_anchor(self, other):
+            if not self.index.sort_values().equals(other.index.sort_values()):

Review comment:
       Can we add another parameter to enable/disable this check? This is too expensive an operation: it sorts both sides and then aggregates, which introduces 3 shuffles.
   
   Or maybe we should introduce a configuration to enable and disable all such instances.
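
A minimal sketch of what the two suggestions above could look like, written in plain Python so it runs without a Spark session. Everything here is an assumption for illustration: plain dicts stand in for the two Series, and the `check_alignment` parameter and the `OPTIONS` switch are hypothetical names, not part of pyspark.pandas or of this patch.

```python
from typing import Dict, Hashable, Optional

# Hypothetical global switch, standing in for a session-wide configuration.
OPTIONS = {"check_index_alignment": True}


def dot(
    left: Dict[Hashable, float],
    right: Dict[Hashable, float],
    check_alignment: Optional[bool] = None,
) -> float:
    """Dot product of two index -> value mappings (stand-ins for Series)."""
    # The per-call parameter wins; otherwise fall back to the global switch.
    if check_alignment is None:
        check_alignment = OPTIONS["check_index_alignment"]
    # Same comparison as in the diff: sort both indexes, then compare them.
    if check_alignment and sorted(left) != sorted(right):
        raise ValueError("matrices are not aligned")
    # Without the check, only the labels present on both sides contribute.
    return sum(value * right[label] for label, value in left.items() if label in right)


s1 = {0: 1, 1: 2, 2: 3}
s2 = {0: 1, 1: 2, 3: 3}
print(dot(s1, s2, check_alignment=False))  # 5
```

With the check skipped, the call returns 5, the same result the PR description shows for the behaviour before the change; with the default settings it raises `ValueError: matrices are not aligned`. The trade-off is the one raised in the comment: callers who already know their indexes match can avoid paying for the comparison.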






[GitHub] [spark] HyukjinKwon commented on a change in pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #34235:
URL: https://github.com/apache/spark/pull/34235#discussion_r726735028



##########
File path: python/pyspark/pandas/series.py
##########
@@ -5014,11 +4998,11 @@ def dot(self, other: Union["Series", DataFrame]) -> Union[Scalar, "Series"]:
         y   -14
         dtype: int64
         """
-        if isinstance(other, DataFrame):
-            if not same_anchor(self, other):
-                if not self.index.sort_values().equals(other.index.sort_values()):
-                    raise ValueError("matrices are not aligned")
+        if not same_anchor(self, other):
+            if not self.index.sort_values().equals(other.index.sort_values()):

Review comment:
       WDYT @xinrong-databricks @ueshin @itholic 






[GitHub] [spark] AmplabJenkins commented on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947505120


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48924/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34235: [SPARK-36968][PYTHON] ps.Series.dot raise "matrices are not aligned" if index is not same

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34235:
URL: https://github.com/apache/spark/pull/34235#issuecomment-947505120


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48924/
   

