Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/15 17:18:12 UTC

[GitHub] [spark] cloud-fan opened a new pull request #29125: [SPARK-32018][SQL] UnsafeRow.setDecimal should set null with overflowed value

cloud-fan opened a new pull request #29125:
URL: https://github.com/apache/spark/pull/29125


   partially backport https://github.com/apache/spark/pull/29026


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659075914


   **[Test build #125916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125916/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).


[GitHub] [spark] skambha commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
skambha commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-669396932


   
   > How about this: we force to enable ANSI for decimal sum, so that the behavior is the same without fixing the UnsafeRow bug? It's not an ideal fix but should be safer to backport. @skambha what do you think? Can you help to do it?
   
   I'm not sure I understand correctly, so can you clarify? The reason I ask is that the v3.0 Sum currently has an ANSI mode in the evaluationExpression, and forcing that to be true will not give us much. We will still run into the problems I mentioned a few comments earlier.
   
   --
   @cloud-fan, just to confirm that we are in agreement: the first step is to revert this backport. Can you confirm, please?
   Yes, I can submit a PR to do this UnsafeRow revert for v3.0.x and v2.x.x.
   


[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659478850


   **[Test build #125962 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125962/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659075914


   **[Test build #125916 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125916/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659119577


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125906/
   Test FAILed.


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658960503






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659480246






[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659320185






[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659010480


   **[Test build #125906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125906/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659527858


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125965/
   Test FAILed.


[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659526832


   **[Test build #125965 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125965/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).
    * This patch **fails PySpark pip packaging tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659142626


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125916/
   Test FAILed.


[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659527846






[GitHub] [spark] dongjoon-hyun closed pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun closed pull request #29125:
URL: https://github.com/apache/spark/pull/29125


   


[GitHub] [spark] dongjoon-hyun commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659583255


   Could you make a backporting PR on branch-2.4 since SPARK-32018 is reported on 2.x too? This partial patch looks safe to have.


[GitHub] [spark] dongjoon-hyun commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658958999


   Thank you for pinging me, @cloud-fan.


[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659118796


   **[Test build #125906 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125906/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659320372


   **[Test build #125962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125962/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).


[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-670321237


   OK let me clarify a few things:
   1. I agree with you that making Spark more likely to return incorrect results is not acceptable.
   2. I hope you understand that having a bug in the very fundamental `UnsafeRow` is not acceptable either.
   
   I'll ask someone to implement the ANSI behavior for decimal sum in 3.0 and 2.4, so that it fails instead of returning wrong results.


[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659480246






[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658960503






[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659142621






[GitHub] [spark] dongjoon-hyun commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658959140


   Retest this please.


[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658993984






[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658993984






[GitHub] [spark] dongjoon-hyun commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658991665


   Retest this please.


[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-668980063


   @skambha you will still hit the sum bug when you disable whole-stage codegen (or when Spark falls back from it because the generated code exceeds 64 KB), right?
   
   We are not introducing a new correctness bug. It's an existing bug, and the backport makes it more visible.
   
   We've added a mechanism in the master branch to check the streaming state store backward compatibility. If we want to backport the actual fix, we need to backport this mechanism as well. I think that's too many things to backport.
   
   How about this: we force-enable ANSI behavior for decimal sum, so that the behavior is the same without fixing the UnsafeRow bug? It's not an ideal fix but should be safer to backport. @skambha what do you think? Can you help to do it?


[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-667784890


   @skambha the `sum` shouldn't fail without ANSI mode; this PR fixes that.
   
   It's indeed a bug that we can write an overflowed decimal to UnsafeRow but can't read it back. The `sum` is also buggy, but we can't backport the fix due to streaming compatibility reasons.
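
The write/read mismatch described in the comment above can be illustrated with a minimal sketch. This is not Spark's `UnsafeRow`: the class `DecimalSlotSketch` and its methods are hypothetical, and the overflow check (comparing `BigDecimal.precision()` against a declared precision) is a simplification chosen for illustration.

```java
import java.math.BigDecimal;

// Hypothetical one-field "row" used only for illustration; not Spark's UnsafeRow.
final class DecimalSlotSketch {
    private BigDecimal stored;

    // Unchecked write: stores the value even if it has too many digits.
    void setUnchecked(BigDecimal value) {
        stored = value;
    }

    // Checked write: stores null when the value needs more digits than allowed.
    void setChecked(BigDecimal value, int precision) {
        boolean overflowed = value != null && value.precision() > precision;
        stored = overflowed ? null : value;
    }

    // Read: rejects a stored value that does not fit the declared precision.
    BigDecimal get(int precision) {
        if (stored != null && stored.precision() > precision) {
            throw new ArithmeticException(
                "precision " + stored.precision() + " exceeds " + precision);
        }
        return stored;
    }

    public static void main(String[] args) {
        DecimalSlotSketch row = new DecimalSlotSketch();

        // Checked write of a 5-digit value into a 3-digit slot: reads back as null.
        row.setChecked(new BigDecimal("12345"), 3);
        System.out.println("checked write, then read: " + row.get(3));

        // Unchecked write of the same value: the write succeeds, the read fails.
        row.setUnchecked(new BigDecimal("12345"));
        try {
            row.get(3);
        } catch (ArithmeticException e) {
            System.out.println("unchecked write, then read failed: " + e.getMessage());
        }
    }
}
```

In this sketch the checked write makes the overflow visible as null at write time, while the unchecked write defers the failure to whoever reads the value.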


[GitHub] [spark] skambha commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
skambha commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-668764003


   - The sum operation is very common and heavily used.
   - Silently returning incorrect results is serious because a user has no way of knowing that their query returned incorrect results. Earlier, the user would get an error and could increase the precision and rerun the query. Now they will not even know the results are incorrect unless they verify them manually, which may not be possible for large data. With this backport we are exposing more cases that return incorrect results.
   
   The [Spark website](https://spark.apache.org/contributing.html) states: "Note that, **data correctness/data loss bugs are very serious**.  Make sure the corresponding bug report JIRA ticket is labeled as correctness or data-loss. If the bug report doesn’t get enough attention, please send an email to dev@spark.apache.org, to draw more attentions."
   
   Incorrect results and data correctness issues are very serious.
      
   As already discussed, yes, the UnsafeRow change has far-reaching impact and unsafe side effects. In my opinion we should not backport just this change to the v3.0.x and v2.4.x lines, especially in a point release, and expose wrong results to users for a common operation like sum.
   
   So, my vote would be to not have this UnsafeRow-only change in v3.0.x and v2.x.x.
   
   --
   @cloud-fan Regarding your question on backporting the sum change, I think the issue that blocked it from going in was the streaming backward compatibility impact. I am not that familiar with the streaming backward compatibility implications.


[GitHub] [spark] skambha commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
skambha commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-667336119


   @cloud-fan,  I noticed this back port only now.  This change is more far reaching in its impact as previous callers of UnsafeRow.getDecimal that would have thrown an exception  earlier would now return null.   
   
   As an e.g,  a caller like aggregate sum will need changes to account for this.   Earlier cases where sum would throw error for overflow will **now return incorrect results**.  The new tests that were added for sum overflow cases in the DataFrameSuite in master can be used to see repro. 
   
   Since this pr is closed,  I will add a comment to the JIRA as well.  
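   
   A toy model of why the sum in the repro from this thread comes out as 2 * 10^19 may help here. This is plain Scala, not Spark code: `Slot`, `store`, `update` and `partialSum` are hypothetical names, and the idea that the sum update treats a null buffer as zero is an assumption of this sketch rather than something stated in this thread. The value 10^19 and the two partitions (1 row and 11 rows) are taken from the repro.
   
   ```scala
   import java.math.BigDecimal
   
   object SumOverflowSketch {
     // decimal(38, 18) leaves 20 integer digits, so 10^20 and above do not fit.
     val Limit: BigDecimal = BigDecimal.TEN.pow(20)
   
     // Hypothetical stand-in for a buffer slot: None plays the role of SQL NULL.
     type Slot = Option[BigDecimal]
   
     // Models "store null when the value overflows".
     def store(v: BigDecimal): Slot =
       if (v.abs().compareTo(Limit) >= 0) None else Some(v)
   
     // Assumed update rule: a null buffer is read as zero ("no input yet").
     def update(buf: Slot, in: BigDecimal): Slot =
       store(buf.getOrElse(BigDecimal.ZERO).add(in))
   
     // Partial aggregation over the rows of one partition.
     def partialSum(rows: Seq[BigDecimal]): Slot =
       rows.foldLeft(Option.empty[BigDecimal])(update)
   
     def main(args: Array[String]): Unit = {
       val v = BigDecimal.TEN.pow(19)
       val p1 = partialSum(Seq.fill(1)(v))   // partition with 1 row
       val p2 = partialSum(Seq.fill(11)(v))  // partition with 11 rows
       // Merge the two partial sums with the same null-as-zero rule.
       val merged = update(p1, p2.getOrElse(BigDecimal.ZERO))
       println(merged)
     }
   }
   ```
   
   Under these assumptions the 11-row partition overflows on its tenth row and silently restarts on the eleventh, so the merged value is 20000000000000000000 instead of an error or null.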
   




[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659317064


   retest this please




[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659252029


   > No need to fix Sum.scala?
   
   That sum fix is in master only. I don't know if we can backport it as it breaks the streaming state store.




[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659142220


   **[Test build #125916 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125916/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659010480


   **[Test build #125906 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125906/testReport)** for PR 29125 at commit [`4518513`](https://github.com/apache/spark/commit/451851373f6eb2db8adffe43669b51be7a30e8c1).




[GitHub] [spark] dongjoon-hyun commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-670831201


   cc @ScrapCodes 




[GitHub] [spark] skambha commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
skambha commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-670210823


   IIUC, the solutions you mention were also discussed earlier and were not accepted by you. If you do not want to revert this backport, then I hope you agree it is critical to fix it so users do not run into this incorrectness issue. Please feel free to go ahead with the option you prefer.
   
   I have expressed the issues; I will summarize them below and also put them in the JIRA.
   
   The important issue is that we should not return incorrect results. In general, it is not good practice to backport a change to a stable branch that causes more queries to return incorrect results.
   
   Just to reiterate:
   
   1. This PR, which backports the UnsafeRow fix, causes queries to return incorrect results on the v2.4.x and v3.0.x lines. The change by itself has unsafe side effects.
   2. It does not matter whether whole-stage codegen is on or off, or ANSI mode is on or off: more queries return incorrect results.
   ``` 
   
   scala> val decStr = "1" + "0" * 19
   decStr: String = 10000000000000000000
   
   scala> val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
   d3: org.apache.spark.sql.Dataset[Long] = [id: bigint]
   
   scala>  val d5 = d3.select(expr(s"cast('$decStr' as decimal (38, 18)) as d"),lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
   d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]
   
   scala> d5.show(false)   <----  INCORRECT RESULTS RETURNED
   +---------------------------------------+
   |sumd                                   |
   +---------------------------------------+
   |20000000000000000000.000000000000000000|
   +---------------------------------------+
   
   ```
   3. Incorrect results are very serious, and it is not good for Spark users to run into them for common operations like sum.
      




[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658896338








[GitHub] [spark] viirya commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
viirya commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659008506


   Jenkins does not seem to be working on this. But GitHub Actions passed.




[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-669712680


   I don't agree to revert the UnsafeRow bug fix. As I said, `UnsafeRow` is very fundamental and we can't tolerate any bugs in it.
   
   I agree that the sum decimal bug becomes more visible with the `UnsafeRow` fix, and I see 2 options (reverting the `UnsafeRow` fix is not an option to me):
   1. Backport the actual fix. This brings backward compatibility issues for the streaming state store.
   2. Fail if overflow happens, regardless of the ansi flag. This is not ideal, but at least it's better than 3.0.0/2.x, where we fail on overflow when whole-stage-codegen is on and return a wrong answer otherwise.
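   
   Option 2 can be sketched roughly as follows. This is a toy model in plain Scala, not the proposed Spark patch: `checked` and `sum` are hypothetical helpers, and the 10^20 bound assumes a decimal(38, 18) result type as in the repro from this thread.
   
   ```scala
   import java.math.BigDecimal
   
   object FailOnOverflowSketch {
     // decimal(38, 18) leaves 20 integer digits, so 10^20 and above do not fit.
     val Limit: BigDecimal = BigDecimal.TEN.pow(20)
   
     // Hypothetical guard: throw instead of storing null or a wrapped value.
     def checked(v: BigDecimal): BigDecimal =
       if (v.abs().compareTo(Limit) >= 0)
         throw new ArithmeticException(s"Decimal sum overflow: $v does not fit decimal(38, 18)")
       else v
   
     // Every intermediate sum is checked, with no dependence on any ANSI setting.
     def sum(rows: Seq[BigDecimal]): BigDecimal =
       rows.foldLeft(BigDecimal.ZERO)((acc, x) => checked(acc.add(x)))
   }
   ```
   
   With twelve rows of 10^19 this sketch throws on the tenth addition, so the caller sees an error rather than a wrong total.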




[GitHub] [spark] SparkQA removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659320372


   **[Test build #125962 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125962/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659320185








[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659365406


   **[Test build #125965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125965/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).




[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659260465








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659142621


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659119570


   Merged build finished. Test FAILed.




[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-668364306


   It's "one bug hides another bug". I don't think the right choice is to leave the bug there. `UnsafeRow` is a very fundamental component and I think it's better to fix all the bugs we know. Aggregate is not the only place that uses `UnsafeRow`. It can even be used by external data sources.
   
   If we think the decimal sum overflow is serious enough, we should consider backporting the actual fix, and evaluate the streaming backward compatibility impact.
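   
   To illustrate why a proper sum fix can touch stored aggregation state, here is a sketch of one way an aggregate could tell "no input yet" apart from "overflowed": carry an extra flag in the buffer. This is a toy model in plain Scala with hypothetical names (`Buf`, `isEmpty`, `overflowed`); whether it matches the fix in master is not stated in this thread. The point is only that adding a field changes the buffer layout, which is the kind of change that could matter for state already written by a streaming query.
   
   ```scala
   import java.math.BigDecimal
   
   object FlaggedBufferSketch {
     // decimal(38, 18) leaves 20 integer digits, so 10^20 and above do not fit.
     val Limit: BigDecimal = BigDecimal.TEN.pow(20)
   
     // Hypothetical buffer: `sum` is None once it has overflowed (SQL NULL),
     // `isEmpty` records whether any row has been seen at all.
     final case class Buf(sum: Option[BigDecimal], isEmpty: Boolean)
   
     val Empty: Buf = Buf(Some(BigDecimal.ZERO), isEmpty = true)
   
     // An overflowed sum stays None; it is never reset to zero.
     def update(buf: Buf, in: BigDecimal): Buf = {
       val next = buf.sum.map(_.add(in)).filter(_.abs().compareTo(Limit) < 0)
       Buf(next, isEmpty = false)
     }
   
     // Overflow means: rows were seen, but no representable sum is left.
     def overflowed(buf: Buf): Boolean = !buf.isEmpty && buf.sum.isEmpty
   }
   ```
   
   In this sketch an overflow is sticky, so a later row cannot restart the accumulation the way a null-as-zero update would.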




[GitHub] [spark] SparkQA removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659365406


   **[Test build #125965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125965/testReport)** for PR 29125 at commit [`7268c05`](https://github.com/apache/spark/commit/7268c0554622c07c420435bec167aa7563e20ecc).




[GitHub] [spark] skambha commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
skambha commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-668179580


   @cloud-fan,
   In some overflow scenarios:
   - With just this backport, incorrect results are returned to the user.
   - Before this change, the user would see an error.
   
   The test cases in DataFrameSuite show these scenarios. Here is an example taken from there that I tried on Spark 3.0.1 with and without this change; it shows the incorrect result behavior.
   
   This backport by itself causes more scenarios to return incorrect results to the user.
   
   1) With this backport, **incorrect results:** 
   ```
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.0.1-SNAPSHOT
         /_/
   
   Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_251)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala>  val decStr = "1" + "0" * 19
   decStr: String = 10000000000000000000
   
   scala>            val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
   d3: org.apache.spark.sql.Dataset[Long] = [id: bigint] 
       
   scala>             val d5 = d3.select(expr(s"cast('$decStr' as decimal (38, 18)) as d"),
        |               lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
   d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]
      
   
   scala> d5.show(false)
   +---------------------------------------+
   |sumd                                   |
   +---------------------------------------+
   |20000000000000000000.000000000000000000|
   +---------------------------------------+
   
   ```
   
   2) With this change, **incorrect results** with ANSI mode enabled as well.
   ```
   Welcome to
         ____              __
        / __/__  ___ _____/ /__
       _\ \/ _ \/ _ `/ __/  '_/
      /___/ .__/\_,_/_/ /_/\_\   version 3.0.1-SNAPSHOT
         /_/
   
   Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_251)
   Type in expressions to have them evaluated.
   Type :help for more information.
   
   scala> spark.conf.set("spark.sql.ansi.enabled","true")
   
   scala> val decStr = "1" + "0" * 19
   decStr: String = 10000000000000000000
   
   scala> val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
   d3: org.apache.spark.sql.Dataset[Long] = [id: bigint]
   
   scala>  val d5 = d3.select(expr(s"cast('$decStr' as decimal (38, 18)) as d"),
        |               lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
   d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]
   
   scala> d5.show(false)
   +---------------------------------------+
   |sumd                                   |
   +---------------------------------------+
   |20000000000000000000.000000000000000000|
   +---------------------------------------+
   ```
   
   Without this change, the same test throws an error in both cases (ANSI enabled or not).
   ```
   scala> val decStr = "1" + "0" * 19
   decStr: String = 10000000000000000000
   
   scala> val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
   d3: org.apache.spark.sql.Dataset[Long] = [id: bigint]
   
   scala> val d5 = d3.select(expr(s"cast('$decStr' as decimal (38, 18)) as d"),
        |          lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
   d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]
   
   scala> d5.show(false)
   20/08/03 11:15:05 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
   java.lang.ArithmeticException: Decimal precision 39 exceeds max precision 38
           at org.apache.spark.sql.types.Decimal.set(Decimal.scala:122)
           at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:574)
           at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
           at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:393)
   ```
   
   Without this change, the ANSI-enabled scenario also throws an error.
   ```
   scala> spark.conf.set("spark.sql.ansi.enabled","true")
   
   scala> val decStr = "1" + "0" * 19
   decStr: String = 10000000000000000000
   
   scala>  val d3 = spark.range(0, 1, 1, 1).union(spark.range(0, 11, 1, 1))
   d3: org.apache.spark.sql.Dataset[Long] = [id: bigint]
   
   scala> val d5 = d3.select(expr(s"cast('$decStr' as decimal (38, 18)) as d"),
        |        lit(1).as("key")).groupBy("key").agg(sum($"d").alias("sumd")).select($"sumd")
   d5: org.apache.spark.sql.DataFrame = [sumd: decimal(38,18)]
   
   scala> d5.show(false)
   20/08/03 11:18:08 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
   java.lang.ArithmeticException: Decimal precision 39 exceeds max precision 38
           at org.apache.spark.sql.types.Decimal.set(Decimal.scala:122)
           at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:574)
           at org.apache.spark.sql.types.Decimal.apply(Decimal.scala)
           at org.apache.spark.sql.catalyst.expressions.UnsafeRow.getDecimal(UnsafeRow.java:393)
           at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage3.agg_doAggregate_sum_0$(generated.java:41)
   
   
   ```




[GitHub] [spark] dongjoon-hyun edited a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-670831201


   cc @ScrapCodes since he is a release manager for Apache Spark 2.4.7.




[GitHub] [spark] viirya edited a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
viirya edited a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659008506


   Jenkins does not seem to be working on this. <del>But GitHub Actions passed.</del>
   
   Oh, this is for 3.0 and GitHub Actions is for master only.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659260465








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659527846


   Merged build finished. Test FAILed.




[GitHub] [spark] AmplabJenkins commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-659119570








[GitHub] [spark] AmplabJenkins removed a comment on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658896338








[GitHub] [spark] cloud-fan commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #29125:
URL: https://github.com/apache/spark/pull/29125#issuecomment-658894290


   cc @dongjoon-hyun @viirya 

