Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/09/21 11:42:12 UTC

[GitHub] [spark] hvanhovell commented on pull request #29810: [SPARK-32940][SQL] Collect, first and last should be deterministic aggregate functions

hvanhovell commented on pull request #29810:
URL: https://github.com/apache/spark/pull/29810#issuecomment-696061369


   Maybe I am missing something here. AFAIK the problem with the First/Last/CollectList methods is that we can't control the order in which partial results are merged. That order depends on how the shuffle fetches results, which is not deterministic.
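
   To make the concern concrete, here is a minimal plain-Python sketch (not Spark code) of merging partial aggregation buffers in whatever order they happen to arrive. The `partials` data and the `merge_*` helpers are hypothetical names made up for illustration; they only mimic the general shape of a partial-aggregate merge and do not reflect Spark's actual implementation.

```python
from itertools import permutations

# Hypothetical partial results for one grouping key: each inner list is what
# one map-side task collected locally, in its own local row order.
partials = [["a", "b"], ["c"], ["d", "e"]]


def merge_collect_list(buffers):
    """Concatenate partial buffers in the order they arrive."""
    merged = []
    for buf in buffers:
        merged.extend(buf)
    return merged


def merge_first(buffers):
    """The first value of the earliest-arriving non-empty buffer wins."""
    for buf in buffers:
        if buf:
            return buf[0]
    return None


def merge_last(buffers):
    """The last value of the latest-arriving non-empty buffer wins."""
    result = None
    for buf in buffers:
        if buf:
            result = buf[-1]
    return result


# Treat every arrival order as a possible outcome of the shuffle fetch
# (an assumption of this sketch) and collect the distinct results.
arrival_orders = list(permutations(partials))
collect_results = {tuple(merge_collect_list(order)) for order in arrival_orders}
first_results = {merge_first(order) for order in arrival_orders}
last_results = {merge_last(order) for order in arrival_orders}

print(len(collect_results), sorted(first_results), sorted(last_results))
```

   In this sketch the same input yields several different merged lists and several different first/last values, depending only on the arrival order of the partial buffers.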

