Posted to notifications@kyuubi.apache.org by GitBox <gi...@apache.org> on 2022/03/27 08:47:18 UTC

[GitHub] [incubator-kyuubi] ulysses-you commented on pull request #2225: Support to set result max rows for spark engine

ulysses-you commented on pull request #2225:
URL: https://github.com/apache/incubator-kyuubi/pull/2225#issuecomment-1079884217


   > Can ForcedMaxOutputRowsRule be replaced by this one
   
   They differ on the physical side. `dataset.take` behaves like incremental mode: it runs the job partition by partition, moving on to further partitions only if the partitions already scanned have not produced enough rows to satisfy the limit. `dataset.limit.collect` runs all partitions to apply a local limit, then applies the global limit in a single partition. So in general `dataset.limit.collect` is more efficient.
   
   And this PR is useful: with incremental mode the limit is controlled on the client side, whereas with `dataset.take` we can control it on the engine side.
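
The two execution strategies contrasted above can be illustrated with a small toy model. This is only a sketch and does not use Spark: partitions are plain Python lists, the function names `incremental_take` and `limit_then_collect` are hypothetical, and the batch growth factor of 4 is an assumption made for the illustration. It shows how many partitions each strategy touches, not how Spark implements either one.

```python
from typing import List, Tuple

Partition = List[int]


def incremental_take(
    partitions: List[Partition], n: int, scale_up: int = 4
) -> Tuple[List[int], int]:
    """Toy model of the `take` style: scan partitions in growing batches
    and stop as soon as n rows have been gathered.

    Returns the rows and the number of partitions scanned.
    `scale_up` is an assumed growth factor for this sketch.
    """
    rows: List[int] = []
    scanned = 0
    batch = 1  # the first "job" looks at a single partition
    while len(rows) < n and scanned < len(partitions):
        # One "job": every partition in the batch is scanned.
        for part in partitions[scanned:scanned + batch]:
            rows.extend(part[: n - len(rows)])
        scanned = min(scanned + batch, len(partitions))
        batch *= scale_up  # try more partitions next time
    return rows, scanned


def limit_then_collect(
    partitions: List[Partition], n: int
) -> Tuple[List[int], int]:
    """Toy model of the `limit.collect` style: apply a local limit on
    every partition, then a global limit over the merged result.

    Returns the rows and the number of partitions scanned (always all).
    """
    local = [part[:n] for part in partitions]          # local limit
    merged = [row for part in local for row in part]   # gather in one place
    return merged[:n], len(partitions)                 # global limit


if __name__ == "__main__":
    # Eight partitions with two rows each: [1, 2], [3, 4], ..., [15, 16]
    parts = [[2 * i + 1, 2 * i + 2] for i in range(8)]
    print(incremental_take(parts, 3))
    print(limit_then_collect(parts, 3))
```

The sketch counts only the partitions scanned. It does not model job-launch overhead or parallelism: the incremental style may need several sequential rounds, while the local-then-global style does all its work in one pass over every partition.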


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org