Posted to issues@flink.apache.org by "Till Rohrmann (JIRA)" <ji...@apache.org> on 2015/08/20 11:18:22 UTC
[jira] [Comment Edited] (FLINK-2549) Add topK operator for DataSet
[ https://issues.apache.org/jira/browse/FLINK-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14704565#comment-14704565 ]
Till Rohrmann edited comment on FLINK-2549 at 8/20/15 9:17 AM:
---------------------------------------------------------------
I agree with [~StephanEwen]. Sorting the complete input of n elements has a complexity of O(n * log( n )), whereas keeping the k topmost elements in a priority queue takes O(n * log( k )) in the worst case. Assuming k << n, this is worth the effort.
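To illustrate the priority-queue approach, here is a minimal sketch in plain Java. It is not Flink code and not a proposed operator implementation; the class name TopK and the method topK are hypothetical, and it assumes the elements are Comparable and that "top" means largest. A min-heap bounded to k elements keeps the smallest current candidate at its head, so each input element costs at most O(log( k )).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper for illustration only; not part of Flink.
public class TopK {

    /** Returns the k largest elements of the input in descending order. */
    public static <T extends Comparable<T>> List<T> topK(Iterable<T> input, int k) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Min-heap: the smallest of the current top-k candidates sits at the head.
        PriorityQueue<T> heap = new PriorityQueue<>(k);
        for (T element : input) {
            if (heap.size() < k) {
                // Heap not yet full: always keep the element, O(log k).
                heap.add(element);
            } else if (element.compareTo(heap.peek()) > 0) {
                // Element beats the weakest candidate: evict it and insert, O(log k).
                heap.poll();
                heap.add(element);
            }
            // Otherwise the element cannot be in the top k and is dropped in O(1).
        }
        // Final ordering of the k survivors costs O(k log k).
        List<T> result = new ArrayList<>(heap);
        result.sort(Collections.reverseOrder());
        return result;
    }
}
```

For example, TopK.topK(Arrays.asList(5, 1, 9, 3, 7), 2) returns [9, 7]. The heap never holds more than k elements, so memory stays at O(k) regardless of n, which is the other reason this beats a full sort when k << n.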
was (Author: till.rohrmann):
I agree with [~StephanEwen]. Sorting the complete input with n elements has a complexity of O(n * log(n)) whereas keeping the k top most elements in a priority queue gives you in worst case O(n * log(k)). Assuming k << n, then this is worth the effort.
> Add topK operator for DataSet
> -----------------------------
>
> Key: FLINK-2549
> URL: https://issues.apache.org/jira/browse/FLINK-2549
> Project: Flink
> Issue Type: New Feature
> Components: Core, Java API, Scala API
> Reporter: Chengxiang Li
> Assignee: Chengxiang Li
> Priority: Minor
>
> topK is a common operation for users; it would be great to have it in Flink.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)