Posted to issues@flink.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2015/12/03 03:57:11 UTC

[jira] [Commented] (FLINK-2549) Add topK operator for DataSet

    [ https://issues.apache.org/jira/browse/FLINK-2549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037155#comment-15037155 ] 

ASF GitHub Bot commented on FLINK-2549:
---------------------------------------

Github user ChengXiangLi commented on the pull request:

    https://github.com/apache/flink/pull/1161#issuecomment-161501441
  
    Hi @StephanEwen, is there any progress on the managed memory allocation abstractions for UDFs? This matters beyond the TopK operator; I think it is also very important for users and DSLs that want to build more robust and efficient applications. For example, in Table API queries the data schema is known at each phase of processing, so we do not need to create real `Row` objects; we can store the binary data in self-managed memory and use offsets to read `Row` fields. All intermediate data is then stored as binary in self-managed memory, with no need to create lots of `Row` objects and their field objects, which should be more robust, more memory-efficient, and faster.
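
To make the offset-based idea concrete, here is a minimal sketch, not Flink code. It assumes a hypothetical fixed-width schema (int id, long count, double score) and uses a plain `java.nio.ByteBuffer` as a stand-in for managed memory; the class name `BinaryRowSketch` and its methods are made up for illustration.

```java
import java.nio.ByteBuffer;

/**
 * Hypothetical illustration: rows with the fixed schema
 * (int id, long count, double score) are kept as raw bytes in one buffer,
 * and fields are read by offset instead of materializing a Row object.
 */
final class BinaryRowSketch {
    // Field offsets within one row, derived from the (assumed) fixed schema.
    static final int ID_OFFSET = 0;
    static final int COUNT_OFFSET = ID_OFFSET + Integer.BYTES;
    static final int SCORE_OFFSET = COUNT_OFFSET + Long.BYTES;
    static final int ROW_SIZE = SCORE_OFFSET + Double.BYTES;

    // Stand-in for a managed memory region; a real system would supply this.
    private final ByteBuffer buffer;
    private int rowCount;

    BinaryRowSketch(int capacityInRows) {
        this.buffer = ByteBuffer.allocate(capacityInRows * ROW_SIZE);
    }

    /** Writes one row as binary and returns its row index. */
    int append(int id, long count, double score) {
        int base = rowCount * ROW_SIZE;
        if (base + ROW_SIZE > buffer.capacity()) {
            throw new IllegalStateException("buffer full");
        }
        // Absolute puts: write each field at its offset from the row base.
        buffer.putInt(base + ID_OFFSET, id);
        buffer.putLong(base + COUNT_OFFSET, count);
        buffer.putDouble(base + SCORE_OFFSET, score);
        return rowCount++;
    }

    int getId(int row) {
        return buffer.getInt(base(row) + ID_OFFSET);
    }

    long getCount(int row) {
        return buffer.getLong(base(row) + COUNT_OFFSET);
    }

    double getScore(int row) {
        return buffer.getDouble(base(row) + SCORE_OFFSET);
    }

    int size() {
        return rowCount;
    }

    /** Byte position of the first field of the given row. */
    private int base(int row) {
        if (row < 0 || row >= rowCount) {
            throw new IndexOutOfBoundsException("row " + row);
        }
        return row * ROW_SIZE;
    }
}
```

Reading a single field, for example `rows.getCount(1)`, touches only the bytes of that field, and no per-row or per-field object is allocated. Variable-length fields such as strings would need an extra length or offset table, which this sketch leaves out.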


> Add topK operator for DataSet
> -----------------------------
>
>                 Key: FLINK-2549
>                 URL: https://issues.apache.org/jira/browse/FLINK-2549
>             Project: Flink
>          Issue Type: New Feature
>          Components: Core, Java API, Scala API
>            Reporter: Chengxiang Li
>            Assignee: Chengxiang Li
>            Priority: Minor
>
> topK is a common operation for users; it would be great to have it in Flink.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)