Posted to issues@flink.apache.org by "Fabian Hueske (JIRA)" <ji...@apache.org> on 2015/10/30 21:47:27 UTC

[jira] [Commented] (FLINK-2946) Add orderBy() to Table API

    [ https://issues.apache.org/jira/browse/FLINK-2946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14983283#comment-14983283 ] 

Fabian Hueske commented on FLINK-2946:
--------------------------------------

At the moment, {{sortPartition()}} with parallelism 1 is the only way to get a fully sorted result.
However, keep an eye on the effort to add support for range partitioning (FLINK-7) and the corresponding [pull request|https://github.com/apache/flink/pull/1255]. With range partitioning, the sort can run in parallel.

Range partitioning can be used if the result is stored as partitioned files. If the result is collected to or printed on the client, the sort still has to run with parallelism 1, because Flink doesn't fetch the partitions of the result in order. However, that should not be a real limitation, because the size of data sets that can be collected is restricted anyway.
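
To illustrate the point about partition order, here is a minimal plain-Python sketch. It is not Flink code and does not reflect how the range partitioner in the pull request is implemented. The function names, the sample records, and the hand-picked boundaries are all assumptions made for the example. It shows that range partitioning followed by an independent sort of each partition yields a total order only if the partitions are read back in range order.

```python
from bisect import bisect_right


def range_partition(records, boundaries):
    """Assign each record to the partition whose key range contains it.

    `boundaries` is an ascending list of split points (hypothetical,
    hand-picked here; a real system would have to determine them).
    """
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for record in records:
        partitions[bisect_right(boundaries, record)].append(record)
    return partitions


def sort_partitions(partitions):
    """Sort each partition on its own; on a cluster these sorts could run in parallel."""
    return [sorted(partition) for partition in partitions]


def read_partitions(partitions, fetch_order):
    """Concatenate partitions in the given order, mimicking how a reader fetches them."""
    return [record for index in fetch_order for record in partitions[index]]


records = [17, 3, 42, 8, 25, 31, 1, 12]
boundaries = [10, 30]  # three key ranges: <10, 10..29, >=30

sorted_parts = sort_partitions(range_partition(records, boundaries))

# Reading the partitions in range order (like part files 0, 1, 2) gives a total order.
in_order = read_partitions(sorted_parts, [0, 1, 2])

# Reading them in any other order does not, even though each partition is sorted.
out_of_order = read_partitions(sorted_parts, [2, 0, 1])

print(in_order)
print(out_of_order)
```

Partitioned files carry their order in the file names, so a reader can process them in sequence. A client that receives partitions in arbitrary order corresponds to the second case, which is why the sort with parallelism 1 is still needed there.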


> Add orderBy() to Table API
> --------------------------
>
>                 Key: FLINK-2946
>                 URL: https://issues.apache.org/jira/browse/FLINK-2946
>             Project: Flink
>          Issue Type: New Feature
>          Components: Table API
>            Reporter: Timo Walther
>            Assignee: Timo Walther
>
> In order to implement a FLINK-2099 prototype that uses the Table API's code generation facilities, the Table API needs a sorting feature.
> I would implement it in the next few days. Ideas on how to implement such a sorting feature are very welcome. Is there a more efficient way than {{.sortPartition(...).setParallelism(1)}}? Is it better to sort locally on the nodes first and then do a final sort on one node?


