Posted to jira@arrow.apache.org by "Vibhatha Lakmal Abeykoon (Jira)" <ji...@apache.org> on 2022/07/22 10:02:00 UTC
[jira] [Created] (ARROW-17183) [C++] Adding ExecNode with Sort and Fetch capability
Vibhatha Lakmal Abeykoon created ARROW-17183:
------------------------------------------------
Summary: [C++] Adding ExecNode with Sort and Fetch capability
Key: ARROW-17183
URL: https://issues.apache.org/jira/browse/ARROW-17183
Project: Apache Arrow
Issue Type: New Feature
Components: C++
Reporter: Vibhatha Lakmal Abeykoon
Assignee: Vibhatha Lakmal Abeykoon
In the Substrait integration with Acero, one required piece of functionality is the ability to fetch records from both sorted and unsorted data.
A Fetch operation selects `K` records starting at a given offset, for instance picking 10 records after skipping the first 5. This can be defined as a Slice operation, and the records can easily be extracted in a sink node.
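The fetch semantics described above can be sketched with a plain `std::vector` standing in for a stream of records. This is only an illustration: `Fetch` is a hypothetical helper and not an Acero or Substrait API, and clamping out-of-range offsets and counts is an assumption about the desired behaviour.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Acero API): return up to `count` records
// after skipping the first `offset` records.
template <typename T>
std::vector<T> Fetch(const std::vector<T>& records, std::size_t offset,
                     std::size_t count) {
  // Clamp the start so an offset past the end yields an empty result.
  const std::size_t begin = std::min(offset, records.size());
  // Clamp the end so a count larger than the remainder is truncated.
  const std::size_t end = begin + std::min(count, records.size() - begin);
  return std::vector<T>(
      records.begin() + static_cast<std::ptrdiff_t>(begin),
      records.begin() + static_cast<std::ptrdiff_t>(end));
}
```

With 20 records numbered 0 to 19, `Fetch(records, 5, 10)` would return records 5 to 14, matching the "pick 10, skip 5" example.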
A Sort and Fetch operation applies when we need to execute a Fetch operation on sorted data. The main issue is that we cannot have a sort node followed by a fetch node, because all existing node definitions that support sorting are sink nodes. Since no node can follow a sink node, this functionality has to take place in a single node.
This is not a perfect solution for sort and fetch, but one way to do it is to define a sink node in which the records are sorted and then a set of items is fetched.
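A minimal sketch of sorting and fetching in one step, again over a plain `std::vector<int>` instead of Arrow record batches. `SortThenFetch` is a hypothetical name, ascending order is assumed, and the use of `std::partial_sort` is a suggestion made here, not something the ticket prescribes: only the first `offset + count` positions have to be in sorted order, so a full sort is not strictly needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not an Acero API): sort ascending, then return up to
// `count` records after skipping the first `offset` sorted records.
std::vector<int> SortThenFetch(std::vector<int> records, std::size_t offset,
                               std::size_t count) {
  const std::size_t begin = std::min(offset, records.size());
  const std::size_t end = begin + std::min(count, records.size() - begin);
  const auto first = records.begin() + static_cast<std::ptrdiff_t>(begin);
  const auto last = records.begin() + static_cast<std::ptrdiff_t>(end);
  // Only the range [0, end) has to be in sorted order for the slice below.
  std::partial_sort(records.begin(), last, records.end());
  return std::vector<int>(first, last);
}
```

The input is taken by value so that the caller's data is left untouched; a real sink node would instead accumulate incoming batches before producing its output.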
Another dilemma is the order of the two operations, for example when the fetch has to be applied before the sort and not after it. In that case, there has to be a flag that selects the order of the operations.
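To illustrate why such a flag matters, here is a small sketch in which the same input, offset, and count give different results depending on the order. The enum `OpOrder` and the function `SortAndFetch` are hypothetical names chosen for this example and do not correspond to existing Acero options.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical flag (not an Acero option) selecting the order of operations.
enum class OpOrder { kSortThenFetch, kFetchThenSort };

// Hypothetical helper: apply an ascending sort and a fetch in the given order.
std::vector<int> SortAndFetch(std::vector<int> records, std::size_t offset,
                              std::size_t count, OpOrder order) {
  if (order == OpOrder::kSortThenFetch) {
    // Sort the full input first, so the slice is taken from sorted data.
    std::sort(records.begin(), records.end());
  }
  const std::size_t begin = std::min(offset, records.size());
  const std::size_t end = begin + std::min(count, records.size() - begin);
  std::vector<int> out(records.begin() + static_cast<std::ptrdiff_t>(begin),
                       records.begin() + static_cast<std::ptrdiff_t>(end));
  if (order == OpOrder::kFetchThenSort) {
    // Sort only the records that were fetched from the unsorted input.
    std::sort(out.begin(), out.end());
  }
  return out;
}
```

For the input `{9, 3, 7, 1, 8, 2}` with offset 1 and count 3, sorting first yields `{2, 3, 7}`, while fetching first yields `{1, 3, 7}`.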
The objective of this ticket is to discuss a viable, efficient solution and to add new nodes or a method to execute this logic.