Posted to issues@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2022/05/25 08:24:00 UTC

[jira] [Created] (ARROW-16649) [C++] Add support for sorting to the Substrait consumer

Nicola Crane created ARROW-16649:
------------------------------------

             Summary: [C++] Add support for sorting to the Substrait consumer
                 Key: ARROW-16649
                 URL: https://issues.apache.org/jira/browse/ARROW-16649
             Project: Apache Arrow
          Issue Type: Improvement
          Components: C++
            Reporter: Nicola Crane


The streaming execution engine supports sorting (as a sink node option, I believe), but the Substrait consumer does not currently handle sort relations, so any plan containing one fails with a NotImplemented error. Could we add support for this?

Here's the example code/plan I tested with:

 
{code:r}
library(dplyr)
library(substrait)

# create a basic table and order it
out <- tibble::tibble(a = 1, b = 2) %>%
  arrow_substrait_compiler() %>%
  arrange(a)

# take a look at the plan created
out$plan()
#> message of type 'substrait.Plan' with 2 fields set
#> extension_uris {
#>   extension_uri_anchor: 1
#> }
#> relations {
#>   root {
#>     input {
#>       sort {
#>         input {
#>           read {
#>             base_schema {
#>               names: "a"
#>               names: "b"
#>               struct_ {
#>                 types {
#>                   fp64 {
#>                   }
#>                 }
#>                 types {
#>                   fp64 {
#>                   }
#>                 }
#>               }
#>             }
#>             named_table {
#>               names: "named_table_1"
#>             }
#>           }
#>         }
#>         sorts {
#>           expr {
#>             selection {
#>               direct_reference {
#>                 struct_field {
#>                 }
#>               }
#>             }
#>           }
#>           direction: SORT_DIRECTION_ASC_NULLS_LAST
#>         }
#>       }
#>     }
#>     names: "a"
#>     names: "b"
#>   }
#> }

# try to run the plan
collect(out)
#> Error: NotImplemented: conversion to arrow::compute::Declaration from Substrait relation sort {
...
#> /home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:73  FromProto(plan_rel.rel(), ext_set)
{code}
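
For illustration only, here is a minimal sketch of the semantics a consumer would need to honour for the sort field in the plan above. It is plain Python, not Arrow or Substrait code. The enum member names mirror the Substrait sort direction values as I understand them, and decode_direction and sort_rows are hypothetical helper names. The sketch also assumes that the empty struct_field in the plan refers to field 0 (column "a"), since protobuf text output omits default values.

```python
from enum import Enum


class SortDirection(Enum):
    # Member names mirror Substrait's sort direction values (assumption:
    # numbering follows the Substrait proto definition).
    ASC_NULLS_FIRST = 1
    ASC_NULLS_LAST = 2
    DESC_NULLS_FIRST = 3
    DESC_NULLS_LAST = 4


def decode_direction(direction):
    """Split a direction into (descending, nulls_first) flags."""
    descending = direction in (
        SortDirection.DESC_NULLS_FIRST,
        SortDirection.DESC_NULLS_LAST,
    )
    nulls_first = direction in (
        SortDirection.ASC_NULLS_FIRST,
        SortDirection.DESC_NULLS_FIRST,
    )
    return descending, nulls_first


def sort_rows(rows, field_index, direction):
    """Sort tuples by one field, placing None values as the direction asks."""
    descending, nulls_first = decode_direction(direction)
    # Separate nulls so they never take part in value comparisons.
    nulls = [row for row in rows if row[field_index] is None]
    values = [row for row in rows if row[field_index] is not None]
    values.sort(key=lambda row: row[field_index], reverse=descending)
    return nulls + values if nulls_first else values + nulls


# Hypothetical sample rows for columns (a, b).
rows = [(2.0, "x"), (None, "y"), (1.0, "z")]
ascending = sort_rows(rows, 0, SortDirection.ASC_NULLS_LAST)
```

In the engine itself this would presumably map onto a sort key (field reference plus order) and a null placement setting rather than a hand-written sort.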


