Posted to issues@arrow.apache.org by "Nicola Crane (Jira)" <ji...@apache.org> on 2022/11/24 20:11:00 UTC

[jira] [Created] (ARROW-18403) [C++] Error consuming Substrait plan which uses count function: "only unary aggregate functions are currently supported"

Nicola Crane created ARROW-18403:
------------------------------------

             Summary: [C++] Error consuming Substrait plan which uses count function: "only unary aggregate functions are currently supported"
                 Key: ARROW-18403
                 URL: https://issues.apache.org/jira/browse/ARROW-18403
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++
            Reporter: Nicola Crane


ARROW-17523 added support for the Substrait extension function "count", but when I generate a Substrait plan that calls it and then try to run that plan in Acero, I get an error.

The plan:

{code:r}
message of type 'substrait.Plan' with 3 fields set
extension_uris {
  extension_uri_anchor: 1
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml"
}
extension_uris {
  extension_uri_anchor: 2
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml"
}
extension_uris {
  extension_uri_anchor: 3
  uri: "https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate_generic.yaml"
}
extensions {
  extension_function {
    extension_uri_reference: 3
    function_anchor: 2
    name: "count"
  }
}
relations {
  rel {
    aggregate {
      input {
        project {
          common {
            emit {
              output_mapping: 9
              output_mapping: 10
              output_mapping: 11
              output_mapping: 12
              output_mapping: 13
              output_mapping: 14
              output_mapping: 15
              output_mapping: 16
              output_mapping: 17
            }
          }
          input {
            read {
              base_schema {
                names: "int"
                names: "dbl"
                names: "dbl2"
                names: "lgl"
                names: "false"
                names: "chr"
                names: "verses"
                names: "padded_strings"
                names: "some_negative"
                struct_ {
                  types {
                    i32 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    bool_ {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    string {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                  types {
                    fp64 {
                      nullability: NULLABILITY_NULLABLE
                    }
                  }
                }
              }
              local_files {
                items {
                  uri_file: "file:///tmp/RtmpsBsoZJ/file1915f604cff4a"
                  parquet {
                  }
                }
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 1
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 2
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 3
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 4
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 5
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 6
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 7
                }
              }
              root_reference {
              }
            }
          }
          expressions {
            selection {
              direct_reference {
                struct_field {
                  field: 8
                }
              }
              root_reference {
              }
            }
          }
        }
      }
      groupings {
        grouping_expressions {
          selection {
            direct_reference {
              struct_field {
                field: 3
              }
            }
            root_reference {
            }
          }
        }
      }
      measures {
        measure {
          function_reference: 2
          phase: AGGREGATION_PHASE_INITIAL_TO_RESULT
          output_type {
            i64 {
              nullability: NULLABILITY_NULLABLE
            }
          }
          invocation: AGGREGATION_INVOCATION_ALL
        }
      }
    }
  }
}
{code}

The error:


{code}
Error: NotImplemented: Only unary aggregate functions are currently supported
/home/nic2/arrow/cpp/src/arrow/engine/substrait/relation_internal.cc:587  converter(aggregate_call)
/home/nic2/arrow/cpp/src/arrow/engine/substrait/serde.cc:153  FromProto(plan_rel.has_root() ? plan_rel.root().input() : plan_rel.rel(), ext_set, conversion_options)
{code}

I don't know what the "phase" and "invocation" fields above do. Earlier attempts to get Acero to consume this plan failed because I was using the default values rather than the ones shown above (e.g. "Not Implemented: Unsupported aggregation phase 'AGGREGATION_PHASE_UNSPECIFIED'"), so I changed them to see if it helped.
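
For anyone triaging, here is a rough sketch of how the two errors quoted in this report could relate to the measure in the plan. It is an illustrative model only, not Arrow's converter code: the `check_measure` helper, the dict layout, and the assumption that exactly one phase value is accepted are all hypothetical. The one thing taken directly from the plan is that the measure carries no arguments at all, which would fail a "unary only" check.

```python
# Illustrative model only: this is NOT Arrow's actual converter logic.
# Assumption: the consumer accepts a single phase value (the one used in the plan).
SUPPORTED_PHASES = {"AGGREGATION_PHASE_INITIAL_TO_RESULT"}


def check_measure(measure: dict) -> list[str]:
    """Return the problems a consumer with the assumed limits would report."""
    problems = []

    # An unset protobuf enum falls back to its zero value, here *_UNSPECIFIED.
    phase = measure.get("phase", "AGGREGATION_PHASE_UNSPECIFIED")
    if phase not in SUPPORTED_PHASES:
        problems.append(f"Unsupported aggregation phase '{phase}'")

    # "Unary" means exactly one argument; a missing list counts as zero.
    if len(measure.get("arguments", [])) != 1:
        problems.append("Only unary aggregate functions are currently supported")

    return problems


# The measure from the plan above, transcribed by hand: it has no "arguments".
measure_from_plan = {
    "function_reference": 2,
    "phase": "AGGREGATION_PHASE_INITIAL_TO_RESULT",
    "invocation": "AGGREGATION_INVOCATION_ALL",
}

print(check_measure(measure_from_plan))
```

Under these assumptions, the plan as posted passes the phase check and trips only the arity check, which matches the error shown. Leaving "phase" unset would trip the phase check as well.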



--
This message was sent by Atlassian Jira
(v8.20.10#820010)