Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/03 20:33:48 UTC

[GitHub] [arrow] rtpsw commented on pull request #12601: ARROW-15901: [C++] Support Substrait projection with custom output field names

rtpsw commented on PR #12601:
URL: https://github.com/apache/arrow/pull/12601#issuecomment-1116576332

   > Another approach could be to add a vector of names to `ConsumingSinkNodeOptions`. The sink node could then take the output schema from its input, swap out the names, and pass that on to the consumer. This should round-trip nicely.
   
   @westonpace, I looked into this and it runs into an issue: the schema is not available to `ConsumingSinkNode`. Instead, `ConsumingSinkNode` passes a (schema-less) `ExecBatch` to a `SinkNodeConsumer`, and it is the `SinkNodeConsumer` that is expected to have already been initialized with a schema (as in `TableSinkNodeConsumer`) and to use it to obtain a `RecordBatch`. Perhaps a reasonable solution would be to extend `ConsumerFactory` to accept a `names` parameter? Although this would be a backwards-incompatible change, the Arrow-Substrait code is presumably still in flux, so this should be OK.
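
A minimal sketch of the proposal, to make the shape of the change concrete. It deliberately uses small stand-in types rather than the real Arrow classes: `Field`, `Schema`, `Batch`, `Consumer`, `CollectingConsumer`, `RenameFields`, and `MakeRenamingFactory` are all hypothetical names, and the `ConsumerFactory` alias below only assumes what the comment suggests (a factory that receives the input schema plus a `names` vector). The real signatures in Arrow may differ.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT Arrow types.
struct Field {
  std::string name;
  std::string type;
};

struct Schema {
  std::vector<Field> fields;
};

// Schema-less batch, playing the role that ExecBatch plays in the comment.
struct Batch {
  std::vector<std::vector<int>> columns;
};

// Returns a copy of `schema` whose field names are replaced by `names`.
// Field types and field order are left unchanged.
Schema RenameFields(const Schema& schema, const std::vector<std::string>& names) {
  if (names.size() != schema.fields.size()) {
    throw std::invalid_argument("names must match the number of fields");
  }
  Schema out = schema;
  for (std::size_t i = 0; i < names.size(); ++i) {
    out.fields[i].name = names[i];
  }
  return out;
}

// The consumer owns the schema; the sink only hands it schema-less batches.
class Consumer {
 public:
  virtual ~Consumer() = default;
  virtual void Consume(const Batch& batch) = 0;
};

class CollectingConsumer : public Consumer {
 public:
  explicit CollectingConsumer(Schema schema) : schema_(std::move(schema)) {}

  void Consume(const Batch& batch) override {
    if (batch.columns.size() != schema_.fields.size()) {
      throw std::invalid_argument("batch does not match the consumer schema");
    }
    batches_.push_back(batch);
  }

  const Schema& schema() const { return schema_; }
  std::size_t num_batches() const { return batches_.size(); }

 private:
  Schema schema_;
  std::vector<Batch> batches_;
};

// Assumed shape of the extended factory: it now also receives `names`.
using ConsumerFactory = std::function<std::shared_ptr<CollectingConsumer>(
    const Schema&, const std::vector<std::string>&)>;

// Builds a factory that applies the custom names before creating the consumer.
ConsumerFactory MakeRenamingFactory() {
  return [](const Schema& input_schema, const std::vector<std::string>& names) {
    return std::make_shared<CollectingConsumer>(RenameFields(input_schema, names));
  };
}
```

With this shape, the sink node itself never needs the schema: whoever constructs the consumer passes the input schema and the requested output names to the factory, and the consumer pairs each schema-less batch with the renamed schema.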

