Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/03 14:01:00 UTC

[GitHub] [arrow-datafusion] tustvold opened a new issue, #2426: ShuffleWriterExec::schema mismatch

tustvold opened a new issue, #2426:
URL: https://github.com/apache/arrow-datafusion/issues/2426

   **Describe the bug**
   
   `ShuffleWriterExec::schema()` returns the schema of the underlying plan. However, `ShuffleWriterExec::execute` returns a stream of `RecordBatch` containing metadata, which consequently has a completely different schema.
   
   **To Reproduce**
   
   Use `ShuffleWriterExec`
   
   **Expected behavior**
   
   `ExecutionPlan::schema` should return the same schema as the `SendableRecordBatchStream` yielded by `ExecutionPlan::execute`.
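   
   As a rough illustration of that contract, here is a toy model in plain Rust. It is only a sketch: `Schema`, `BatchStream`, `Plan`, `PassThrough`, `MetadataWriter` and `check_schema_contract` are hypothetical stand-ins invented for this example, not DataFusion or Ballista types, and the metadata column names are made up.
   
   ```rust
   // Toy model only: every type here is a hypothetical stand-in, not DataFusion's API.

   /// A schema reduced to an ordered list of column names.
   #[derive(Debug, Clone, PartialEq, Eq)]
   pub struct Schema {
       pub columns: Vec<String>,
   }

   impl Schema {
       pub fn new(columns: &[&str]) -> Self {
           Schema {
               columns: columns.iter().map(|c| c.to_string()).collect(),
           }
       }
   }

   /// A stream reduced to the schema it reports for its batches.
   pub struct BatchStream {
       pub schema: Schema,
   }

   /// Minimal stand-in for an execution plan.
   pub trait Plan {
       /// Schema the plan advertises to the rest of the planner.
       fn schema(&self) -> Schema;
       /// Stream the plan actually produces when executed.
       fn execute(&self) -> BatchStream;
   }

   /// Operator that passes its input through unchanged, so the schemas agree.
   pub struct PassThrough {
       pub input: Schema,
   }

   impl Plan for PassThrough {
       fn schema(&self) -> Schema {
           self.input.clone()
       }
       fn execute(&self) -> BatchStream {
           BatchStream { schema: self.input.clone() }
       }
   }

   /// Operator that advertises its input schema but emits metadata rows,
   /// mirroring the kind of mismatch described in this issue.
   pub struct MetadataWriter {
       pub input: Schema,
   }

   impl Plan for MetadataWriter {
       fn schema(&self) -> Schema {
           self.input.clone()
       }
       fn execute(&self) -> BatchStream {
           // Hypothetical metadata columns, chosen only for illustration.
           BatchStream { schema: Schema::new(&["partition", "path"]) }
       }
   }

   /// Returns Ok(()) when the advertised and produced schemas match,
   /// otherwise Err((advertised, produced)).
   pub fn check_schema_contract(plan: &dyn Plan) -> Result<(), (Schema, Schema)> {
       let advertised = plan.schema();
       let produced = plan.execute().schema;
       if advertised == produced {
           Ok(())
       } else {
           Err((advertised, produced))
       }
   }
   ```
   
   In this model, `check_schema_contract` returns `Ok(())` for `PassThrough` and an `Err` carrying both schemas for `MetadataWriter`. A similar assertion in a test could catch this class of mismatch for real operators.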
   
   **Additional context**
   
   There is a potentially valid question as to why we have the schema stored in so many places...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow-datafusion] tustvold commented on issue #2426: ShuffleWriterExec::schema mismatch

Posted by GitBox <gi...@apache.org>.
tustvold commented on issue #2426:
URL: https://github.com/apache/arrow-datafusion/issues/2426#issuecomment-1116211592

   I tried changing this in #2428, but it causes `distributed_join_plan` to fail with:
   
   ```
   Error: DataFusionError(Plan("The left or right side of the join does not have all columns on \"on\": \nMissing on the left: {Column { name: \"l_orderkey\", index: 0 }}\nMissing on the right: {Column { name: \"o_orderkey\", index: 0 }}"))
   ```
   
   I'm not familiar enough with this code to know what is going on here, but something doesn't feel right.

