Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/05/13 17:58:56 UTC
[GitHub] [arrow-datafusion] alamb opened a new pull request, #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
alamb opened a new pull request, #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530
This PR demonstrates a test that fails with the code from https://github.com/apache/arrow-rs/pull/1682 in arrow (not included in arrow 14.0.0).
This PR pins datafusion to arrow right after https://github.com/apache/arrow-rs/pull/1682 was merged at commit https://github.com/apache/arrow-rs/commit/5b154ea40314dc2f09babbb363bf7f1fe439d4eb
To reproduce:
```shell
cargo test -p datafusion --lib
```
Results in:
```shell
failures:
---- physical_plan::file_format::parquet::tests::evolved_schema_filter stdout ----
thread 'physical_plan::file_format::parquet::tests::evolved_schema_filter' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(ExternalError(ParquetError(General("out of order projection is not supported"))))', datafusion/core/src/physical_plan/file_format/parquet.rs:968:14
---- physical_plan::file_format::parquet::tests::evolved_schema_inconsistent_order stdout ----
thread 'physical_plan::file_format::parquet::tests::evolved_schema_inconsistent_order' panicked at 'called `Result::unwrap()` on an `Err` value: ArrowError(ExternalError(ParquetError(General("out of order projection is not supported"))))', datafusion/core/src/physical_plan/file_format/parquet.rs:819:14
failures:
physical_plan::file_format::parquet::tests::evolved_schema_filter
physical_plan::file_format::parquet::tests::evolved_schema_inconsistent_order
test result: FAILED. 656 passed; 2 failed; 1 ignored; 0 measured; 0 filtered out; finished in 2.00s
error: test failed, to rerun pass '-p datafusion --lib'
Error: Process completed with exit code 101.
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-datafusion] tustvold commented on pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
tustvold commented on PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#issuecomment-1126346437
I've not had time to look properly yet, but my suspicion is that the schema adapter logic knows what the expected output schema is and rearranges the columns, masking the fact that what was returned by the parquet reader did not respect the projection order.
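To make the suspected masking effect concrete, here is a minimal sketch in plain Rust. Everything in it is hypothetical: `Column`, `read_in_file_order`, and `adapt_to_schema` are illustrative stand-ins, not DataFusion or arrow-rs APIs, and the real code paths may differ. It assumes a reader that returns the selected columns in file order whatever order the projection asks for, followed by an adapter that looks columns up by name.

```rust
// Hypothetical stand-ins for illustration only; these are not
// DataFusion or arrow-rs types.

/// A column is just a name plus its values in this sketch.
type Column = (String, Vec<i64>);

/// Simulates a reader that ignores the order of `projection` and always
/// returns the selected columns in file order.
fn read_in_file_order(file: &[Column], projection: &[usize]) -> Vec<Column> {
    file.iter()
        .enumerate()
        .filter(|(i, _)| projection.contains(i))
        .map(|(_, c)| c.clone())
        .collect()
}

/// Simulates an adapter that knows the expected output column names and
/// looks each one up by name, so the reader's ordering never surfaces.
fn adapt_to_schema(batch: &[Column], expected: &[&str]) -> Vec<Column> {
    expected
        .iter()
        .filter_map(|name| batch.iter().find(|col| col.0 == *name).cloned())
        .collect()
}

fn main() {
    let file: Vec<Column> = vec![
        ("a".to_string(), vec![1, 2]),
        ("b".to_string(), vec![3, 4]),
        ("c".to_string(), vec![5, 6]),
    ];

    // Ask for "c" then "a" (file indices 2, 0).
    let raw = read_in_file_order(&file, &[2, 0]);
    let raw_names: Vec<&str> = raw.iter().map(|c| c.0.as_str()).collect();
    println!("reader returned: {:?}", raw_names);

    // The adapter reorders by name, hiding the reader's ordering.
    let adapted = adapt_to_schema(&raw, &["c", "a"]);
    let adapted_names: Vec<&str> = adapted.iter().map(|c| c.0.as_str()).collect();
    println!("after adapter:   {:?}", adapted_names);
}
```

In this sketch the downstream result looks correct even though the reader did not honour the requested order, which is how a reader that starts rejecting out-of-order projections could expose a problem that was previously hidden.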
[GitHub] [arrow-datafusion] tustvold commented on pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
tustvold commented on PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#issuecomment-1126308789
Likely related to https://github.com/apache/arrow-datafusion/issues/2453 - the DataFusion logic for handling column projection to parquet is currently silently broken and likely only working because of the schema adapter logic.
[GitHub] [arrow-datafusion] tustvold closed pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
tustvold closed pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
URL: https://github.com/apache/arrow-datafusion/pull/2530
[GitHub] [arrow-datafusion] alamb commented on a diff in pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#discussion_r872646998
##########
Cargo.toml:
##########
@@ -38,3 +38,8 @@ exclude = ["ballista-cli", "datafusion-cli"]
[profile.release]
codegen-units = 1
lto = true
+
+[patch.crates-io]
+arrow = { git = "https://github.com/apache/arrow-rs.git", rev="5b154ea40314dc2f09babbb363bf7f1fe439d4eb" }
Review Comment:
Right after https://github.com/apache/arrow-rs/pull/1682
[GitHub] [arrow-datafusion] alamb commented on pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#issuecomment-1126338572
🤔 I suppose we'll have to fix datafusion then...
[GitHub] [arrow-datafusion] alamb commented on pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#issuecomment-1126338989
> the DataFusion logic for handling column projection to parquet is currently silently broken and likely only working because of the schema adapter logic
I don't understand how things can be broken but also be working...
[GitHub] [arrow-datafusion] alamb commented on pull request #2530: Error after pre-release arrow upgrade: "out of order projection is not supported" (NOT FOR MERGING)
Posted by GitBox <gi...@apache.org>.
alamb commented on PR #2530:
URL: https://github.com/apache/arrow-datafusion/pull/2530#issuecomment-1127553540
Filed https://github.com/apache/arrow-datafusion/issues/2543 to track