You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/17 04:36:52 UTC

[GitHub] [arrow] jorgecarleitao opened a new pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

jorgecarleitao opened a new pull request #8481:
URL: https://github.com/apache/arrow/pull/8481


   This PR:
   
   * Renamed `ExecutionError` to `DataFusionError`
   * Renamed `DataFusionError::ParserError` to `DataFusionError::SQL`
   * Renamed `DataFusionError::InternalError` to `DataFusionError::Internal`
   * Renamed `DataFusionError::ExecutionError` to `DataFusionError::Execution`
   * Adds a new error variant `DataFusionError::Plan` that is used during planning
   * Removes `DataFusionError::InvalidColumn` that was not being used.
   * Removes `DataFusionError::General`, replacing them by the appropriate errors
   * Improves the message of `DataFusionError::Internal` to incentivize users to file a bug report when it happens
   * Extended the documentation of every variant
   
   The design behind this PR is that the error variants should correspond to what happened:
   
   * `Internal`: a Datafusion's internal invariant was violated (e.g. downcast failed) => file a bug report
   * `Plan`: planning was incorrect
   * `NotImplemented`: something is not implemented and we know about it. Ideally, we should have an associated JIRA issue
   * `Execution`: an error during execution. We should avoid raising these, but sometimes it is impossible.
   * `IoError`: stuff related with reading and writing
   
   I went through every error that we return in `DataFusion` and verified that it is assigned correctly to one of these variants.
   
   I am a bit uncertain about the `ParquetError` and `ArrowError`. IMO `ArrowError` should be mapped to `DataFusionError::Execution`, as it only happens during execution, and `ParquetError` should be mapped to an `IoError`.
   
   I also think that we should split `NotImplemented` in two: `NotImplemented` and `NotSupported` as e.g. `float16` is something that we will likely never support, while "modulus" is just not implemented yet.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm edited a comment on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
wesm edited a comment on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712567457


   If a PR has a base commit that is one of the master commits that is part of the rebased master (i.e. some commit between the date the RC was cut and the date of the release), then the PR is completely borked after the master-rebase. We've seen this happen many times in the past. We could perhaps come up with some more intelligent script that only rebases PRs that are based on a commit hash that no longer exists in master


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712895885


   I agree with raising the matter on the mailing list -- this is a project governance issue and so needs to be discussed there. We did not arise at the current practices idly and there are pros/cons to whatever approach is taken. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] wesm commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
wesm commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712567457


   If a PR has a base commit that is one of the master commits that is part of the rebased master (i.e. some commit between the date the RC was cut and the date of the release), then the PR is completely borked after the rebase. We've seen this happen many times in the past. We could perhaps come up with some more intelligent script that only rebases PRs that are based on a commit hash that no longer exists in master


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao closed pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
jorgecarleitao closed pull request #8481:
URL: https://github.com/apache/arrow/pull/8481


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712566577


   @kszucs, would it be possible to avoid force-pushing to PRs after a release (even if master is force-pushed)? IMO it is very confusing to the contributor (committer or not).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] github-actions[bot] commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-710747726


   https://issues.apache.org/jira/browse/ARROW-10331


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] jorgecarleitao commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
jorgecarleitao commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712584694


   We are already force-pushing to master on every release, which goes against best practices of an open-source project.
   
   AFAIK, in open source, there is a strong expectation that PRs are managed by individual contributors, and committers of the project only request contributors to make changes to them, or kindly ask before pushing (not force-pushing) directly to the PR. We are inverting all expectations and _force-pushing_ to PRs(!!). Furthermore, those force-pushes actually break the PRs (as you can see above).  IMO this drives any reasonable contributor to be pissed off at the team for what they just did:
   
   * force-pushing to master
   * force-pushing to their PRs
   * breaking their PRs's CI
   * no prior notice or request of any of the above
   
   I suggest that we:
   
   1. stop force-pushing to PRs
   2. stop pushing to PRs without the contributor's explicit consent
   3. document in [the contributing guidelines](https://github.com/apache/arrow/blob/master/.github/CONTRIBUTING.md) that master is force-pushed on every release, and the steps that folks need to take to bring their PRs to the latest master.
   
   In general, it is the contributor's responsibility to keep the PRs in a "ready to merge" state, rebasing them to master as master changes. IMO a force-push to master corresponds to a change in master, and thus it is the contributor's responsibility to rebase their PRs against the new master.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] kszucs commented on pull request #8481: ARROW-10331: [Rust] [DataFusion] Re-organize DataFusion errors

Posted by GitBox <gi...@apache.org>.
kszucs commented on pull request #8481:
URL: https://github.com/apache/arrow/pull/8481#issuecomment-712797185


   I see your points. 
   
   Keeping the main branch flat is important and the commits after the release tag should have the right version numbers since git is not available in all scenarios, hence the rebase after a release. 
   
   We could certainly let the contributors to rebase their own branches, but in certain cases the contributors may be confused what to do after seeing many unrelated commits in their pull requests so additional maintainer roundtrips might be required to ask/advise for pull request rebases. From this perspective I rather find it useful although I'm also against force pushes in general.
   
   I suggest you to bring it up on the mailing list so the topic can reach a broader set of developers and maintainers. I'm sure that we can come up with a better solution after there is a consensus. 
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org