You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/06/14 08:04:46 UTC

[GitHub] [arrow-datafusion] wqc200 opened a new pull request #552: mv register_schema() to impl

wqc200 opened a new pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552


   # Which issue does this PR close?
   
   <!--
   We generally require a GitHub issue to be filed for all bug fixes and enhancements and this helps us generate change logs for our releases. You can link an issue to this PR using the GitHub syntax. For example `Closes #123` indicates that this PR will close issue #123.
   -->
   
   Closes #.
   
    # Rationale for this change
   <!--
    Why are you proposing this change? If this is already explained clearly in the issue then this section is not needed.
    Explaining clearly why changes are proposed helps reviewers understand your changes and offer better suggestions for fixes.  
   -->
   
   # What changes are included in this PR?
   <!--
   There is no need to duplicate the description in the issue here but it is sometimes worth providing a summary of the individual changes in this PR.
   -->
   
   # Are there any user-facing changes?
   <!--
   If there are user-facing changes then we may require documentation to be updated before approving the PR.
   -->
   
   <!--
   If there are any breaking changes to public APIs, please add the `api change` label.
   -->
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter edited a comment on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
codecov-commenter edited a comment on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-860214243






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] returnString edited a comment on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
returnString edited a comment on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-866759239


   Had to remind myself how all the catalog code works, but we _do_ support registering tables into a schema via the trait. However, the trait method has a default impl which fails:
   
   
   ```rust
   fn register_table(
   	&self,
   	name: String,
   	table: Arc<dyn TableProvider>,
   ) -> Result<Option<Arc<dyn TableProvider>>> {
   	Err(DataFusionError::Execution(
   		"schema provider does not support registering tables".to_owned(),
   	))
   }
   ```
   
   So if we wanted to do this for registering schemas into catalogs, we should probably use the same style of fallible operation.
   
   However, I only added this so as to continue supporting ExecutionContext::register_table et al when doing the initial catalog implementation, and it _does_ make the API more awkward for the infallible case of working with memory schemas.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] codecov-commenter commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-860214243


   # [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#552](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (681f2aa) into [master](https://codecov.io/gh/apache/arrow-datafusion/commit/174226c086a4838eab2a238853b4871c295c0189?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (174226c) will **increase** coverage by `1.12%`.
   > The diff coverage is `75.16%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-datafusion/pull/552/graphs/tree.svg?width=650&height=150&src=pr&token=JXwWBKD3D9&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #552      +/-   ##
   ==========================================
   + Coverage   74.94%   76.07%   +1.12%     
   ==========================================
     Files         146      156      +10     
     Lines       24314    27055    +2741     
   ==========================================
   + Hits        18223    20581    +2358     
   - Misses       6091     6474     +383     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [ballista/rust/client/src/columnar\_batch.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jbGllbnQvc3JjL2NvbHVtbmFyX2JhdGNoLnJz) | `0.00% <ø> (ø)` | |
   | [ballista/rust/client/src/context.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jbGllbnQvc3JjL2NvbnRleHQucnM=) | `0.00% <0.00%> (ø)` | |
   | [ballista/rust/core/src/client.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9jbGllbnQucnM=) | `0.00% <ø> (ø)` | |
   | [ballista/rust/core/src/datasource.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9kYXRhc291cmNlLnJz) | `0.00% <ø> (ø)` | |
   | [ballista/rust/core/src/error.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9lcnJvci5ycw==) | `0.00% <ø> (ø)` | |
   | [...ta/rust/core/src/execution\_plans/shuffle\_reader.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9leGVjdXRpb25fcGxhbnMvc2h1ZmZsZV9yZWFkZXIucnM=) | `0.00% <0.00%> (ø)` | |
   | [...ust/core/src/execution\_plans/unresolved\_shuffle.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9leGVjdXRpb25fcGxhbnMvdW5yZXNvbHZlZF9zaHVmZmxlLnJz) | `50.00% <ø> (ø)` | |
   | [ballista/rust/core/src/memory\_stream.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9tZW1vcnlfc3RyZWFtLnJz) | `60.00% <ø> (+60.00%)` | :arrow_up: |
   | [ballista/rust/core/src/serde/physical\_plan/mod.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9zZXJkZS9waHlzaWNhbF9wbGFuL21vZC5ycw==) | `100.00% <ø> (ø)` | |
   | [...ista/rust/core/src/serde/physical\_plan/to\_proto.rs](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YmFsbGlzdGEvcnVzdC9jb3JlL3NyYy9zZXJkZS9waHlzaWNhbF9wbGFuL3RvX3Byb3RvLnJz) | `49.38% <0.00%> (-1.24%)` | :arrow_down: |
   | ... and [107 more](https://codecov.io/gh/apache/arrow-datafusion/pull/552/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [174226c...681f2aa](https://codecov.io/gh/apache/arrow-datafusion/pull/552?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-867825742


   > For example, if I were building a traditional database, here's how I'd execute queries:
   
   In IOx we took a very similar approach to what you describe: 🤣  https://github.com/influxdata/influxdb_iox/blob/main/server/src/db/access.rs#L201


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] wqc200 commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
wqc200 commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-861939836


   > 
   > 
   > Thanks @wqc200 ! Can you please explain the reason for this proposed API change ?
   > 
   > It seems strange to me, for example, for wanting to extend the built in InformationSchema provider -- though I suspect you have a reason for wanting to do so. It would help to know what the reason was
   
   @alamb thanks
   
   I'm going to call the DataFusion Context from an external program, and I'm going to dynamically add a schema to the catalog in the DataFusion Context, so I need to add an API to the DataFusion Context.
   To sum up, I needed the ability to call DataFusion Context from the outside.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] wqc200 closed pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
wqc200 closed pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] returnString commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
returnString commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-866759239


   Had to remind myself how all the catalog code works, but we _do_ support registering tables into a schema via the trait. However, the trait method has a default impl which fails:
   
   
   ```rust
   fn register_table(
   	&self,
   	name: String,
   	table: Arc<dyn TableProvider>,
   ) -> Result<Option<Arc<dyn TableProvider>>> {
   	Err(DataFusionError::Execution(
   		"schema provider does not support registering tables".to_owned(),
   	))
   }
   ```
   
   So if we wanted to do this for registering schemas into catalogs, we should probably use the same style of fallible operation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] returnString commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
returnString commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-866756001


   Hrm, I think if we're moving this to the interface, we need to codify some notion of "unsupported operation". That's actually why I left it out initially - registering a schema inside e.g. the information_schema catalog doesn't really make sense, because it's a read-only projection of DB internals, and I didn't want to commit to more public API than was necessary.
   
   In my DataFusion-powered projects, I typically treat ExecutionContext instances as immutable which simplifies a lot of the setup. Essentially this entails creating catalogues using concrete types like `MemoryCatalogProvider` and then just attaching those to a new context, so I can work with the type-specific impls, rather than just trait methods. I'm not sure how widely adopted this is as a methodology, but I've found it works well!
   
   For example, if I were building a traditional database, here's how I'd execute queries:
   - build the list of catalogs (and internally, schemas/tables) the user has permissions to access (this relies on out-of-band data)
   - create an execution context populated with said catalog list
   - run the query using the context
   - discard the context
   
   Obviously this relies on the context setup being quite cheap, but I don't see any moves toward making that a particularly intensive process 😄 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-datafusion] alamb commented on pull request #552: mv register_schema() to impl

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #552:
URL: https://github.com/apache/arrow-datafusion/pull/552#issuecomment-866741671


   @Dandandan  or @returnString  do you have any thoughts on this API change -- I think getting some other perspectives might help


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org