Posted to github@arrow.apache.org by "yash-gupta167 (via GitHub)" <gi...@apache.org> on 2023/03/26 13:38:12 UTC

[GitHub] [arrow-datafusion] yash-gupta167 commented on issue #5725: Depend on Arrow Subcrates

yash-gupta167 commented on issue #5725:
URL: https://github.com/apache/arrow-datafusion/issues/5725#issuecomment-1484099057

   Refactor the DataFusion ```Cargo.toml``` files: for each DataFusion component, update its ```Cargo.toml``` file to replace the top-level Arrow dependency with the specific subcrates that component needs.
   For example, if a DataFusion component currently has the following dependency in its Cargo.toml:
   ```
   [dependencies]
   arrow = "x.y.z"
   ```
   And it only needs the ```arrow-array```, ```arrow-ipc```, and ```arrow-compute``` subcrates, you can update the ```Cargo.toml``` file as follows:
   ```
   [dependencies]
   arrow-array = "x.y.z"
   arrow-ipc = "x.y.z"
   arrow-compute = "x.y.z"
   ```
   - Update the DataFusion source code: change the import statements in the DataFusion source code to use the specific subcrates instead of the top-level Arrow crate. This may require updating module paths and making some minor code changes to accommodate the new dependencies.
   
   - Test the refactored DataFusion components: after refactoring the dependencies, run the test suite to confirm that all DataFusion components still function correctly, and address any issues that arise during testing.
   
   - Benchmark and document the improvements: measure the change in compilation time and binary size that results from this refactoring, and document it as a reference for future optimizations.
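   
   As a rough sketch of how the subcrates for one component might be identified before editing its `Cargo.toml`, the std-only Rust program below scans source text for `use arrow::<module>` lines and collects the subcrates they imply. The function names `subcrate_for` and `required_subcrates` and the module-to-subcrate table are assumptions made for illustration, not part of DataFusion or Arrow. The table is deliberately partial and should be checked against the Arrow release in use; any module it cannot map is flagged for manual review.
   
   ```rust
   use std::collections::BTreeSet;

   /// Hypothetical, partial mapping from a top-level `arrow::` module to the
   /// subcrate assumed to provide it. Verify every entry before relying on it.
   fn subcrate_for(module: &str) -> Option<&'static str> {
       match module {
           "array" | "record_batch" => Some("arrow-array"),
           "buffer" => Some("arrow-buffer"),
           // Simplification: some items re-exported from `arrow::datatypes`
           // live in other subcrates.
           "datatypes" | "error" => Some("arrow-schema"),
           "ipc" => Some("arrow-ipc"),
           "csv" => Some("arrow-csv"),
           "json" => Some("arrow-json"),
           _ => None,
       }
   }

   /// Scans `source` for simple `use arrow::<module>` lines and returns
   /// (subcrates needed, modules that need manual review).
   fn required_subcrates(source: &str) -> (BTreeSet<&'static str>, BTreeSet<String>) {
       let mut needed = BTreeSet::new();
       let mut unresolved = BTreeSet::new();
       for line in source.lines() {
           let Some(rest) = line.trim().strip_prefix("use arrow::") else {
               continue;
           };
           // The module name is the first path segment after `arrow::`.
           let module: String = rest
               .chars()
               .take_while(|c| c.is_ascii_alphanumeric() || *c == '_')
               .collect();
           match subcrate_for(&module) {
               Some(krate) => {
                   needed.insert(krate);
               }
               None => {
                   unresolved.insert(module);
               }
           }
       }
       (needed, unresolved)
   }

   fn main() {
       let source = "\
   use arrow::array::Int32Array;
   use arrow::datatypes::DataType;
   use arrow::compute::concat;
   ";
       let (needed, unresolved) = required_subcrates(source);
       // Emit candidate `[dependencies]` lines with a placeholder version.
       for krate in &needed {
           println!("{krate} = \"x.y.z\"");
       }
       for module in &unresolved {
           println!("# review manually: arrow::{module}");
       }
   }
   ```
   
   Run against the three sample imports in `main`, this prints dependency lines for `arrow-array` and `arrow-schema` and a review note for `arrow::compute`. Grouped imports such as `use arrow::{...}` are not parsed and land in the review list with an empty module name, so they also need manual handling.
   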
   
   Please let me know whether this approach is relevant, or whether I am going in the wrong direction.

