Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/24 11:19:58 UTC

[GitHub] [arrow-rs] alamb opened a new issue, #1613: Improve arrow-rs examples

alamb opened a new issue, #1613:
URL: https://github.com/apache/arrow-rs/issues/1613

   **Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
   To help people get up to speed with a new library quickly, it is helpful to start with examples that compile. The arrow-rs [docs](https://docs.rs/arrow/latest/arrow/) have many small working examples, but it would be nice to also include some full-featured examples.
   
   clap-rs has a nice example of examples: https://github.com/clap-rs/clap/blob/v3.1.12/examples/README.md
   
   This is especially useful as we start making them easier to find, such as in https://github.com/apache/arrow-cookbook/pull/185
   
   **Describe the solution you'd like**
   Here are some possible improvements:
   - [ ] Add a readme to the examples directory with descriptions of each
   - [ ] Add a link to the examples from the main readme page
   - [ ] Add examples of reading from a CSV file and performing a simple calculation (max of each column?)
   - [ ] Add examples of calling computation kernels (showing how to downcast)
   - [ ] Add automated checking (the way that clap does https://github.com/clap-rs/clap/blob/v3.1.12/examples/README.md#contributing)
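
For the kernel item above, the downcast step is usually the part that trips up newcomers. Below is a minimal sketch of the pattern using only the standard library. `Column`, `IntColumn`, `TextColumn`, and `max_of_int_column` are hypothetical stand-ins made up for this illustration, not arrow-rs types. arrow-rs arrays expose a similar `as_any()` hook, so a real example would presumably downcast to a concrete array type in the same way before calling a kernel.

```rust
use std::any::Any;

/// Hypothetical stand-in for a type-erased column (NOT the arrow-rs `Array` trait).
trait Column {
    /// Exposes the concrete type so callers can downcast.
    fn as_any(&self) -> &dyn Any;
    /// Number of slots in the column.
    fn len(&self) -> usize;
}

/// Hypothetical concrete column of integers.
struct IntColumn {
    values: Vec<i64>,
}

/// Hypothetical concrete column of strings.
struct TextColumn {
    values: Vec<String>,
}

impl Column for IntColumn {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.values.len()
    }
}

impl Column for TextColumn {
    fn as_any(&self) -> &dyn Any {
        self
    }
    fn len(&self) -> usize {
        self.values.len()
    }
}

/// Returns the max of an integer column.
/// Yields `None` if the column is not an `IntColumn` or is empty.
fn max_of_int_column(column: &dyn Column) -> Option<i64> {
    // `downcast_ref` returns `None` when the concrete type does not match.
    let ints = column.as_any().downcast_ref::<IntColumn>()?;
    ints.values.iter().copied().max()
}
```

Returning `Option` keeps the type mismatch visible to the caller instead of panicking, which seems like the friendlier choice for an introductory example.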
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Improve arrow-rs examples [arrow-rs]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #1613:
URL: https://github.com/apache/arrow-rs/issues/1613#issuecomment-1847654736

   I don't think this is tracking anything actionable now, so closing this ticket



Re: [I] Improve arrow-rs examples [arrow-rs]

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb closed issue #1613: Improve arrow-rs examples
URL: https://github.com/apache/arrow-rs/issues/1613



[GitHub] [arrow-rs] datapythonista commented on issue #1613: Improve arrow-rs examples

Posted by GitBox <gi...@apache.org>.
datapythonista commented on issue #1613:
URL: https://github.com/apache/arrow-rs/issues/1613#issuecomment-1146809855

   I'm planning to work on this. What I'd personally do is have many small examples of increasing complexity. That way, besides serving as examples and recipes, they can be used as a tutorial for learning Arrow topics step by step.
   
   If people are happy with this, I'll start working on PRs for the following:
   
   - Creating arrays for primitive types
     - With the array constructor (e.g. `Int32Array::from(vec![...])`)
     - With a builder (using `append_value` and `append_null`)
     - With `collect()`
   - Creating arrays with null values. I'm unsure about this one: if the examples above are simple enough, we can probably cover nulls there. Still, it is worth listing here for consideration for now
   - Creating arrays of more complex types (e.g. `Dictionary`, `Struct`...)
   - Creating `Schema`
   - Creating `RecordBatch`
   - Reading from different formats
     - Parquet
     - CSV
     - JSON
   - Writing to different formats (same)
   - Data manipulation and kernels. I will expand on this when the rest are done; for now, just a couple of simple examples to have something.
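
As background for the builder and null-value items in the list above, here is a toy sketch of how a nullable column can be laid out: a values buffer plus a validity bitmap in which a set bit marks a non-null slot (least-significant bit first, as the Arrow columnar format describes). `ToyInt32Builder` and all of its methods are hypothetical and written only for this illustration; they borrow the `append_value` / `append_null` names from the list but are not the arrow-rs implementation.

```rust
/// Toy nullable Int32 column: values buffer + validity bitmap.
/// Hypothetical stand-in for illustration, NOT an arrow-rs type.
#[derive(Debug, Default)]
struct ToyInt32Builder {
    values: Vec<i32>,
    /// Bit `i` set means slot `i` holds a value; cleared means null.
    validity: Vec<u8>,
    len: usize,
}

impl ToyInt32Builder {
    /// Appends a non-null value.
    fn append_value(&mut self, value: i32) {
        self.push(value, true);
    }

    /// Appends a null; the values buffer still gets a placeholder slot.
    fn append_null(&mut self) {
        self.push(0, false);
    }

    fn push(&mut self, value: i32, valid: bool) {
        // Start a new bitmap byte every 8 slots.
        if self.len % 8 == 0 {
            self.validity.push(0);
        }
        if valid {
            self.validity[self.len / 8] |= 1u8 << (self.len % 8);
        }
        self.values.push(value);
        self.len += 1;
    }

    /// Returns `Some(value)` for a valid slot, `None` for a null or
    /// out-of-range slot.
    fn get(&self, i: usize) -> Option<i32> {
        if i >= self.len {
            return None;
        }
        let valid = (self.validity[i / 8] & (1u8 << (i % 8))) != 0;
        if valid {
            Some(self.values[i])
        } else {
            None
        }
    }

    /// Counts the null slots by checking each validity bit.
    fn null_count(&self) -> usize {
        (0..self.len).filter(|&i| self.get(i).is_none()).count()
    }
}

/// Lets the toy column be built with `collect()` from `Option<i32>` items.
impl FromIterator<Option<i32>> for ToyInt32Builder {
    fn from_iter<I: IntoIterator<Item = Option<i32>>>(iter: I) -> Self {
        let mut builder = Self::default();
        for item in iter {
            match item {
                Some(value) => builder.append_value(value),
                None => builder.append_null(),
            }
        }
        builder
    }
}
```

Because nulls fall out of the same two code paths as regular values, this suggests they could be folded into the builder and `collect()` examples rather than needing a separate entry.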
   
   Not sure how feasible it is, but it would be amazing if we could render those examples (which will have documentation explaining what's going on) directly into the Arrow cookbook. I think it's a bit tricky, but doable. And I think it's better than having to maintain two different cookbooks/examples, or having them in only one place.
   
   Feedback on any of this is very welcome.



[GitHub] [arrow-rs] alamb commented on issue #1613: Improve arrow-rs examples

Posted by GitBox <gi...@apache.org>.
alamb commented on issue #1613:
URL: https://github.com/apache/arrow-rs/issues/1613#issuecomment-1148490560

   Thank you @datapythonista! That sounds like a great (and also quite ambitious) plan.
   
   I remember @elferherrera may have worked on something similar so perhaps he has a comment.
   
   Most array types have examples in the Rust docs (thanks to @novemberkilo), for example https://docs.rs/arrow/15.0.0/arrow/array/struct.DictionaryArray.html and https://docs.rs/arrow/15.0.0/arrow/array/type.Int64Array.html -- perhaps we could create a 'cookbook' that has the key examples and then links to the docs for more details -- there may be some way to keep the content in the `cookbook` and then `include` it in the Rust docs
   
   In general, it might be worth thinking about how these examples will be maintained (specifically how to make sure they keep working) -- rustdoc examples get automatically checked as part of CI and there is a way to run examples in markdown as well -- perhaps via https://crates.io/crates/doc-comment or something similar (example https://github.com/LaikaStudios/shotgrid-rs/pull/12/files)
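
To make the rustdoc point concrete, here is a minimal sketch of a documented function whose example is compiled and run as a doc test when `cargo test` is run on a library crate. The crate name `my_crate` and the function `max_value` are hypothetical and exist only for this illustration. Tilde fences are used inside the doc comment so the snippet nests cleanly in this message; backtick fences work the same way in rustdoc.

```rust
/// Returns the largest value in `values`, or `None` if the slice is empty.
///
/// # Examples
///
/// ~~~
/// // `my_crate` is a hypothetical crate name used only for this sketch.
/// use my_crate::max_value;
///
/// assert_eq!(max_value(&[3, 9, 4]), Some(9));
/// assert_eq!(max_value(&[]), None);
/// ~~~
pub fn max_value(values: &[i32]) -> Option<i32> {
    // `copied()` turns `&i32` items into `i32` so `max()` returns an owned value.
    values.iter().copied().max()
}
```

If the signature or behavior of `max_value` changes, the doc test stops compiling or its assertions fail, so the example cannot silently go stale.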
   
   
   > Creating arrays with null values. I'm unsure about this one, if the above are simple enough, probably we can have this in the above examples. But worth having this here for consideration for now
   
   I agree that it may be enough to start with arrays with 

