Posted to dev@arrow.apache.org by Andy Grove <an...@gmail.com> on 2020/10/13 15:11:53 UTC

[Rust] Blog post for 2.0.0

There has been a huge amount of activity in the Rust subproject for the
2.0.0 release and I think that we should write a Rust-specific blog post to
go on the Arrow blog.

I made a brief start at a Google doc, which is mostly just bullet points
listing some things we could talk about. I'm sure I've missed some things,
and maybe we have too many things to talk about so we might want to try and
summarize some of this.

Here is the doc ... I would appreciate any help anyone can provide with
this. Perhaps if each contributor could flesh out the content around things
they directly worked on or are knowledgeable about, that would be great.

https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing

Thanks,

Andy.

Re: [Rust] Blog post for 2.0.0

Posted by Andy Grove <an...@gmail.com>.
The Rust blog post is now live:
https://arrow.apache.org/blog/2020/10/27/rust-2.0.0-release/


Re: [Rust] Blog post for 2.0.0

Posted by Fernando Herrera <fe...@gmail.com>.
Thanks, Jorge, for helping me get across the need for a user guide. The
examples you used are exactly what I had in mind. It would be great if the
project had a user guide similar to tokio's. We could use this guide to
explain how to get started and to show some examples using the available
crates (Arrow, Arrow-Flight and DataFusion).

I think that, in order to start with the guide, I could take the approach
you suggested and use either doc_comment or rust-skeptic to write Markdown
files that sketch the user guide. This guide could be included in each of the
project folders, e.g. arrow, flight and datafusion. However, I'm a bit
inclined to have a single user guide (placed in the rust folder) that
references all the elements included in the rust arrow folder. This way the
guide could be planned so that a new user starts by learning about Arrow
arrays and finishes by running queries on data loaded from CSV or Parquet
files.

Is this something that could interest you?

Regards,
Fernando

On Fri, Oct 16, 2020 at 10:32 PM Jorge Cardoso Leitão <
jorgecarleitao@gmail.com> wrote:

> Hi,
>
> I would like to thank Fernando for raising this concern here: I also think
> that we still do not put enough effort in the documentation :) I admit that
> when I started in the project, I also had that need and just had some time
> to go through the code.
>
> First, I find it useful to distinguish types of documentation:
>
> 1. API references
> 2. user guide / tutorial
>
> The distinction between the two being that the former covers detailed usage
> of a given function/struct/trait etc, while the latter covers usage of the
> library as a whole (e.g. how a function is used in combination with
> others). Virtually every
> <https://docs.djangoproject.com/en/3.1/intro/tutorial01/> mature
> <https://pandas.pydata.org/docs/user_guide/index.html> library
> <https://spark.apache.org/docs/latest/quick-start.html> or
> <https://www.tensorflow.org/tutorials/quickstart/beginner> framework
> <https://docs.python.org/3/tutorial/> has both, as they serve different but
> well defined use-cases.
>
> The canonical way of documenting an API in rust is via the `docs.rs`, that
> generates a format common for rust projects and has a *significant* benefit
> for both writers and readers (for Python users, auto-docs on steroids:
> auto-links to classes declarations, references, testing the examples are
> part of running the tests, the documentation is written next to the actual
> source code, reference to the source code on a single click). Rust users
> expect the API documentation to be in `docs.rs` and released as part of
> crate. I agree with @Andy that we should stick to docs.rs for this. While
> there is always room for improvement, we do have the basics in place.
>
> I think that Fernando is alluding to the fact that we do not have a user
> guide / tutorial, and I agree: we are missing one such as tokio
> <https://tokio.rs/tokio/tutorial>'s, SIMD
> <https://rust-lang.github.io/packed_simd/perf-guide/introduction.html>'s,
> Rocket <https://rocket.rs/v0.4/guide/>'s or rust's book
> <https://doc.rust-lang.org/book/>, that covers how to use the
> library/framework.
>
> The main challenge is to ensure that the guide does not get deprecated.
> Looking at what other rust libs are doing, Serde, Tokio and Rocket write
> their guides in markdown and test the code on their guides (here: tokio
> <https://github.com/tokio-rs/website/tree/master/doc-test>, Rocket
> <https://github.com/SergioBenitez/Rocket/tree/v0.4/site/tests>). Rocket uses
> its own codegen to test the docs, tokio uses doc_comment
> <https://docs.rs/doc-comment/0.3.3/doc_comment/>
>
> > The point of this (small) crate is to allow you to add doc comments from
> > macros or to test external markdown files' code blocks through rustdoc.
>
> and serde uses rust-skeptic <https://github.com/budziq/rust-skeptic>.
>
> Thus, one idea is to write the guide in markdown in each (arrow and
> datafusion) crate, run the examples there as part of the testing with
> doc_comment <https://docs.rs/doc-comment/0.3.3/doc_comment/> or rust-skeptic
> <https://github.com/budziq/rust-skeptic>, and include these in arrow's
> official documentation on build (we would need to depend on a third-party
> Sphinx extension <https://www.sphinx-doc.org/en/master/usage/markdown.html>
> for this).
>
> This way, we keep the examples up-to-date, and the style and location close
> to other implementations' documentation.
>
> Would this be an option?
>
> Best,
> Jorge
>
>
>
> On Sat, Oct 17, 2020 at 12:48 AM Fernando Herrera <
> fernando.j.herrera@gmail.com> wrote:
>
> > I understand the concern, especially with the project changing that
> > quickly. However, I haven't found good material that I can use to learn
> > how to use the crate. I know that each module has a lot of tests (which I'm
> > thankful for) but going from one test case to the other doesn't work well
> > as learning material. It is a bit hard to find a starting point within the
> > project, especially if it's your first time seeing the code. Should one
> > start with datatypes.rs or with builder.rs?
> >
> > Also, I think it would help a lot to have a more relaxed approach (like
> > "Learn Rust With Entirely Too Many Linked Lists") rather than a reference
> > approach (like the RTF). I see the RTF as something you use to find
> > references regarding the code, rather than learning material I would use
> > to grasp what can be done with the crate. That's why I was suggesting a
> > book format, like the one that is used for Ballista. If you want
> > reference material you can always have a look at the documentation created
> > within the crate.
> >
> > What do you think?
> >
> > @Andy Grove... is it possible to take part in your upcoming presentation?
> >
> >
> > On Fri, Oct 16, 2020 at 5:23 PM Micah Kornfield <em...@gmail.com>
> > wrote:
> >
> > > >
> > > > We should be careful with the balance of content between the Restructured
> > > > Text Format documentation and the documentation in the crate that gets
> > > > published to docs.rs though. The rustdoc documentation is unit-tested to
> > > > ensure that it is always up to date and we will have to manually update the
> > > > RTF documentation for each release, and the project is still evolving
> > > > rather quickly.
> > >
> > >
> > > If rust offers this out of the box then that definitely seems preferable.
> > > At some point it would be nice to enable doctest [1] for all of our
> > > snippets in the main repo.
> > >
> > > [1] https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html
> > >
> > > On Fri, Oct 16, 2020 at 3:17 PM Andy Grove <an...@gmail.com> wrote:
> > >
> > > > I think that it would be great to produce this kind of content. I'm giving
> > > > a presentation on Arrow to my local Rust meetup (virtually) next week and
> > > > these are similar to the topics I will be covering there.
> > > >
> > > > We should be careful with the balance of content between the Restructured
> > > > Text Format documentation and the documentation in the crate that gets
> > > > published to docs.rs though. The rustdoc documentation is unit-tested to
> > > > ensure that it is always up to date and we will have to manually update the
> > > > RTF documentation for each release, and the project is still evolving
> > > > rather quickly.
> > > >
> > > > If the sample code included in RTF also exists as examples in the repo that
> > > > get tested then we can just copy and paste the contents over each time we
> > > > release perhaps.
> > > >
> > > > Andy.
> > > >
> > > >
> > > >
> > > > On Fri, Oct 16, 2020 at 3:59 PM Micah Kornfield <emkornfield@gmail.com>
> > > > wrote:
> > > >
> > > > > Java and C++ have tutorials in Restructured Text Format in the docs folder
> > > > > [1].  I think creating something similar for Rust might be the best place
> > > > > to start.  These are rendered on the website.  For example Java is located
> > > > > at [2].
> > > > >
> > > > >
> > > > > [1] https://github.com/apache/arrow/tree/master/docs/source
> > > > > [2] https://arrow.apache.org/docs/java/index.html
> > > > >
> > > > > On Fri, Oct 16, 2020 at 2:48 PM Fernando Herrera <
> > > > > fernando.j.herrera@gmail.com> wrote:
> > > > >
> > > > > > > I was working on the blog post I mentioned before regarding Arrow usage
> > > > > > > (Rust) and how to use the different elements available in the crate.
> > > > > > > After some thought, these are the topics I want to include:
> > > > > > >
> > > > > > >    1. Array examples and what they look like
> > > > > > >       Basic arrays and nested arrays
> > > > > > >       The buffer structure and how data is stored
> > > > > > >       Builder usage
> > > > > > >       Examples of complex arrays and how to construct them (using
> > > > > > >       builders and from)
> > > > > > >    2. What is a record batch?
> > > > > > >       How to construct a record batch
> > > > > > >       How a RecordBatch is used with IPC
> > > > > > >    3. How to read files?
> > > > > > >       CSV files and Parquet files
> > > > > > >    4. How to share information
> > > > > > >       What is Arrow Flight?
> > > > > > >       How to set up a server with Rust
> > > > > > >       Examples
> > > > > > >    5. How to query information from arrays?
> > > > > > >       DataFusion examples
> > > > > > >
> > > > > > > However, as I was working on the examples
> > > > > > > <https://github.com/elferherrera/test_example/blob/master/src/main.rs>
> > > > > > > that I was planning to use (most of them came from the Arrow repository), I
> > > > > > > thought that the best format would be a book, something similar to the Rust
> > > > > > > book. I think this format will help us to fully explain how each
> > > > > > > constructor can be used in detail and how each of the data arrays can be
> > > > > > > used and manipulated.
> > > > > > >
> > > > > > > What do you think about it?
> > > > > > >
> > > > > > > I could start the book using the examples in the repository and the tests
> > > > > > > as a base. However, I cannot find a quick tutorial on setting up a
> > > > > > > book like that, let alone how to host it. I know it has to be made using
> > > > > > > .md files, but that's as far as I have got. Can somebody give me a pointer
> > > > > > > on setting up something like that?
> > > > > >
> > > > > > Regards
> > > > > >
> > > > > > > On Thu, Oct 15, 2020 at 3:18 PM Mark Farnan <mark@markfarnan.com> wrote:
> > > > > > >
> > > > > > > > I would agree with this.
> > > > > > > >
> > > > > > > > I’ve been working with the Go Arrow library the last few weeks, and it
> > > > > > > > took a while to get my head around it all / how to use it etc.
> > > > > > > > Even then I'm not sure I’ve got it right.
> > > > > > > >
> > > > > > > > Usage examples would be great.
> > > > > > > >
> > > > > > > > Regards
> > > > > > > >
> > > > > > > > Mark
> > > > > > > >
> > > > > > > > > On Oct 14, 2020, at 4:08 PM, Fernando Herrera <
> > > > > > > > > fernando.j.herrera@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > I was wondering if, besides this blog post, there should be another one with
> > > > > > > > > an example of usage. I think that is one of the key things missing for
> > > > > > > > > Arrow in general. This example should show the problems that Arrow is
> > > > > > > > > solving and how to implement the solution in real life.
> > > > > > > > >
> > > > > > > > > On Tue, Oct 13, 2020 at 10:12 AM Andy Grove <andygrove73@gmail.com>
> > > > > > > > > wrote:
> > > > > > > >
> > > > > > > > >> There has been a huge amount of activity in the Rust subproject for the
> > > > > > > > >> 2.0.0 release and I think that we should write a Rust-specific blog post to
> > > > > > > > >> go on the Arrow blog.
> > > > > > > > >>
> > > > > > > > >> I made a brief start at a Google doc, which is mostly just bullet points
> > > > > > > > >> listing some things we could talk about. I'm sure I've missed some things,
> > > > > > > > >> and maybe we have too many things to talk about so we might want to try and
> > > > > > > > >> summarize some of this.
> > > > > > > > >>
> > > > > > > > >> Here is the doc ... I would appreciate any help anyone can provide with
> > > > > > > > >> this. Perhaps if each contributor could flesh out the content around things
> > > > > > > > >> they directly worked on or are knowledgeable about, that would be great.
> > > > > > > > >>
> > > > > > > > >> https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >>
> > > > > > > > >> Andy.

Re: [Rust] Blog post for 2.0.0

Posted by Jorge Cardoso Leitão <jo...@gmail.com>.
Hi,

I would like to thank Fernando for raising this concern here: I also think
that we still do not put enough effort in the documentation :) I admit that
when I started in the project, I also had that need and just had some time
to go through the code.

First, I find it useful to distinguish types of documentation:

1. API references
2. user guide / tutorial

The distinction between the two is that the former covers detailed usage
of a given function/struct/trait etc., while the latter covers usage of the
library as a whole (e.g. how a function is used in combination with
others). Virtually every
<https://docs.djangoproject.com/en/3.1/intro/tutorial01/> mature
<https://pandas.pydata.org/docs/user_guide/index.html> library
<https://spark.apache.org/docs/latest/quick-start.html> or
<https://www.tensorflow.org/tutorials/quickstart/beginner> framework
<https://docs.python.org/3/tutorial/> has both, as they serve different but
well-defined use cases.

The canonical way of documenting an API in Rust is via `docs.rs`, which
hosts rustdoc output in a format common to Rust projects and has a
*significant* benefit for both writers and readers (for Python users, think
auto-docs on steroids: automatic links to type declarations and references,
examples that are tested as part of running the tests, documentation written
next to the actual source code, and the source code a single click away).
Rust users expect the API documentation to be on `docs.rs` and released as
part of the crate. I agree with @Andy that we should stick to docs.rs for
this. While there is always room for improvement, we do have the basics in
place.
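
To make the point about tested examples concrete, here is a minimal sketch of
a rustdoc example. The function `sum_non_null` and the crate name `my_crate`
are hypothetical and are not part of Arrow; the only thing assumed about
tooling is that `cargo test` compiles and runs fenced examples found in doc
comments of a library crate.

```rust
/// Sums the non-null values of a nullable column.
///
/// The example below is compiled and run by `cargo test`, so the test
/// suite fails if the documented behavior stops matching the code.
///
/// ```
/// // `my_crate` is a hypothetical crate name used for illustration.
/// use my_crate::sum_non_null;
///
/// assert_eq!(sum_non_null(&[Some(1), None, Some(2)]), 3);
/// ```
pub fn sum_non_null(values: &[Option<i32>]) -> i32 {
    // `flatten` skips the `None` entries and yields the inner values.
    values.iter().flatten().sum()
}
```

The example lives next to the function it documents, which is why this kind
of documentation tends to stay up to date as the code evolves.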

I think that Fernando is alluding to the fact that we do not have a user
guide / tutorial, and I agree: we are missing one such as tokio
<https://tokio.rs/tokio/tutorial>'s, SIMD
<https://rust-lang.github.io/packed_simd/perf-guide/introduction.html>'s,
Rocket <https://rocket.rs/v0.4/guide/>'s or Rust's book
<https://doc.rust-lang.org/book/>, which covers how to use the
library/framework.

The main challenge is to ensure that the guide does not go out of date.
Looking at what other Rust libraries are doing, Serde, Tokio and Rocket write
their guides in Markdown and test the code in their guides (here: tokio
<https://github.com/tokio-rs/website/tree/master/doc-test>, Rocket
<https://github.com/SergioBenitez/Rocket/tree/v0.4/site/tests>). Rocket uses
its own codegen to test the docs, tokio uses doc_comment
<https://docs.rs/doc-comment/0.3.3/doc_comment/>:

> The point of this (small) crate is to allow you to add doc comments from
> macros or to test external markdown files' code blocks through rustdoc.

and serde uses rust-skeptic <https://github.com/budziq/rust-skeptic>.

Thus, one idea is to write the guide in Markdown in each crate (arrow and
datafusion), run the examples there as part of the testing with
doc_comment <https://docs.rs/doc-comment/0.3.3/doc_comment/> or rust-skeptic
<https://github.com/budziq/rust-skeptic>, and include these in Arrow's
official documentation at build time (we would need to depend on a
third-party Sphinx extension
<https://www.sphinx-doc.org/en/master/usage/markdown.html> for this).

This way, we keep the examples up to date, and the style and location close
to the other implementations' documentation.

Would this be an option?

Best,
Jorge



On Sat, Oct 17, 2020 at 12:48 AM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> I understand the concern, especially with the project changing that
> quickly. However, I haven't found good material that I can use to learn
> how to use the crate. I know that each module has a lot of tests (which I'm
> thankful for) but going from one test case to the other doesn't work well
> as learning material. It is a bit hard to find a starting point within the
> project, especially if it's your first time seeing the code. Should one
> start with datatypes.rs or with builder.rs?
>
> Also, I think it would help a lot to have a more relaxed approach (like
> "Learn Rust With Entirely Too Many Linked Lists") rather than a reference
> approach (like the RTF). I see the RTF as something you use to find
> references regarding the code, rather than learning material I would use
> to grasp what can be done with the crate. That's why I was suggesting a
> book format, like the one that is used for Ballista. If you want
> reference material you can always have a look at the documentation created
> within the crate.
>
> What do you think?
>
> @Andy Grove... is it possible to take part in your upcoming presentation?

Re: [Rust] Blog post for 2.0.0

Posted by Fernando Herrera <fe...@gmail.com>.
I understand the concern, especially with the project changing so
quickly. However, I haven't found good material that I can use to learn
how to use the crate. I know that each module has a lot of tests (which I'm
thankful for) but going from one test case to the next doesn't work well
as learning material. It is a bit hard to find a starting point within the
project, especially if it's your first time seeing the code. Should one
start with datatypes.rs or with builder.rs?

Also, I think it would help a lot to have a more relaxed approach (like
"Learn Rust With Entirely Too Many Linked Lists") rather than a reference
approach (like the reStructuredText (RST) docs). I see the RST docs as
something you use to look up details of the code, rather than learning
material I would use to grasp what can be done with the crate. That's why
I was suggesting a book format, like the one that is used for Ballista. If
you want reference material, you can always look at the documentation in
the crate itself.

What do you think?

@Andy Grove... is it possible to take part in your upcoming presentation?


On Fri, Oct 16, 2020 at 5:23 PM Micah Kornfield <em...@gmail.com>
wrote:

> >
> > We should be careful with the balance of content between the
> > reStructuredText (RST) documentation and the documentation in the crate
> > that gets published to docs.rs though. The rustdoc documentation is
> > unit-tested to ensure that it is always up to date and we will have to
> > manually update the RST documentation for each release, and the project
> > is still evolving rather quickly.
>
>
> If Rust offers this out of the box then that definitely seems preferable.
> At some point it would be nice to enable doctest [1] for all of our
> snippets in the main repo.
>
> [1] https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html

Re: [Rust] Blog post for 2.0.0

Posted by Micah Kornfield <em...@gmail.com>.
>
> We should be careful with the balance of content between the
> reStructuredText (RST) documentation and the documentation in the crate
> that gets published to docs.rs though. The rustdoc documentation is
> unit-tested to ensure that it is always up to date and we will have to
> manually update the RST documentation for each release, and the project
> is still evolving rather quickly.


If Rust offers this out of the box then that definitely seems preferable.
At some point it would be nice to enable doctest [1] for all of our
snippets in the main repo.

[1] https://www.sphinx-doc.org/en/master/usage/extensions/doctest.html

On Fri, Oct 16, 2020 at 3:17 PM Andy Grove <an...@gmail.com> wrote:

> I think that it would be great to produce this kind of content. I'm giving
> a presentation on Arrow to my local Rust meetup (virtually) next week and
> these are similar to the topics I will be covering there.
>
> We should be careful with the balance of content between the
> reStructuredText (RST) documentation and the documentation in the crate
> that gets published to docs.rs though. The rustdoc documentation is
> unit-tested to ensure that it is always up to date and we will have to
> manually update the RST documentation for each release, and the project
> is still evolving rather quickly.
>
> If the sample code included in the RST docs also exists as tested examples
> in the repo, then perhaps we can just copy and paste the contents over each
> time we release.
>
> Andy.

Re: [Rust] Blog post for 2.0.0

Posted by Andy Grove <an...@gmail.com>.
I think that it would be great to produce this kind of content. I'm giving
a presentation on Arrow to my local Rust meetup (virtually) next week and
these are similar to the topics I will be covering there.

We should be careful with the balance of content between the
reStructuredText (RST) documentation and the documentation in the crate
that gets published to docs.rs though. The rustdoc documentation is
unit-tested to ensure that it is always up to date and we will have to
manually update the RST documentation for each release, and the project
is still evolving rather quickly.

If the sample code included in the RST docs also exists as tested examples
in the repo, then perhaps we can just copy and paste the contents over each
time we release.
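
As a reminder of what "unit-tested" means here: code examples in a doc
comment are compiled and run by cargo test, so an example that drifts from
the API fails the build. A minimal sketch follows. The function count_nulls
and the crate name my_crate are hypothetical and are not part of the arrow
crate. The example is written as an indented code block, which rustdoc
tests in the same way as a fenced one.

```rust
/// Counts the missing entries in a column of optional values.
///
/// # Examples
///
/// The indented lines below are compiled and executed by `cargo test`
/// (`my_crate` is a placeholder for the real crate name):
///
///     let column = [Some(1), None, Some(3)];
///     assert_eq!(my_crate::count_nulls(&column), 1);
///
pub fn count_nulls<T>(column: &[Option<T>]) -> usize {
    // A slot is "null" when it holds no value.
    column.iter().filter(|value| value.is_none()).count()
}
```

If count_nulls were renamed or its signature changed, the example in the
comment would stop compiling and the test run would fail, which is the
property the RST pages do not give us for free.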

Andy.



On Fri, Oct 16, 2020 at 3:59 PM Micah Kornfield <em...@gmail.com>
wrote:

> Java and C++ have tutorials in reStructuredText format in the docs folder
> [1].  I think creating something similar for Rust might be the best place
> to start.  These are rendered on the website.  For example Java is located
> at [2].
>
>
> [1] https://github.com/apache/arrow/tree/master/docs/source
> [2] https://arrow.apache.org/docs/java/index.html

Re: [Rust] Blog post for 2.0.0

Posted by Micah Kornfield <em...@gmail.com>.
Java and C++ have tutorials in reStructuredText format in the docs folder
[1].  I think creating something similar for Rust might be the best place
to start.  These are rendered on the website.  For example Java is located
at [2].


[1] https://github.com/apache/arrow/tree/master/docs/source
[2] https://arrow.apache.org/docs/java/index.html

On Fri, Oct 16, 2020 at 2:48 PM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> I was working on the blog post I mentioned before regarding Arrow usage
> (Rust) and how to use the different elements available in the crate. After
> some thought, these are the topics I want to include:
>
>    1. Array examples and what they look like
>       - Basic arrays and nested arrays
>       - The buffer structure and how data is stored
>       - Builder usage
>       - Examples of complex arrays and how to construct them (using
>         builders and from)
>    2. What is a record batch?
>       - How to construct a record batch
>       - How a RecordBatch is used with IPC
>    3. How to read files?
>       - CSV files and Parquet files
>    4. How to share information
>       - What is Arrow Flight?
>       - How to set up a server with Rust
>       - Examples
>    5. How to query information from arrays?
>       - DataFusion examples
>
> However, as I was working on the examples
> <https://github.com/elferherrera/test_example/blob/master/src/main.rs>
> that I was planning to use (most of them came from the Arrow repository), I
> thought that the best format would be a book, something similar to the Rust
> book. I think this format will help us to fully explain how each
> constructor can be used in detail and how each of the data arrays can be
> used and manipulated.
>
> What do you think about it?
>
> I could start the book using the examples and the existing tests in the
> repository as a base. However, I cannot find a quick tutorial on setting up
> a book like that, let alone how to host it. I know it has to be made using
> .md files, but that's as far as I have got. Can somebody give me a pointer
> on setting up something like that?
>
> Regards

Re: [Rust] Blog post for 2.0.0

Posted by Fernando Herrera <fe...@gmail.com>.
I was working on the blog post I mentioned before regarding Arrow usage
(Rust) and how to use the different elements available in the crate. After
some thought, these are the topics I want to include:

   1. Array examples and what they look like
      - Basic arrays and nested arrays
      - The buffer structure and how data is stored
      - Builder usage
      - Examples of complex arrays and how to construct them (using
        builders and from)
   2. What is a record batch?
      - How to construct a record batch
      - How a RecordBatch is used with IPC
   3. How to read files?
      - CSV files and Parquet files
   4. How to share information
      - What is Arrow Flight?
      - How to set up a server with Rust
      - Examples
   5. How to query information from arrays?
      - DataFusion examples
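
To give an idea of what the "buffer structure" and "builders" topics would
cover, here is a minimal sketch in plain Rust of how a nullable integer
column can be stored: one contiguous values buffer plus a validity bitmap
in which bit i is set when slot i holds a value. The type names
Int32Column and Int32ColumnBuilder are hypothetical and are deliberately
not the arrow crate's API; the only thing taken from the Arrow format is
the idea of a values buffer paired with a least-significant-bit-ordered
validity bitmap.

```rust
/// Hypothetical, simplified stand-in for a nullable 32-bit integer array.
pub struct Int32Column {
    values: Vec<i32>,  // one slot per element, including null slots
    validity: Vec<u8>, // bit i set => slot i is non-null
    len: usize,
}

/// Hypothetical builder that appends one optional value at a time.
#[derive(Default)]
pub struct Int32ColumnBuilder {
    values: Vec<i32>,
    validity: Vec<u8>,
    len: usize,
}

impl Int32ColumnBuilder {
    pub fn append(&mut self, value: Option<i32>) {
        // Start a new bitmap byte every 8 slots.
        if self.len % 8 == 0 {
            self.validity.push(0);
        }
        if value.is_some() {
            // Least-significant-bit ordering within each byte.
            self.validity[self.len / 8] |= 1u8 << (self.len % 8);
        }
        // Null slots still occupy space in the values buffer; 0 is a filler.
        self.values.push(value.unwrap_or(0));
        self.len += 1;
    }

    pub fn finish(self) -> Int32Column {
        Int32Column { values: self.values, validity: self.validity, len: self.len }
    }
}

impl Int32Column {
    pub fn len(&self) -> usize {
        self.len
    }

    /// Returns the value at `i`, or `None` if the slot is null or out of range.
    pub fn get(&self, i: usize) -> Option<i32> {
        let valid = i < self.len && (self.validity[i / 8] & (1u8 << (i % 8))) != 0;
        if valid { Some(self.values[i]) } else { None }
    }
}
```

Appending Some(1), None, Some(3) leaves three slots in the values buffer
and a single bitmap byte with bits 0 and 2 set. The book could walk through
the real array and builder types in the same way.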

However, as I was working on the examples
<https://github.com/elferherrera/test_example/blob/master/src/main.rs> that
I was planning to use (most of them came from the Arrow repository), I
thought that the best format would be a book, something similar to the Rust
book. I think this format will help us to fully explain how each
constructor can be used in detail and how each of the data arrays can be
used and manipulated.

What do you think about it?

I could start the book using the examples and the existing tests in the
repository as a base. However, I cannot find a quick tutorial on setting up
a book like that, let alone how to host it. I know it has to be made using
.md files, but that's as far as I have got. Can somebody give me a pointer
on setting up something like that?

Regards

On Thu, Oct 15, 2020 at 3:18 PM Mark Farnan <ma...@markfarnan.com> wrote:

> I would agree with this.
>
> I’ve been working with the Go Arrow library for the last few weeks, and it
> took a while to get my head around it all and how to use it.
> Even then, I’m not sure I’ve got it right.
>
> Usage examples would be great.
>
> Regards
>
> Mark
>
> > On Oct 14, 2020, at 4:08 PM, Fernando Herrera <
> fernando.j.herrera@gmail.com> wrote:
> >
> > I was wondering if, besides this blog post, there should be another one with
> > an example of usage. I think that is one of the key things missing for
> > Arrow in general. This example should show the problems that Arrow is
> > solving and how to implement the solution in real life.
> >
> > On Tue, Oct 13, 2020 at 10:12 AM Andy Grove <an...@gmail.com>
> wrote:
> >
> >> There has been a huge amount of activity in the Rust subproject for the
> >> 2.0.0 release and I think that we should write a Rust-specific blog
> post to
> >> go on the Arrow blog.
> >>
> >> I made a brief start at a Google doc, which is mostly just bullet points
> >> listing some things we could talk about. I'm sure I've missed some
> things,
> >> and maybe we have too many things to talk about so we might want to try
> and
> >> summarize some of this.
> >>
> >> Here is the doc ... I would appreciate any help anyone can provide with
> >> this. Perhaps if each contributor could flesh out the content around
> things
> >> they directly worked on or are knowledgeable about, that would be great.
> >>
> >>
> >>
> https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing
> >>
> >> Thanks,
> >>
> >> Andy.
> >>
>
>

Re: [Rust] Blog post for 2.0.0

Posted by Mark Farnan <ma...@markfarnan.com>.
I would agree with this. 

I’ve been working with the Go Arrow library for the last few weeks, and it took a while to get my head around it all and how to use it.
Even then, I’m not sure I’ve got it right.

Usage examples would be great.

Regards

Mark

> On Oct 14, 2020, at 4:08 PM, Fernando Herrera <fe...@gmail.com> wrote:
> 
> I was wondering if, besides this blog post, there should be another one with
> an example of usage. I think that is one of the key things missing for
> Arrow in general. This example should show the problems that Arrow is
> solving and how to implement the solution in real life.
> 
> On Tue, Oct 13, 2020 at 10:12 AM Andy Grove <an...@gmail.com> wrote:
> 
>> There has been a huge amount of activity in the Rust subproject for the
>> 2.0.0 release and I think that we should write a Rust-specific blog post to
>> go on the Arrow blog.
>> 
>> I made a brief start at a Google doc, which is mostly just bullet points
>> listing some things we could talk about. I'm sure I've missed some things,
>> and maybe we have too many things to talk about so we might want to try and
>> summarize some of this.
>> 
>> Here is the doc ... I would appreciate any help anyone can provide with
>> this. Perhaps if each contributor could flesh out the content around things
>> they directly worked on or are knowledgeable about, that would be great.
>> 
>> 
>> https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing
>> 
>> Thanks,
>> 
>> Andy.
>> 


Re: [Rust] Blog post for 2.0.0

Posted by Fernando Herrera <fe...@gmail.com>.
I was wondering if, besides this blog post, there should be another one with
an example of usage. I think that is one of the key things missing for
Arrow in general. This example should show the problems that Arrow is
solving and how to implement the solution in real life.

On Tue, Oct 13, 2020 at 10:12 AM Andy Grove <an...@gmail.com> wrote:

> There has been a huge amount of activity in the Rust subproject for the
> 2.0.0 release and I think that we should write a Rust-specific blog post to
> go on the Arrow blog.
>
> I made a brief start at a Google doc, which is mostly just bullet points
> listing some things we could talk about. I'm sure I've missed some things,
> and maybe we have too many things to talk about so we might want to try and
> summarize some of this.
>
> Here is the doc ... I would appreciate any help anyone can provide with
> this. Perhaps if each contributor could flesh out the content around things
> they directly worked on or are knowledgeable about, that would be great.
>
>
> https://docs.google.com/document/d/1RY7oa7ldi4RnyFzk3_5NHiiQl7IcvZgXFq3FYr5iwFc/edit?usp=sharing
>
> Thanks,
>
> Andy.
>