Posted to dev@arrow.apache.org by Fernando Herrera <fe...@gmail.com> on 2021/01/31 17:25:21 UTC

[RUST] Arrow guide

Hi all,

During the past months I have been trying to read and understand the code
base for the Rust implementation of Arrow. At the beginning I was just
reading the code and figuring out what each part or module was used for.
Unfortunately this approach didn't work very well and I had to start from
scratch. The second time around, I also wrote descriptions of the things I
was studying and how to implement them. This approach led me to write up a
small Arrow guide.

At this point it is not complete and has several chapters missing, but that's
the point of this mail. I was wondering if someone who wants to work (or
is already working) on the Rust side would like to help me make the guide
better and richer.

The first sections can be found here:
https://elferherrera.github.io/arrow_guide/introduction.html

And the repo is here:
https://github.com/elferherrera/arrow_guide/

The guide is currently written with mdbook and uses the doc-comment crate
to check all the code. Also, the book pulls the Arrow crate directly from
git, so it always uses the most recent API.

I hope someone finds these writings useful. If you are willing to help me,
just let me know.

Thanks,
Fernando

Re: [RUST] Arrow guide

Posted by Fernando Herrera <fe...@gmail.com>.
Hi Andrew. Thanks as well for the kind words. Writing the guide has been an
interesting exercise in understanding all the code you all have written.
I know it is not complete, but at least it has given me an idea of the
topics I want to dig deeper into and how to use Arrow (I'm looking at you,
DataFusion). By the way, Arrow helped me solve an issue we were having at
the company processing trillions of combinations of strings. I'm thinking of
writing a simplified example of this in the guide as well.

Regarding the link, if you think the guide is good enough to be in the
README, please be my guest. This weekend I plan to add a pipeline so that
any push to master automatically updates the gh-pages branch and, with it,
the book.

By the way, if you want to add some ideas or simplified code examples but
don't have the time to write a detailed description of the code, just copy
the code to the examples folder and I can write up the explanation of the
code later.

Fernando



On Tue, Feb 2, 2021 at 11:28 AM Andrew Lamb <al...@influxdata.com> wrote:

> I also started reading this book and what I have read so far is quite
> impressive - thank you Fernando.
>
> While keeping the code in a separate repo for now makes sense, what do you
> think about including a link to your guide in the Rust Arrow crate's
> README.md?
>
> Andrew

Re: [RUST] Arrow guide

Posted by Andrew Lamb <al...@influxdata.com>.
I also started reading this book and what I have read so far is quite
impressive - thank you Fernando.

While keeping the code in a separate repo for now makes sense, what do you
think about including a link to your guide in the Rust Arrow crate's
README.md?

Andrew


On Mon, Feb 1, 2021 at 2:31 PM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> Thanks Jorge. It does mean a lot your comments, and please, do help me get
> it better.
>
> I was wondering as well to put it inside the arrow crate but at the
> beginning I think it is going to be changing a lot, so I think it would be
> a good idea to keep it in a separate repo so we can iterate on it as much
> as possible.
>
> What about creating a Rust Arrow group in github to keep the fast changing
> projects apart in different repos but with in the same group?
>
> Fernando,

Re: [RUST] Arrow guide

Posted by Fernando Herrera <fe...@gmail.com>.
Thanks Jorge. Your comments mean a lot, and please do help me make it
better.

I also considered putting it inside the arrow crate, but I think it is going
to change a lot at the beginning, so it would be a good idea to keep it in a
separate repo so we can iterate on it as much as possible.

What about creating a Rust Arrow group on GitHub to keep the fast-changing
projects apart in different repos but within the same group?

Fernando

On Mon, 1 Feb 2021, 17:28 Jorge Cardoso Leitão, <jo...@gmail.com>
wrote:

> I went through it, and I have to say that it is really well written and
> contains non-trivial knowledge about the arrow crate. Thank you very much
> for this, Fernando.
>
> In my opinion alone, the guide or a variation of it could be incorporated
> into the arrow repo and released together with the crate, as is standard in
> other rust projects. I for one would contribute and put time into enhancing
> and maintaining it as part of the rust implementation, review changes to it
> by other contributors, and keep it up to date.
>
> Best,
> Jorge

Re: [RUST] Arrow guide

Posted by Jorge Cardoso Leitão <jo...@gmail.com>.
I went through it, and I have to say that it is really well written and
contains non-trivial knowledge about the arrow crate. Thank you very much
for this, Fernando.

In my personal opinion, the guide or a variation of it could be incorporated
into the arrow repo and released together with the crate, as is standard in
other Rust projects. I for one would contribute and put time into enhancing
and maintaining it as part of the Rust implementation, review changes to it
by other contributors, and keep it up to date.

Best,
Jorge




On Sun, Jan 31, 2021 at 6:25 PM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> Hi all,
>
> During the past months I have been trying to read and understand the code
> base for the Rust implementation of Arrow. At the beginning I was just
> reading the code and figuring out what each part or module was used for.
> Unfortunately this approach didn't work very well and had to start from
> scratch. The next time while trying to understand it I was also writing
> descriptions of the things I was studying and how to implement them. This
> approach led me to writing up a small Arrow guide.
>
> At this point is not complete and has several chapters missing, but that's
> the point of this mail. I was wondering if someone that wants to work (or
> is already working) on the Rust side would like to help me make the guide
> better and richer.
>
> The first sections can be found here:
> https://elferherrera.github.io/arrow_guide/introduction.html
>
> And the repo is here:
> https://github.com/elferherrera/arrow_guide/
>
> The guide at the moment is written with mdbook and uses the doc-comment
> crate to check all the code. Also, the book is pulling the Arrow crate from
> git directly, so it is always reading the most recent api.
>
> I hope someone finds these writings useful and if you are willing to help
> me just let me know.
>
> Thanks,
> Fernando
>

Re: [RUST] Arrow guide

Posted by Wes McKinney <we...@gmail.com>.
To state the obvious, it would be great to have some community-maintained
documentation (beyond generated API docs) for the Rust library. Writing
documentation almost always causes the quality of a code base to improve
because the process brings up rough edges, inconsistencies, or missing
features.

On Sun, Jan 31, 2021 at 11:47 AM Benjamin Blodgett <
benjaminblodgett@gmail.com> wrote:

> This is great, thanks for this!

Re: [RUST] Arrow guide

Posted by Benjamin Blodgett <be...@gmail.com>.
This is great, thanks for this!

On Sun, Jan 31, 2021 at 9:25 AM Fernando Herrera <
fernando.j.herrera@gmail.com> wrote:

> Hi all,
>
> During the past months I have been trying to read and understand the code
> base for the Rust implementation of Arrow. At the beginning I was just
> reading the code and figuring out what each part or module was used for.
> Unfortunately this approach didn't work very well and had to start from
> scratch. The next time while trying to understand it I was also writing
> descriptions of the things I was studying and how to implement them. This
> approach led me to writing up a small Arrow guide.
>
> At this point is not complete and has several chapters missing, but that's
> the point of this mail. I was wondering if someone that wants to work (or
> is already working) on the Rust side would like to help me make the guide
> better and richer.
>
> The first sections can be found here:
> https://elferherrera.github.io/arrow_guide/introduction.html
>
> And the repo is here:
> https://github.com/elferherrera/arrow_guide/
>
> The guide at the moment is written with mdbook and uses the doc-comment
> crate to check all the code. Also, the book is pulling the Arrow crate from
> git directly, so it is always reading the most recent api.
>
> I hope someone finds these writings useful and if you are willing to help
> me just let me know.
>
> Thanks,
> Fernando
>