Posted to dev@arrow.apache.org by Antoine Pitrou <an...@python.org> on 2018/12/12 14:13:20 UTC

C++ documentation overhaul

Hello,

We are doing a refactor of the C++ documentation which will appear in
0.12.0.

Currently, the main entry point of the C++ documentation is
Doxygen-generated API documentation in the traditional format, together
with a couple of Markdown pages covering some example use cases.

The rewrite integrates the C++ API documentation into a larger Sphinx
documentation set that also holds the format specification and Python docs.
This allows us to add cross-references very easily and make the whole
documentation more cohesive.

To accompany this transformation, I have started writing some prose
documentation about fundamental concepts in the C++ API.  I have
uploaded a snapshot build of this work-in-progress here:
https://pitrou.net/arrowdevdoc/cpp/index.html

Comments and suggestions are welcome.

Regards

Antoine.

Re: C++ documentation overhaul

Posted by Wes McKinney <we...@gmail.com>.
hi Antoine,

Thank you for taking the lead on this initiative -- I think many will
agree it is long overdue =) Luckily many of our abstractions and APIs
have stabilized enough that it doesn't seem like many parts of the
basic documentation will have to be written and rewritten!

I started reading through this and it seems great so far (a typo or two,
but nothing major). I will leave comments in your PR today. I meant to
yesterday but got distracted with some other things.

- Wes

Re: C++ documentation overhaul

Posted by Antonio Cavallo <an...@gmail.com>.
Thanks Krisztian.
I'm taking my first baby steps in the project, and documenting them as I
go.


On Mon, 17 Dec 2018 at 12:30, Krisztián Szűcs <sz...@gmail.com>
wrote:

> You can also build the documentation via docker-compose, see:
> https://github.com/apache/arrow/blob/master/docker-compose.yml#L206
>
> You can inspect the required steps from the Dockerfile itself:
> https://github.com/apache/arrow/blob/master/docs/Dockerfile

Re: C++ documentation overhaul

Posted by Krisztián Szűcs <sz...@gmail.com>.
You can also build the documentation via docker-compose, see:
https://github.com/apache/arrow/blob/master/docker-compose.yml#L206

You can inspect the required steps from the Dockerfile itself:
https://github.com/apache/arrow/blob/master/docs/Dockerfile

On Mon, Dec 17, 2018 at 10:50 AM Antoine Pitrou <an...@python.org> wrote:

>
> Hi Antonio,
>
> It seems like we lack documentation on how to build the documentation ;-)
>
> Currently, this is what you need to do:
>
> 1) Compile and install PyArrow
> 2) Run "doxygen" in the "cpp/apidoc/" directory - this will generate the
> XML files for the C++ API documentation
> 3) Run "make html" in the "docs/" directory - this will generate the
> documentation using Sphinx
>
> I guess that step 2) is what you're missing right now?
>
> As for the Parquet file: does it prevent you from building the Parquet
> documentation?
>
> Regards
>
> Antoine.

Re: C++ documentation overhaul

Posted by "Uwe L. Korn" <uw...@xhochy.com>.
I also see this problem. It is caused by the underlying filesystem on
macOS being case-insensitive: we have two generated files,
pyarrow.array.rst and pyarrow.Array.rst, and only one of them can exist
at a time. For me the latter is the one that reliably wins. The fix is
to make your filesystem case-sensitive (this is possible but takes a
while).
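
A minimal sketch of the collision described above, assuming a POSIX shell.
The function name check_case_collision is hypothetical and used only for
illustration. It writes the two file names into a throwaway temporary
directory and reports whether both survive:

```shell
# Illustration only: checks whether two names that differ only in case
# can coexist in one directory. Works in a throwaway temp directory.
check_case_collision() {
    dir=$(mktemp -d) || return 1
    echo "function stub" > "$dir/pyarrow.array.rst"
    # On a case-insensitive filesystem this write reuses the same file.
    echo "class stub" > "$dir/pyarrow.Array.rst"
    # Count the directory entries that actually exist afterwards.
    count=$(ls "$dir" | wc -l | tr -d ' ')
    rm -rf "$dir"
    if [ "$count" -eq 2 ]; then
        echo "case-sensitive"
    else
        echo "case-insensitive"
    fi
}

check_case_collision
```

If this prints "case-insensitive", the two generated pages cannot both
exist in the same directory on that volume.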

Uwe 

> On 18.12.2018 at 12:23, Antoine Pitrou <an...@python.org> wrote:
> 
> 
>> On 18/12/2018 at 02:44, Antonio Cavallo wrote:
>> Mmm, I have done steps #1, #2 and #3.
>> It looks like when I run "make html" I receive this:
>> 
>> /Users/antonio/Projects/cav71.arrow/arrow/docs/source/python/api.rst:30:toctree
>> references unknown document 'python/generated/pyarrow.field'
> 
> I don't get this error here.  Perhaps you can remove the
> "source/python/generated" directory and try again?
> 
> Regards
> 
> Antoine.


Re: C++ documentation overhaul

Posted by Antoine Pitrou <an...@python.org>.
On 18/12/2018 at 02:44, Antonio Cavallo wrote:
> Mmm, I have done steps #1, #2 and #3.
> It looks like when I run "make html" I receive this:
> 
> /Users/antonio/Projects/cav71.arrow/arrow/docs/source/python/api.rst:30:toctree
> references unknown document 'python/generated/pyarrow.field'

I don't get this error here.  Perhaps you can remove the
"source/python/generated" directory and try again?
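
As a rough sketch of that suggestion (the function name clean_generated and
its argument are hypothetical, introduced only for illustration), the
directory can be removed before rebuilding, on the assumption that the next
"make html" run recreates it:

```shell
# Sketch only: remove the generated Python API pages under a given
# "docs/" directory so that a later build starts from a clean state.
clean_generated() {
    docs_dir="${1:-docs}"   # path to the docs/ directory of the checkout
    rm -rf "$docs_dir/source/python/generated"
}

# Example use (paths are placeholders):
#   clean_generated path/to/arrow/docs
#   (cd path/to/arrow/docs && make html)
```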

Regards

Antoine.

Re: C++ documentation overhaul

Posted by Antonio Cavallo <an...@gmail.com>.
Mmm, I have done steps #1, #2 and #3.
It looks like when I run "make html" I receive this:

/Users/antonio/Projects/cav71.arrow/arrow/docs/source/python/api.rst:30:toctree
references unknown document 'python/generated/pyarrow.field'

I can see API definitions there, though:
find arrow/docs -name generated -type d
./source/python/generated
./_build/html/python/generated
./_build/html/_sources/python/generated
./_build/doctrees/python/generated

PS. just created a pull request to begin that
https://github.com/apache/arrow/pull/3198


On Mon, 17 Dec 2018 at 10:50, Antoine Pitrou <an...@python.org> wrote:

>
> Hi Antonio,
>
> It seems like we lack documentation on how to build the documentation ;-)
>
> Currently, this is what you need to do:
>
> 1) Compile and install PyArrow
> 2) Run "doxygen" in the "cpp/apidoc/" directory - this will generate the
> XML files for the C++ API documentation
> 3) Run "make html" in the "docs/" directory - this will generate the
> documentation using Sphinx
>
> I guess that step 2) is what you're missing right now?
>
> As for the Parquet file: does it prevent you from building the Parquet
> documentation?
>
> Regards
>
> Antoine.

Re: C++ documentation overhaul

Posted by Antoine Pitrou <an...@python.org>.
Hi Antonio,

It seems like we lack documentation on how to build the documentation ;-)

Currently, this is what you need to do:

1) Compile and install PyArrow
2) Run "doxygen" in the "cpp/apidoc/" directory - this will generate the
XML files for the C++ API documentation
3) Run "make html" in the "docs/" directory - this will generate the
documentation using Sphinx
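
Steps 2) and 3) above could be scripted roughly as follows. This is only a
sketch: the function name build_docs, its argument, and the DRY_RUN switch
are hypothetical, and step 1) (compiling and installing PyArrow) is assumed
to have been done already.

```shell
# Sketch of steps 2) and 3). Step 1) is assumed to be done beforehand.
build_docs() {
    arrow_root="${1:-.}"   # path to the Arrow source checkout

    # With DRY_RUN=1 the commands are printed instead of executed.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        run="echo"
    else
        run=""
    fi

    # Step 2: generate the XML files for the C++ API documentation.
    (cd "$arrow_root/cpp/apidoc" && $run doxygen) || return 1

    # Step 3: generate the documentation using Sphinx.
    (cd "$arrow_root/docs" && $run make html) || return 1
}

# Example use (path is a placeholder):
#   build_docs path/to/arrow
```

Running the steps in this order matters because the Sphinx build reads the
XML files that Doxygen produces.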

I guess that step 2) is what you're missing right now?

As for the Parquet file: does it prevent you from building the Parquet
documentation?

Regards

Antoine.



On 17/12/2018 at 00:15, Antonio Cavallo wrote:
> Hi Antoine,
> I've just got to some point in the documentation build (macOS using
> conda and Python 3.7) following the instructions in:
> arrow/docs/source/python/development.rst
> 
> So far so good, but I had a crash while reading the Parquet file (I've
> opened a JIRA with the details:
> https://issues.apache.org/jira/browse/ARROW-4050).
> 
> So I removed the Parquet documentation, but I'm still having issues with
> the arrow/docs/source/python/generated part: how do I create it?
> 
> Thanks

Re: C++ documentation overhaul

Posted by Antonio Cavallo <an...@gmail.com>.
Hi Antoine,
I've just got to some point in the documentation build (macOS using
conda and Python 3.7) following the instructions in:
arrow/docs/source/python/development.rst

So far so good, but I had a crash while reading the Parquet file (I've
opened a JIRA with the details:
https://issues.apache.org/jira/browse/ARROW-4050).

So I removed the Parquet documentation, but I'm still having issues with
the arrow/docs/source/python/generated part: how do I create it?

Thanks



On Fri, 14 Dec 2018 at 16:20, Antoine Pitrou <an...@python.org> wrote:

>
> Hi Antonio,
>
> Everything is done in the main Arrow repository in a regular fashion
> (e.g. you can open Pull Requests there).  Help on the documentation is
> welcome, as many aspects are missing currently.
>
> Feel free to ask any questions!
>
> Regards
>
> Antoine.

Re: C++ documentation overhaul

Posted by Antoine Pitrou <an...@python.org>.
Hi Antonio,

Everything is done in the main Arrow repository in a regular fashion
(e.g. you can open Pull Requests there).  Help on the documentation is
welcome, as many aspects are missing currently.

Feel free to ask any questions!

Regards

Antoine.


On 14/12/2018 at 16:09, Antonio Cavallo wrote:
> Hi Antoine,
> I'm trying to learn about Arrow; would it be possible for me to help with
> the documentation?
> 
> Do you have a repository I can contribute to?
> Thanks

Re: C++ documentation overhaul

Posted by Antonio Cavallo <an...@gmail.com>.
Hi Antoine,
I'm trying to learn about Arrow; would it be possible for me to help with
the documentation?

Do you have a repository I can contribute to?
Thanks
