Posted to issues@arrow.apache.org by "danepitkin (via GitHub)" <gi...@apache.org> on 2023/04/03 20:18:04 UTC
[GitHub] [arrow] danepitkin opened a new issue, #34868: [Python] Sharing docstrings between classes
danepitkin opened a new issue, #34868:
URL: https://github.com/apache/arrow/issues/34868
### Describe the enhancement requested
PyArrow duplicates a lot of documentation in order to provide explicit docstring examples. Let's reduce the duplication of docstrings by providing a way to share docstrings between classes. See the way `pandas` did this as an example: https://pandas.pydata.org/docs/development/contributing_docstring.html#sharing-docstrings
The classes `Table` and `RecordBatch` are a good example of duplication in PyArrow. They both provide similar, sometimes identical, top-level implementations and docstrings, typically differing only in the low-level C++ implementation.
Here is an example of duplicative docstring descriptions.
`class RecordBatch:`
```
    @property
    def nbytes(self):
        """
        Total number of bytes consumed by the elements of the record batch.

        In other words, the sum of bytes from all buffer ranges referenced.

        Unlike `get_total_buffer_size` this method will account for array
        offsets.

        If buffers are shared between arrays then the shared
        portion will only be counted multiple times.

        The dictionary of dictionary arrays will always be counted in their
        entirety even if the array only references a portion of the dictionary.

        Examples
        --------
        >>> import pyarrow as pa
        >>> n_legs = pa.array([2, 2, 4, 4, 5, 100])
        >>> animals = pa.array(["Flamingo", "Parrot", "Dog", "Horse", "Brittle stars", "Centipede"])
        >>> batch = pa.RecordBatch.from_arrays([n_legs, animals],
        ...                                     names=["n_legs", "animals"])
        >>> batch.nbytes
        116
        """
        ...
```
`class Table:`
```
    @property
    def nbytes(self):
        """
        Total number of bytes consumed by the elements of the table.

        In other words, the sum of bytes from all buffer ranges referenced.

        Unlike `get_total_buffer_size` this method will account for array
        offsets.

        If buffers are shared between arrays then the shared
        portion will only be counted multiple times.

        The dictionary of dictionary arrays will always be counted in their
        entirety even if the array only references a portion of the dictionary.

        Examples
        --------
        >>> import pyarrow as pa
        >>> import pandas as pd
        >>> df = pd.DataFrame({'n_legs': [None, 4, 5, None],
        ...                    'animals': ["Flamingo", "Horse", None, "Centipede"]})
        >>> table = pa.Table.from_pandas(df)
        >>> table.nbytes
        72
        """
        ...
```
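One possible shape for the sharing mechanism, as a rough sketch only: the shared text lives in a single template and a small decorator fills in the class-specific parts. The names `share_doc`, `_NBYTES_DOC`, `RecordBatchLike` and `TableLike` are hypothetical stand-ins made up for illustration; they are not existing PyArrow or pandas APIs, and the placeholder bodies return `0` instead of computing anything.

```python
# Shared template: the parts that differ per class are {placeholders}.
_NBYTES_DOC = """
Total number of bytes consumed by the elements of the {kind}.

In other words, the sum of bytes from all buffer ranges referenced.

Unlike `get_total_buffer_size` this method will account for array
offsets.

Examples
--------
{examples}
"""


def share_doc(template, **params):
    """Hypothetical helper: fill `template` and attach it as a docstring."""
    def decorator(func):
        func.__doc__ = template.format(**params)
        return func
    return decorator


class RecordBatchLike:
    """Stand-in class, not the real pyarrow.RecordBatch."""

    @property
    @share_doc(_NBYTES_DOC, kind="record batch",
               examples=">>> batch.nbytes  # class-specific example here")
    def nbytes(self):
        return 0  # placeholder body


class TableLike:
    """Stand-in class, not the real pyarrow.Table."""

    @property
    @share_doc(_NBYTES_DOC, kind="table",
               examples=">>> table.nbytes  # class-specific example here")
    def nbytes(self):
        return 0  # placeholder body


# property() copies the getter's docstring, so the filled-in text is
# visible on the class attribute.
print(RecordBatchLike.nbytes.__doc__)
print(TableLike.nbytes.__doc__)
```

Because the template goes through `str.format`, any literal braces in the shared text would need to be doubled (`{{` and `}}`). Values passed in as parameters, such as `examples`, are inserted as-is, so a class-specific example containing a dict literal would not need escaping.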
### Component(s)
Python
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] danepitkin commented on issue #34868: [Python] Sharing docstrings between classes
Posted by "danepitkin (via GitHub)" <gi...@apache.org>.
danepitkin commented on issue #34868:
URL: https://github.com/apache/arrow/issues/34868#issuecomment-1496105553
This won't work for Cython until this issue is fixed: https://github.com/python/cpython/issues/91309
[GitHub] [arrow] AlenkaF closed issue #34868: [Python] Sharing docstrings between classes
Posted by "AlenkaF (via GitHub)" <gi...@apache.org>.
AlenkaF closed issue #34868: [Python] Sharing docstrings between classes
URL: https://github.com/apache/arrow/issues/34868