Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/03/24 11:48:38 UTC

[GitHub] [arrow] AlenkaF opened a new pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

AlenkaF opened a new pull request #12704:
URL: https://github.com/apache/arrow/pull/12704


   This PR adds docstring examples to:
   
   - /docs/python/generated/pyarrow.parquet.ParquetDataset.html
   - /docs/python/generated/pyarrow.parquet.ParquetFile.html
   - /docs/python/generated/pyarrow.parquet.ParquetWriter.html
   - /docs/python/generated/pyarrow.parquet.read_table.html
   - /docs/python/generated/pyarrow.parquet.write_table.html
   - /docs/python/generated/pyarrow.parquet.write_to_dataset.html


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] edponce commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
edponce commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1078731558


   @AlenkaF Thank you for working on this! Very helpful.





[GitHub] [arrow] AlenkaF edited a comment on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF edited a comment on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1080291719


   The test is failing because of the deprecation warning added to the `common_metadata` property of `ParquetDataset`:
   
   ```python
   In [1]: import pytest
   In [2]: import pyarrow.parquet as pq
   
   In [3]: with pytest.warns(None) as record:
       ...:     pq.read_table('v0.7.1.parquet',
       ...:                   use_legacy_dataset=True)
       ...: len(record)
   Out[3]: 2
   
   In [4]: record[0].message
   Out[4]: DeprecationWarning("Passing 'use_legacy_dataset=True' to get the legacy behaviour is deprecated as of pyarrow 8.0.0, and the legacy implementation will be removed in a future version.")
   
   In [5]: record[1].message
   Out[5]: DeprecationWarning("'ParquetDataset.common_metadata' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version.")
   ```
   I am not sure why the warning for the `common_metadata` property is recorded while the warnings for the other deprecated `ParquetDataset` properties are not.
   
   I will correct the test to include the new warning and will keep the change for `common_metadata`.





[GitHub] [arrow] AlenkaF commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1078864720


   I think `test_read_table_doesnt_warn` is failing because of my change to the `common_metadata` property, but I am not sure. I will need to dig deeper into this test to understand it better.





[GitHub] [arrow] AlenkaF commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1081504287


   One more TODO: I need to keep the output lines in the docstring examples short while still passing doctest. One way is to use [doctest.ELLIPSIS](https://docs.python.org/2/library/doctest.html#doctest.ELLIPSIS) or a `\` to break the line.
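
A minimal sketch of the `doctest.ELLIPSIS` option, assuming a hypothetical `describe_file` function that is not part of pyarrow. The expected output in the docstring is shortened with `...`, and the `+ELLIPSIS` directive tells doctest to accept the abbreviation. The `\` alternative is not shown here.

```python
import doctest


def describe_file(path):
    """Return a long, single-line description of a file.

    Hypothetical helper used only to illustrate the directive.
    The expected output is abbreviated with ``...``.

    >>> describe_file("example.parquet")  # doctest: +ELLIPSIS
    'File example.parquet ... end of description'
    """
    return (
        f"File {path} created by a hypothetical writer, with a long "
        "list of details that would not fit on one docstring line, "
        "end of description"
    )


# Collect the docstring example above and run it with doctest.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(
    describe_file, "describe_file", globs={"describe_file": describe_file}
):
    runner.run(test)

# TestResults(failed, attempted) for the examples that were run.
results = runner.summarize(verbose=False)
```

The directive applies only to the example it is attached to, so other examples in the same docstring still require an exact match.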





[GitHub] [arrow] AlenkaF commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1077543762


   I wasn't able to build the docs (Sphinx hangs currently) to check the latest change for `csv.write_to_dataset` and `common_metadata`. Will do that as soon as the build of the docs starts working. 





[GitHub] [arrow] AlenkaF commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1080635925


   This PR was checked with `pytest --doctest-modules python/pyarrow/parquet.py`.
   
   I still need to look at the CI errors; one of them looks related.





[GitHub] [arrow] github-actions[bot] commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1077571730


   https://issues.apache.org/jira/browse/ARROW-15428





[GitHub] [arrow] AlenkaF commented on pull request #12704: ARROW-15428: [Python] Address docstrings in Parquet classes and functions

Posted by GitBox <gi...@apache.org>.
AlenkaF commented on pull request #12704:
URL: https://github.com/apache/arrow/pull/12704#issuecomment-1080291719


   The test is failing because of the deprecation warning added to the `common_metadata` property of `ParquetDataset`:
   
   ```python
   In [1]: import pytest
   In [2]: import pyarrow.parquet as pq
   
   In [3]: with pytest.warns(None) as record:
       ...:     pq.read_table('v0.7.1.parquet',
       ...:                   use_legacy_dataset=True)
       ...: len(record)
   Out[3]: 2
   
   In [4]: record[0].message
   Out[4]: DeprecationWarning("Passing 'use_legacy_dataset=True' to get the legacy behaviour is deprecated as of pyarrow 8.0.0, and the legacy implementation will be removed in a future version.")
   
   In [5]: record[1].message
   Out[5]: DeprecationWarning("'ParquetDataset.common_metadata' attribute is deprecated as of pyarrow 5.0.0 and will be removed in a future version.")
   ```
   Because the `common_metadata` property is added to the `ParquetDataset` constructor, the warning is caught even if the property isn't explicitly used.
   
   I will correct the test to include the new warning and will keep the change for `common_metadata`.
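
A minimal sketch of how a deprecation warning can be recorded even when the caller never touches the deprecated property. It assumes the constructor itself reads the property; `LegacyDataset` and `read_legacy` are hypothetical stand-ins, not pyarrow code, and the standard library `warnings.catch_warnings(record=True)` is used in place of `pytest.warns`.

```python
import warnings


class LegacyDataset:
    """Hypothetical stand-in for a class with a deprecated property."""

    def __init__(self, path):
        self.path = path
        # Assumption: the constructor reads the deprecated property,
        # so the warning fires for every instance that is created.
        self._cached_metadata = self.common_metadata

    @property
    def common_metadata(self):
        warnings.warn(
            "'LegacyDataset.common_metadata' is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        return None


def read_legacy(path):
    """Hypothetical reader that only constructs the dataset."""
    return LegacyDataset(path).path


with warnings.catch_warnings(record=True) as record:
    # Record every warning, regardless of the active filters.
    warnings.simplefilter("always")
    read_legacy("example.parquet")

# Messages of all warnings raised inside the block above.
messages = [str(w.message) for w in record]
```

Here `read_legacy` never uses `common_metadata` directly, yet the warning still ends up in `record`, which is the kind of situation a "doesn't warn" test would trip over.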

