Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/01 18:20:53 UTC

[GitHub] [arrow] austin3dickey opened a new pull request, #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

austin3dickey opened a new pull request, #13289:
URL: https://github.com/apache/arrow/pull/13289

   As a best practice, most of the optional configuration arguments in `write_dataset()` should be keyword-only. This PR enforces that.
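
For readers unfamiliar with the mechanism, here is a minimal sketch of keyword-only arguments in Python. It does not reproduce the real `write_dataset()` signature: the function `write_dataset_sketch` and its parameter list are hypothetical and exist only to illustrate the bare `*` separator, which makes every parameter after it keyword-only.

```python
# Hypothetical stand-in for a dataset writer. This is NOT pyarrow's actual
# signature; the names are assumptions chosen for illustration.
def write_dataset_sketch(data, base_dir, *, basename_template=None, format=None):
    # Parameters after the bare `*` can only be passed by keyword.
    return {
        "base_dir": base_dir,
        "basename_template": basename_template,
        "format": format,
    }


# Passing the optional configuration by keyword works.
result = write_dataset_sketch(
    [1, 2, 3], "/tmp/out", basename_template="part-{i}.arrow", format="ipc"
)

# Passing the same value positionally is rejected with a TypeError.
try:
    write_dataset_sketch([1, 2, 3], "/tmp/out", "part-{i}.arrow")
except TypeError as exc:
    positional_error = exc
else:
    positional_error = None
```

With this pattern, a call that relies on argument order fails at call time instead of silently binding a value to the wrong option.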


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] jonkeane commented on pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
jonkeane commented on PR #13289:
URL: https://github.com/apache/arrow/pull/13289#issuecomment-1144158490

   The change itself will automatically be added to the changelog as part of the release process (for example, https://arrow.apache.org/release/8.0.0.html was compiled at release time).
   
   Personally, I think that's sufficient notice for something like this, but I'm open to other opinions if we think we should warn others before the release.




[GitHub] [arrow] austin3dickey commented on a diff in pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
austin3dickey commented on code in PR #13289:
URL: https://github.com/apache/arrow/pull/13289#discussion_r887207524


##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -1796,6 +1796,12 @@ def test_dictionary_partitioning_outer_nulls_raises(tempdir):
         ds.write_dataset(table, tempdir, format='ipc', partitioning=part)
 
 
+def test_positional_keywords_raises(tempdir):
+    table = pa.table({'a': ['x', 'y', None], 'b': ['x', 'y', 'z']})
+    with pytest.raises(TypeError):
+        ds.write_dataset(table, tempdir, "basename-{i}.parquet")

Review Comment:
   good idea!





[GitHub] [arrow] jonkeane commented on pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
jonkeane commented on PR #13289:
URL: https://github.com/apache/arrow/pull/13289#issuecomment-1144201395

   Python / AMD64 Conda Python 3.9 Sphinx & Numpydoc is failing on master, and I don't think this PR is making anything worse for it.




[GitHub] [arrow] jonkeane closed pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
jonkeane closed pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only
URL: https://github.com/apache/arrow/pull/13289




[GitHub] [arrow] ursabot commented on pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
ursabot commented on PR #13289:
URL: https://github.com/apache/arrow/pull/13289#issuecomment-1145322259

   Benchmark runs are scheduled for baseline = 8295bdc2e86e657c59724c3e56da474e5414cb39 and contender = 2ffc10a43b2b9a397bfeba993993172082f9722b. 2ffc10a43b2b9a397bfeba993993172082f9722b is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/29533ef26af5474fad6a1d8ee3c9dfe1...3bd32d2b99d24655bb0cb3c1ee217c55/)
   [Finished :arrow_down:0.16% :arrow_up:0.08%] [test-mac-arm](https://conbench.ursa.dev/compare/runs/0a35f217218048329ba9dd54dee772f1...8c23f9700ff8458ea74fcb49a75e742c/)
   [Failed :arrow_down:0.52% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ca21ac493f014c529f4332f0832d9c7b...29bb4d905f884364817467d3a55eb0dd/)
   [Finished :arrow_down:0.2% :arrow_up:0.04%] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/92ba80a06e284f20a78d7f5f25f0039d...ccd3f7b5edfb43b882480efc0171ecd7/)
   Buildkite builds:
   [Finished] [`2ffc10a4` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/873)
   [Finished] [`2ffc10a4` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/873)
   [Failed] [`2ffc10a4` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/865)
   [Finished] [`2ffc10a4` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/875)
   [Finished] [`8295bdc2` ec2-t3-xlarge-us-east-2](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ec2-t3-xlarge-us-east-2/builds/872)
   [Finished] [`8295bdc2` test-mac-arm](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-test-mac-arm/builds/872)
   [Failed] [`8295bdc2` ursa-i9-9960x](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-i9-9960x/builds/862)
   [Finished] [`8295bdc2` ursa-thinkcentre-m75q](https://buildkite.com/apache-arrow/arrow-bci-benchmark-on-ursa-thinkcentre-m75q/builds/874)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
   test-mac-arm: Supported benchmark langs: C++, Python, R
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   




[GitHub] [arrow] github-actions[bot] commented on pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13289:
URL: https://github.com/apache/arrow/pull/13289#issuecomment-1143985738

   https://issues.apache.org/jira/browse/ARROW-14632




[GitHub] [arrow] jonkeane commented on a diff in pull request #13289: ARROW-14632: [Python] Make write_dataset arguments keyword-only

Posted by GitBox <gi...@apache.org>.
jonkeane commented on code in PR #13289:
URL: https://github.com/apache/arrow/pull/13289#discussion_r887199144


##########
python/pyarrow/tests/test_dataset.py:
##########
@@ -1796,6 +1796,12 @@ def test_dictionary_partitioning_outer_nulls_raises(tempdir):
         ds.write_dataset(table, tempdir, format='ipc', partitioning=part)
 
 
+def test_positional_keywords_raises(tempdir):
+    table = pa.table({'a': ['x', 'y', None], 'b': ['x', 'y', 'z']})
+    with pytest.raises(TypeError):
+        ds.write_dataset(table, tempdir, "basename-{i}.parquet")

Review Comment:
   This is minor, but we might want to avoid `.parquet` in the basename template, so readers aren't left wondering why this test isn't guarded by `@pytest.mark.parquet`. `ipc` or `arrow` would be better IMO.


