Posted to commits@arrow.apache.org by GitBox <gi...@apache.org> on 2022/10/19 12:22:21 UTC

[GitHub] [arrow-site] jorisvandenbossche commented on a diff in pull request #242: [Website] Version 10.0.0 release blog post

jorisvandenbossche commented on code in PR #242:
URL: https://github.com/apache/arrow-site/pull/242#discussion_r999371508


##########
_posts/2022-10-17-10.0.0-release.md:
##########
@@ -209,10 +213,54 @@ Many important features, enhancements, and bug fixes are included in this releas
 
 ## Python notes
 
-The PyArrow-specific C++ code, previously part of Arrow C++ codebase, is now integrated into PyArrow. The tests are run automatically
-as part of the PyArrow test suite. See: [ARROW-16340](https://issues.apache.org/jira/browse/ARROW-16340) and  [ARROW-17122](https://issues.apache.org/jira/browse/ARROW-17122).
-The build process is not affected by the change but may have to be updated in future releases with the deprecation of
-`ARROW_PYTHON` CMake variable ([ARROW-17868](https://issues.apache.org/jira/browse/ARROW-17868)).
+Compatibility notes:
+
+* Some deprecated APIs (deprecated since at least pyarrow 1.0.0) have been removed: the
+  `RecordBatchReader.get_next_batch` method, the `DataType.num_children` attribute, etc.
+  ([ARROW-17649](https://issues.apache.org/jira/browse/ARROW-17649)).
+* When writing to the Arrow IPC file format with `pyarrow.dataset.write_dataset` using
+  `format="ipc"` or `format="arrow"`, the default extension for the resulting files is
+  now `.arrow` instead of `.feather`. You can still use `format="feather"` to
+  write identical files with the `.feather` extension
+  ([ARROW-17089](https://issues.apache.org/jira/browse/ARROW-17089)).
+
+New features and improvements:
+
+* Filters in `pq.read_table()` can be passed as an expression in addition to the legacy
+  list of tuples. For example, `filters=pc.field("col") < 4` is equivalent to
+  `filters=[("col", "<", 4)]`
+  ([ARROW-17483](https://issues.apache.org/jira/browse/ARROW-17483)).
+* The `batch_readahead` and `fragment_readahead` arguments for scanning Datasets are
+  exposed in Python ([ARROW-17299](https://issues.apache.org/jira/browse/ARROW-17299)).
+* ExtensionArrays can now be created from a storage array through the `pa.array(..)`
+  constructor ([ARROW-17834](https://issues.apache.org/jira/browse/ARROW-17834)).
+* Converting ListArrays containing ExtensionArray values to numpy or pandas works by
+  falling back to the storage array
+  ([ARROW-17813](https://issues.apache.org/jira/browse/ARROW-17813)).
+* The `pyarrow.substrait.run_query()` function gained a `table_provider` keyword to run
+  the query against in-memory tables
+  ([ARROW-17521](https://issues.apache.org/jira/browse/ARROW-17521)).
+* The `StructType` class gained a `field()` method to retrieve a child field
+  ([ARROW-17131](https://issues.apache.org/jira/browse/ARROW-17131)).
+* Casting Tables to a new schema now honors the nullability flag in the target schema
+  ([ARROW-16651](https://issues.apache.org/jira/browse/ARROW-16651)).
+* Parquet files are now explicitly closed after reading
+  ([ARROW-13763](https://issues.apache.org/jira/browse/ARROW-13763)).
+
+Further, the Python bindings benefit from improvements in the C++ library
+(e.g. new compute functions); see the C++ notes above for additional details.
+
+**Build notes**
+
+The PyArrow-specific C++ code, previously part of the Arrow C++ codebase, is now integrated
+into PyArrow. The tests are run automatically as part of the PyArrow test suite. See:
+[ARROW-16340](https://issues.apache.org/jira/browse/ARROW-16340) and
+[ARROW-17122](https://issues.apache.org/jira/browse/ARROW-17122).

Review Comment:
   Thanks!


