Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/04/07 07:30:13 UTC

[GitHub] [arrow] AlenkaF commented on a diff in pull request #33925: GH-33923: [Docs] Tensor canonical extension type specification

AlenkaF commented on code in PR #33925:
URL: https://github.com/apache/arrow/pull/33925#discussion_r1160503199


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -72,4 +72,57 @@ same rules as laid out above, and provide backwards compatibility guarantees.
 Official List
 =============
 
-No canonical extension types have been standardized yet.
+Fixed shape tensor
+==================
+
+* Extension name: `arrow.fixed_shape_tensor`.
+
+* The storage type of the extension: ``FixedSizeList`` where:
+
+  * **value_type** is the data type of individual tensors and
+    is an instance of ``pyarrow.DataType`` or ``pyarrow.Field``.
+  * **list_size** is the product of all the elements in tensor shape.
+
+* Extension type parameters:
+
+  * **value_type** = Arrow DataType of the tensor elements
+  * **shape** = shape of the contained tensors as a tuple
+
+* Description of the serialization:
+
+  The metadata must be a valid JSON object including shape of
+  the contained tensors as an array with key **"shape"**
+
+  - example: ``{ "shape": [2, 5]}``
+
+  and optional:
+
+  - **"dim_names"** holds explicit names to tensor dimensions
+    as an array. The length of it should be equal to the shape
+    length and equal to the number of dimensions.
+
+    Example of dim_names metadata for NCHW ordered data:
+
+    ``{ "shape": [100, 200, 500], "dim_names": ["C", "H", "W"]}``
+
+    The ``dim_names`` metadata can be added if the dimensions have
+    well-known names that can map to the physical order (row-major).
+
+  - **"permutation"** holds indexes of the desired ordering of the
+    original dimensions. Also defined as an array.

Review Comment:
   @rok I have added an example of a permutation that represents NCHW data as NHWC to the Python bindings documentation PR: https://github.com/apache/arrow/pull/34957/files
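
   For readers of this thread, the metadata described in the quoted diff can be sketched in plain Python. This is an illustration only and is not the example from the linked PR. The helpers `build_metadata` and `apply_permutation` are hypothetical names. The sketch assumes that a permutation lists the original dimension indexes in the desired order, which is one reading of "indexes of the desired ordering of the original dimensions". The leading N of NCHW is taken to be the array length, so only C, H and W appear in `shape`, as in the quoted ``dim_names`` example.

```python
import json
from math import prod


def build_metadata(shape, dim_names=None, permutation=None):
    """Hypothetical helper: serialize tensor parameters to a JSON string."""
    ndim = len(shape)
    metadata = {"shape": list(shape)}
    if dim_names is not None:
        # The quoted text asks for one name per dimension.
        if len(dim_names) != ndim:
            raise ValueError("dim_names must have one entry per dimension")
        metadata["dim_names"] = list(dim_names)
    if permutation is not None:
        # Assumed constraint: every dimension index appears exactly once.
        if sorted(permutation) != list(range(ndim)):
            raise ValueError("permutation must reorder indexes 0..ndim-1")
        metadata["permutation"] = list(permutation)
    return json.dumps(metadata)


def apply_permutation(values, permutation):
    """Reorder per-dimension values into the desired ordering (assumed semantics)."""
    return [values[i] for i in permutation]


shape = [100, 200, 500]        # physical (row-major) order: C, H, W
dim_names = ["C", "H", "W"]
permutation = [1, 2, 0]        # desired order: H, W, C

metadata = build_metadata(shape, dim_names, permutation)
list_size = prod(shape)        # FixedSizeList size: product of the shape
desired_names = apply_permutation(dim_names, permutation)
desired_shape = apply_permutation(shape, permutation)

print(metadata)
print(list_size, desired_names, desired_shape)
```

   With these inputs the storage list holds 100 * 200 * 500 = 10,000,000 values per tensor, and the desired ordering is H, W, C with shape [200, 500, 100]. If the specification ends up defining the permutation in the inverse direction, `apply_permutation` would need to be inverted accordingly.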



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org