Posted to github@arrow.apache.org by "rok (via GitHub)" <gi...@apache.org> on 2023/02/01 14:35:58 UTC

[GitHub] [arrow] rok commented on a diff in pull request #33925: GH-33923: [Docs] Tensor canonical extension type specification

rok commented on code in PR #33925:
URL: https://github.com/apache/arrow/pull/33925#discussion_r1093310032


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -72,4 +72,30 @@ same rules as laid out above, and provide backwards compatibility guarantees.
 Official List
 =============
 
-No canonical extension types have been standardized yet.
+Fixed shape tensor
+==================
+
+* Extension name: `arrow.fixed_shape_tensor`.
+
+* The storage type of the extension: ``FixedSizeList`` where:
+
+  * **value_type** is the data type of individual tensors and
+    is an instance of ``pyarrow.DataType`` or ``pyarrow.Field``.
+  * **list_size** is the product of all the elements in tensor shape.
+
+* Extension type parameters:
+
+  * **value_type** = Arrow DataType of the tensor elements
+  * **shape** = shape of the contained tensors as a tuple
+  * **is_row_major** = boolean indicating the order of elements

Review Comment:
   In the [Zulip discussion](https://ursalabs.zulipchat.com/#narrow/stream/180245-dev/topic/Canonical.20extension.20type.20for.20tensors/near/324882917) we are leaning towards the canonical type always storing data in row-major order and letting applications store strides in metadata. Any arguments for or against from you or your users would be most welcome at this point!
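
   To make the row-major option concrete, here is a rough sketch in plain Python (no pyarrow). It assumes strides are counted in elements rather than bytes. The helper names and the `"strides"` metadata key are hypothetical and are not part of the proposed spec.

```python
import json
import math


def list_size(shape):
    """Elements per tensor: the product of all dimensions in the shape."""
    return math.prod(shape)


def row_major_strides(shape):
    """Element strides (not byte strides) of a row-major (C-order) layout."""
    strides = []
    step = 1
    for dim in reversed(shape):
        strides.append(step)
        step *= dim
    return tuple(reversed(strides))


def flat_index(index, strides):
    """Position of a multi-dimensional index inside the flat storage list."""
    return sum(i * s for i, s in zip(index, strides))


shape = (2, 3)
# One tensor as it would sit in a single FixedSizeList slot, row-major.
flat = [0, 1, 2, 3, 4, 5]
assert len(flat) == list_size(shape)

strides = row_major_strides(shape)
value = flat[flat_index((1, 2), strides)]

# Hypothetical: an application that needs a different logical layout could
# carry its own strides alongside the type, e.g. serialized as JSON metadata.
app_metadata = json.dumps({"strides": list(strides)})
```

   With this layout the storage type only needs `list_size`, and anything beyond plain row-major becomes an application-level concern.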



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org