Posted to github@arrow.apache.org by "paleolimbot (via GitHub)" <gi...@apache.org> on 2023/02/02 13:45:04 UTC

[GitHub] [arrow] paleolimbot commented on a diff in pull request #33925: GH-33923: [Docs] Tensor canonical extension type specification

paleolimbot commented on code in PR #33925:
URL: https://github.com/apache/arrow/pull/33925#discussion_r1094551041


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -72,4 +72,30 @@ same rules as laid out above, and provide backwards compatibility guarantees.
 Official List
 =============
 
-No canonical extension types have been standardized yet.
+Fixed shape tensor
+==================
+
+* Extension name: `arrow.fixed_shape_tensor`.
+
+* The storage type of the extension: ``FixedSizeList`` where:
+
+  * **value_type** is the data type of individual tensors and
+    is an instance of ``pyarrow.DataType`` or ``pyarrow.Field``.
+  * **list_size** is the product of all the elements in tensor shape.
+
+* Extension type parameters:
+
+  * **value_type** = Arrow DataType of the tensor elements
+  * **shape** = shape of the contained tensors as a tuple
+
+* Description of the serialization:
+
+  The metadata must be a valid JSON object including shape of
+  the contained tensors as an array with key "shape".
+
+  For example: `{ "shape": [2, 5]}`
+
+.. note::
+
+  Elements in an fixed shape tensor extension array are stored
+  in row-major/C-contiguous order.

Review Comment:
   Just a note that requiring row-major order will prevent storing R matrices zero-copy, since R matrices are column-major. I seem to remember an earlier version of this proposal had an option for storing column-major tensors zero-copy?
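
To make the concern concrete, here is a minimal sketch in plain Python (not pyarrow or R code) of how the same 2 x 5 matrix is laid out in a flat buffer under each ordering. The helper names `row_major_offset` and `col_major_offset` and the sample values are made up for illustration; only the shape and the `list_size` rule come from the diff above.

```python
from math import prod

shape = (2, 5)            # same shape as the {"shape": [2, 5]} example
list_size = prod(shape)   # list_size is the product of the shape elements


def row_major_offset(i, j, shape):
    """Flat offset of element (i, j) in row-major (C) order."""
    return i * shape[1] + j


def col_major_offset(i, j, shape):
    """Flat offset of element (i, j) in column-major (R/Fortran) order."""
    return j * shape[0] + i


# Illustrative logical matrix: element (i, j) holds 10 * i + j.
matrix = [[10 * i + j for j in range(shape[1])] for i in range(shape[0])]

row_major_buf = [None] * list_size
col_major_buf = [None] * list_size
for i in range(shape[0]):
    for j in range(shape[1]):
        row_major_buf[row_major_offset(i, j, shape)] = matrix[i][j]
        col_major_buf[col_major_offset(i, j, shape)] = matrix[i][j]

# The two flat buffers hold the same values in different positions.
buffers_match = row_major_buf == col_major_buf
```

Because the two buffers differ, a column-major matrix would presumably have to be reordered, which means copied, before it satisfies the row-major rule in the note.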


