Posted to issues@arrow.apache.org by "izveigor (via GitHub)" <gi...@apache.org> on 2023/04/11 19:31:51 UTC

[GitHub] [arrow] izveigor opened a new issue, #35052: Different primitive types in different languages

izveigor opened a new issue, #35052:
URL: https://github.com/apache/arrow/issues/35052

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The set of types treated as primitive differs between the Arrow language implementations.
   
   | Language | Types treated as primitive by `is_primitive` | Source |
   | -------- | -------------------------------------------- | ------ |
   | Rust | UInt8, UInt16, UInt32, UInt64, Int8, Int16, Int32, Int64, Float16, Float32, Float64, Decimal128, Decimal256, Date32, Date64, Timestamp, Time32, Time64, Duration, Interval | [link](https://github.com/apache/arrow-rs/blob/master/arrow-schema/src/datatype.rs#L316-L356) |
   | Go | BOOL, UINT8, INT8, UINT16, INT16, UINT32, INT32, UINT64, INT64, FLOAT16, FLOAT32, FLOAT64, DATE32, DATE64, TIME32, TIME64, TIMESTAMP, DURATION, INTERVAL_MONTHS, INTERVAL_DAY_TIME, INTERVAL_MONTH_DAY_NANO | [link](https://github.com/apache/arrow/blob/main/go/arrow/datatype.go#L301-L309) |
   | Python | _Type_NA, _Type_BOOL, _Type_UINT8, _Type_INT8, _Type_UINT16, _Type_INT16, _Type_UINT32, _Type_INT32, _Type_UINT64, _Type_INT64, _Type_TIMESTAMP, _Type_DATE32, _Type_TIME32, _Type_TIME64, _Type_DATE64, _Type_HALF_FLOAT, _Type_FLOAT, _Type_DOUBLE | [link](https://github.com/apache/arrow/blob/main/python/pyarrow/types.pxi#L3185-L3196) |
   
   I think there should be a rule that determines whether a type counts as primitive.
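
To make the differences in the table easier to see, the three lists can be compared as sets. The sketch below is illustrative only and does not call any Arrow library. The lowercase names are a hypothetical normalization of the identifiers in the table: for example, `_Type_NA` becomes `null`, `_Type_HALF_FLOAT` becomes `float16`, and Go's three `INTERVAL_*` variants are collapsed into a single `interval`.

```python
# Type names from the table, normalized to a common lowercase spelling.
# The normalization is an assumption made for this comparison only.
INTEGERS = {"uint8", "uint16", "uint32", "uint64", "int8", "int16", "int32", "int64"}
FLOATS = {"float16", "float32", "float64"}
DATES_AND_TIMES = {"date32", "date64", "time32", "time64", "timestamp"}

PRIMITIVES = {
    "rust": INTEGERS | FLOATS | DATES_AND_TIMES
            | {"decimal128", "decimal256", "duration", "interval"},
    "go": INTEGERS | FLOATS | DATES_AND_TIMES | {"bool", "duration", "interval"},
    "python": INTEGERS | FLOATS | DATES_AND_TIMES | {"null", "bool"},
}


def shared_types(primitives):
    """Types that every implementation in the table lists as primitive."""
    return set.intersection(*primitives.values())


def disputed_types(primitives):
    """Map each type that is not shared by all to the implementations listing it."""
    everything = set.union(*primitives.values())
    shared = shared_types(primitives)
    return {
        name: sorted(lang for lang, types in primitives.items() if name in types)
        for name in sorted(everything - shared)
    }


if __name__ == "__main__":
    print(sorted(shared_types(PRIMITIVES)))
    for name, langs in disputed_types(PRIMITIVES).items():
        print(f"{name}: {', '.join(langs)}")
```

Under this normalization the integer, float, date, time, and timestamp types are common to all three lists; the disagreement is confined to booleans, null, decimals, durations, and intervals.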
   
   ### Component(s)
   
   Go, Python


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] izveigor commented on issue #35052: Different primitive types in different languages

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1505713578

   I think it would also be a good idea to define which types tensors support, because these also differ between implementations.




[GitHub] [arrow] tustvold commented on issue #35052: Different primitive types in different languages

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1505739365

   For Rust we don't define booleans as primitives because they use a separate array type: `BooleanArray` rather than `PrimitiveArray<T>`.
   
   This arises because there are non-trivial behaviour and API differences between arrays of aligned scalars and bit-packed bools. The former have native language support (e.g. `[i8]`, `[u32]`), support transparent zero-copy slicing, and so on.
   
   I suspect there are differing definitions of what constitutes a primitive type: is it a native scalar value (which is what Rust uses), or does it reflect the buffer layout?
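
The difference between aligned scalars and bit-packed booleans can be sketched in plain Python. This is only an illustration of the layout issue, not arrow-rs code: `pack_bits`, `get_bit`, and `BitSlice` are hypothetical helpers, and the least-significant-bit-first packing follows the Arrow columnar format. A slice of aligned scalars is just a narrower window onto the same bytes, whereas a slice of packed bits may start in the middle of a byte, so it has to carry a bit offset alongside the buffer.

```python
from array import array

# Aligned scalars: every value occupies whole bytes, so a slice is simply a
# narrower window onto the same buffer.
values = array("i", [10, 20, 30, 40, 50])
tail = memoryview(values)[2:]   # zero-copy view starting at the third value
values[2] = 99                  # a write through the original is visible in the view
assert tail[0] == 99


def pack_bits(flags):
    """Pack booleans eight per byte, least-significant bit first."""
    buf = bytearray((len(flags) + 7) // 8)
    for i, flag in enumerate(flags):
        if flag:
            buf[i // 8] |= 1 << (i % 8)
    return buf


def get_bit(buf, i):
    """Read the i-th boolean back out of a packed buffer."""
    return bool((buf[i // 8] >> (i % 8)) & 1)


class BitSlice:
    """Hypothetical zero-copy slice of a packed buffer.

    The slice may begin mid-byte, so it cannot be a plain byte range;
    it keeps the shared buffer plus a bit offset and a length.
    """

    def __init__(self, buf, offset, length):
        self.buf, self.offset, self.length = buf, offset, length

    def __len__(self):
        return self.length

    def __getitem__(self, i):
        if not 0 <= i < self.length:
            raise IndexError(i)
        return get_bit(self.buf, self.offset + i)
```

For example, `BitSlice(pack_bits(flags), 3, 4)` exposes `flags[3:7]` without copying, but every access has to add the offset and shift within a byte, which is the kind of API difference described above.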




[GitHub] [arrow] westonpace commented on issue #35052: Different primitive types in different languages

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1512495303

   > I don't understand the main principle by which a type is classified as primitive. From the user's point of view, primitive types are the opposite of nested types. If this is due to peculiarities of the language, then which ones?
   
   "Not nested" is one definition of primitive (and it matches the one I have in my head), but it seems that not all implementations have chosen this definition. For example, if I interpret @tustvold's comment correctly, Rust has chosen "primitive" to mean "maps directly to a Rust primitive array". Since "primitive" is not defined by the spec, it is probably valid (as @alamb mentions) for each implementation to have a different definition.
   
   > Tensor problem. As written above, tensor types are also different. In my opinion, a tensor should accept all primitive types.
   
   The Rust tensor implementation is older and predates the recent discussion on a formalized definition for tensors added in https://github.com/apache/arrow/pull/33925. Given the definition in #33925, I don't see any use of the word "primitive" or, in fact, any limitation on the possible types. So I think it would be legal (if not altogether sensible) to have tensors of nested types. For example, a tensor of strings should be legal.
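
The two definitions of "primitive" mentioned in this comment ("not nested" versus "maps directly to a native primitive array") can be written as two separate predicates. The sketch below uses a hypothetical `ToyType` class rather than any real Arrow API, and the three example types are chosen only to show where the definitions agree and where they diverge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyType:
    """Hypothetical stand-in for an Arrow data type (not a real Arrow class)."""
    name: str
    children: tuple = ()          # child types; non-empty means nested
    native_scalar: bool = False   # values are byte-aligned native scalars


def is_primitive_not_nested(t):
    """Definition 1: a type is primitive if it has no child types."""
    return not t.children


def is_primitive_native(t):
    """Definition 2: a type is primitive if it maps to a native scalar array."""
    return t.native_scalar


INT32 = ToyType("int32", native_scalar=True)
BOOL = ToyType("bool")                                  # not nested, but bit-packed
LIST_INT32 = ToyType("list<int32>", children=(INT32,))  # nested

for t in (INT32, BOOL, LIST_INT32):
    print(t.name, is_primitive_not_nested(t), is_primitive_native(t))
```

Both predicates accept `int32` and reject `list<int32>`; they disagree only on `bool`, which matches the Rust versus Go and Python difference in the table at the top of the issue.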




[GitHub] [arrow] alamb commented on issue #35052: Different primitive types in different languages

Posted by "alamb (via GitHub)" <gi...@apache.org>.
alamb commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1505816117

   My initial reading of this ticket is that I would **expect** the mappings from Arrow types to native language types to differ somewhat, given that different languages have different notions of what "primitive types" are. Most languages have native integer and float support, but from there the differences become substantial, as highlighted above.
   
   So, in other words, maybe this is "not a bug, working as expected".
   
   @izveigor I wonder if you could provide some background on why you are raising this issue (for example, what problem do the varying native types in different language bindings cause)?




[GitHub] [arrow] izveigor closed issue #35052: Different primitive types in different languages

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor closed issue #35052: Different primitive types in different languages
URL: https://github.com/apache/arrow/issues/35052




[GitHub] [arrow] westonpace commented on issue #35052: Different primitive types in different languages

Posted by "westonpace (via GitHub)" <gi...@apache.org>.
westonpace commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1504290514

   It might be nice to agree on other type categories too (temporal, numeric, etc.)




[GitHub] [arrow] izveigor commented on issue #35052: Different primitive types in different languages

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1507526221

   
   I didn't describe the problem accurately, so I will try to ask about the points I did not understand.
   1) I don't understand the main principle by which a type is classified as primitive. From the user's point of view, primitive types are the opposite of nested types. If this is due to peculiarities of the language, then which ones?
   2) Tensor problem. As written above, tensor types are also different. In my opinion, a tensor should accept all primitive types.
   
   I think the answers to these questions will be of interest to users who do not quite understand the definition.




[GitHub] [arrow] izveigor commented on issue #35052: Different primitive types in different languages

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1512632846

   Thanks for the answer, @westonpace. I think after that I have no questions left about primitive types.




[GitHub] [arrow] izveigor commented on issue #35052: Different primitive types in different languages

Posted by "izveigor (via GitHub)" <gi...@apache.org>.
izveigor commented on issue #35052:
URL: https://github.com/apache/arrow/issues/35052#issuecomment-1505715670

   @tustvold @andygrove @Dandandan @alamb

