Posted to dev@tinkerpop.apache.org by "Robert Dale (JIRA)" <ji...@apache.org> on 2016/09/01 19:45:21 UTC
[jira] [Commented] (TINKERPOP-1427) GraphSON 2.0 needs collection types and consistent number typing.

    [ https://issues.apache.org/jira/browse/TINKERPOP-1427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15456421#comment-15456421 ] 

Robert Dale commented on TINKERPOP-1427:
----------------------------------------

But therein lies the problem. Unlike float and double, where we can point to the IEEE 754 spec, integer and long are not defined consistently across programming languages, unless you mean just Java. They are nothing more than weak labels for physical storage size. So either the storage size is precisely defined, or it should be ignored altogether by calling it a number. After all, the distinction between a short, an integer, and a long is really just a memory optimization. This was part of my original argument in favor of using the JSON native number type.

Let's work through an example. Python is a great example of a dynamically typed language that will run into this issue. Python 2 has two integer types, 'int' and 'long', but only 'int' is relevant to this exercise. Its 'int' is 32 or 64 bits wide depending on the platform. Python 3 has a single integer type, 'int', which is arbitrary-precision. How does Python differentiate between a short, an int, and a long? Will there be logic in the GLV to choose the number storage size based on the value (which seems dangerous), will it be schema-aware, or will the user have to handle this? If the user is responsible, how is the value coerced so that the GLV can interpret it? What would this actually look like?
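
To make the "based on value" option concrete, here is a minimal sketch of what such logic might look like in a Python GLV. The function names (infer_integer_type, to_graphson) are hypothetical and not part of any TinkerPop API; only the g:Int32/g:Int64 tags and the @type/@value envelope come from the issue. It is meant to show why the approach seems dangerous, not to propose it.

```python
import json

INT32_MIN, INT32_MAX = -2**31, 2**31 - 1
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1


def infer_integer_type(value):
    """Hypothetical helper: pick a type tag purely from the value's magnitude."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("expected an integer")
    if INT32_MIN <= value <= INT32_MAX:
        return "g:Int32"
    if INT64_MIN <= value <= INT64_MAX:
        return "g:Int64"
    # Python 3 ints are arbitrary-precision, so this case is reachable.
    raise OverflowError("value does not fit in 64 bits")


def to_graphson(value):
    """Hypothetical helper: wrap an integer in a typed envelope."""
    return {"@type": infer_integer_type(value), "@value": value}


# Two values of the same logical property, one apart, get different tags.
small = to_graphson(2**31 - 1)
large = to_graphson(2**31)
print(json.dumps(small))
print(json.dumps(large))
```

The type tag changes as soon as a value crosses the 32-bit boundary, even though the user never declared a type, so two values of the same property can arrive at the server with different types.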

I'm still of the opinion that for integers a number is a number is a number. Anything less would be uncivilized.


> GraphSON 2.0 needs collection types and consistent number typing.
> -----------------------------------------------------------------
>
>                 Key: TINKERPOP-1427
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-1427
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: io
>    Affects Versions: 3.2.1
>            Reporter: Marko A. Rodriguez
>              Labels: breaking
>
> Before 3.3.0, we need to get the story around collections straight. Currently we are relying on JSON collections to represent our collections, but this isn't always sound -- e.g. JSON maps can only have string keys.
> Thus, we need:
> {code}
> g:Map
> g:List
> g:Set
> {code}
> Note that all of these would just be JSON lists:
> {code}
> {"@type":"g:Map", "@value":[key1,value1,key2,value2,key3,value3,...]}
> {"@type":"g:List", "@value":[value1, value2, value3,...]}
> {"@type":"g:Set", "@value":[value1, value2, value3,...]}
> {code}
> ---
> Next, these data structures are exactly what play into {{aggregateTo}} in sideEffects for {{RemoteConnection}}. Thus, we should use these types and, as well, get rid of {{none}} as the aggregate would be a real type like {{g:Int32}}.
> Also, I think we should abandon this hybrid physical machine naming convention and programming language type convention.
> {code}
> g:Int32 -> g:Integer
> g:Int64 -> g:Long
> g:Double -> g:Double (no change)
> g:Float -> g:Float (no change)
> {code}
> If we want to be consistent either do the above or do the below, though I think the above is cleaner.
> {code}
> g:Int32 -> g:Int32
> g:Int64 -> g:Int64
> g:Float -> g:Float32
> g:Double -> g:Float64
> {code}
> Again, using programming lexicon vs. physical machine lexicon is best and thus, just gut "int32" and "int64."
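
For illustration, the flat-list g:Map encoding proposed in the issue description could be sketched as follows in Python. The helper names (encode_map, decode_map) are hypothetical, and nested keys and values are left untyped here for brevity; a real serializer would presumably wrap those as well.

```python
import json


def encode_map(mapping):
    """Hypothetical helper: flatten a dict into [key1, value1, key2, value2, ...]."""
    flat = []
    for key, value in mapping.items():
        flat.extend([key, value])
    return {"@type": "g:Map", "@value": flat}


def decode_map(node):
    """Hypothetical helper: rebuild a dict from the alternating key/value list."""
    items = node["@value"]
    if node.get("@type") != "g:Map" or len(items) % 2 != 0:
        raise ValueError("not a well-formed g:Map")
    return dict(zip(items[0::2], items[1::2]))


original = {1: "a", 2: "b"}

# A plain JSON object coerces the integer keys to strings.
plain = json.loads(json.dumps(original))

# The flat-list form keeps the keys as integers across a JSON round trip.
encoded = encode_map(original)
round_tripped = decode_map(json.loads(json.dumps(encoded)))

print(plain)
print(round_tripped)
```

Because the keys travel as ordinary list elements, they are no longer restricted to strings, which is the limitation of JSON maps that the description points out.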


