Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2019/06/13 22:25:21 UTC

mm-ADT to TinkerPop3

Hello,

Various stakeholders in Apache TinkerPop have been wondering whether mm-ADT can be leveraged in TinkerPop3. While I originally planned for mm-ADT to form the foundation of TinkerPop4, there is a subset of features in mm-ADT that could really help TP3 moving forward. Here is a preliminary outline of the mm-ADT features that could push the TP3 roadmap.

1. Type system: mm-ADT has a nominal type system for the built-in types and a structural type system for all derived types. Bytecode instructions that CRUD on database data can be statically typed and reasoned about at compile time.

2. Strategies: mm-ADT takes a completely different approach to query optimization than TP3. While there are compile-time strategies for rewriting a query into a semantically equivalent but computationally more efficient form, the concept of “provider strategies” (indices) goes out the window in favor of reference graphs. The primary benefit of the mm-ADT model is that provider implementations will be much simpler and less error-prone, will not require custom instructions, and can naturally capitalize on other internal provider optimizations such as schemas, denormalizations, views, etc.

3. Instruction Set: mm-ADT’s instruction set is less ad hoc than TP3’s. Relational operators are polymorphic. Math operators are polymorphic. Container (collection) operators are polymorphic. Unlike TP3, a “vertex” is just a map like any other map. Thus, has(), value(), where(), select(), etc. operate across all such derivations. Moreover, mm-ADT’s instruction set greatly reduces the number of ways in which an expression can be represented, relying primarily on reference graphs (see #2 above) as the means of optimization. This should help limit the degrees of freedom in the Gremlin language and reduce its apparent complexity to newcomers.

4. References: mm-ADT introduces references (pointers) as first-class citizens. References form one of the primary data types in mm-ADT with numerous usages including:
* Query planning. (providers exposing secondary data access paths via reference graphs -- see #2 above)
* Modeling complex objects. (will not come into play given TP3’s central focus on the property graph data type).
* Bytecode arguments. (nested bytecode are dynamic references and every instruction’s arguments can take references (even the opcode itself!)).
* Remote proxies. (TP3 detached vertices are awkward and limiting in comparison to mm-ADT proxy references).
* Schemas. (will probably not come into play, but “person” vertices are possible in mm-ADT. Thus, if TP3 wants to introduce graph schemas, mm-ADT provides the functionality).
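
To make item 3 a little more concrete, here is a minimal Python sketch (not mm-ADT itself) of what it could mean for a “vertex” to be just a map: the same helpers work on a vertex-like map and on any other map. The helper names `has` and `value`, the relation table, and the sample data are all assumptions made up for illustration.

```python
import operator

# Hypothetical relational operators, looked up by name.
RELATIONS = {"eq": operator.eq, "gt": operator.gt, "lt": operator.lt}

def has(objs, key, rel, expected):
    """Keep every map whose `key` satisfies the relation against `expected`."""
    test = RELATIONS[rel]
    return [o for o in objs if key in o and test(o[key], expected)]

def value(objs, key):
    """Project `key` out of every map that has it."""
    return [o[key] for o in objs if key in o]

# A "vertex" and a plain record are both just maps here (made-up data).
vertices = [{"id": 1, "label": "person", "name": "marko", "age": 29}]
rows = [{"name": "marko", "city": "santa fe"}, {"name": "josh", "city": "taos"}]

print(value(has(vertices, "name", "eq", "marko"), "age"))
print(value(has(rows, "name", "eq", "marko"), "city"))
```

Neither helper knows whether it is looking at a vertex or at an ordinary record, which is the sense of “polymorphic” illustrated here.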

I’ll leave it at that for now. Any questions, please ask.

Take care,
Marko.

http://rredux.com

Re: mm-ADT to TinkerPop3

Posted by Dmitry Novikov <dm...@neueda.com>.

Sorry, at first I read this in the wrong context. Never mind my question. It makes sense now.


Re: mm-ADT to TinkerPop3

Posted by Marko Rodriguez <ok...@gmail.com>.

Hello,

> Question about 4: I cannot fully understand how references are connected with schemas. Could you please explain it in more detail, or point to an explanation if one already exists?

Wow. It's funny you bring this up. I worked all weekend to unify “references” and “types” in mm-ADT. The short answer to your question is that a schema is just the unification of all the defined types. For a better understanding of how references are just types, I've attached a screenshot of the introduction to the mm-ADT type system.

Any questions, please ask.

*** SIDENOTE: I don’t know of any programming language that has realized pointers as being fundamentally types. In mm-ADT, where “anonymous types” (lambda types) are prevalent, it was a short step to realize that, in fact, any “grouping” is ultimately a new type.

Take care,
Marko.

http://rredux.com





Re: mm-ADT to TinkerPop3

Posted by Dmitry Novikov <dm...@neueda.com>.

Hello,

From a user's perspective, I am very excited about 1 and 2:

1. The type system makes sense. There is currently a JIRA issue about introducing types in TP 3.5: TINKERPOP-2234.
2. Making the instruction set more consistent:
  * Currently, the `select` step is used for getting values both from maps and from labeled steps within a path, which from my perspective is confusing. Making the `values` step the recommended way to access a map would simplify `select`.
  * Representing Vertex and Edge as maps would simplify serialization. It would also make writing queries simpler, as there would be fewer steps to remember.
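
As a rough illustration of the serialization point, assume a vertex and an edge are modeled as plain maps. The field names below are made up and are not TinkerPop's actual wire format. Under that assumption, a generic JSON codec handles both without any vertex-specific serializer:

```python
import json

# Hypothetical vertex and edge modeled as plain maps (field names are assumptions).
vertex = {"id": 1, "label": "person", "name": "marko", "age": 29}
edge = {"id": 7, "label": "knows", "out": 1, "in": 2, "weight": 0.5}

def round_trip(obj):
    """Serialize to JSON text and parse it back with the generic codec."""
    return json.loads(json.dumps(obj))

print(round_trip(vertex) == vertex)
print(round_trip(edge) == edge)
```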

Question about 4: I cannot fully understand how references are connected with schemas. Could you please explain it in more detail, or point to an explanation if one already exists?

Regards,
Dmitry


Re: mm-ADT to TinkerPop3

Posted by Marko Rodriguez <ok...@gmail.com>.

Hey,

> One thing I wonder at the moment which I don't think has come up in
> relation to mm-ADT discussion yet is DSLs. By every account, people are
> either using DSLs now or as soon as they learn about them, they immediately
> see the value and start to organize their code around them. So, any
> thoughts yet on how DSLs work under mm-ADT (in relation to TP3 and/or
> future) or is the model largely the same as what we do now?

mm-ADT is a bytecode specification. While we have a human readable/writable text representation (currently being called mm-ADT-bc), mm-ADT is primarily for machine consumption. Thus, higher-level languages like Gremlin or a custom DSL would compile to mm-ADT bytecode. If Gremlin compiles to mm-ADT, then all the Gremlin DSL infrastructure would just work as is. However, things can get more interesting.

You can create derived types of arbitrary complexity in mm-ADT.

[define,person,[name:@string,age:@int,knows:@person*]]

From a DSL perspective, users can make their own objects. Look at the knows field. It is not a container, but just zero or more person objects (sequence/stream). When this model is embedded in a graph database (and there are different ways to specify the embedding), those people could be referenced via a “knows”-edge.
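
One way to picture a structural check against the person definition above is the following Python sketch. It is not mm-ADT's type checker: the `TYPES` table, the `conforms` helper, and the use of a Python list to stand in for the zero-or-more `knows` stream are all assumptions made for illustration.

```python
# Hypothetical rendering of [define,person,[name:@string,age:@int,knows:@person*]].
# A trailing "*" marks a field holding zero or more objects of that type.
TYPES = {"person": {"name": "string", "age": "int", "knows": "person*"}}
BUILT_INS = {"string": str, "int": int}

def conforms(obj, type_name, seen=None):
    """Structurally check `obj` against a built-in or derived type."""
    if type_name in BUILT_INS:
        return isinstance(obj, BUILT_INS[type_name])
    seen = set() if seen is None else seen
    if id(obj) in seen:  # already being checked (e.g. a cyclic "knows")
        return True
    seen.add(id(obj))
    fields = TYPES[type_name]
    if not isinstance(obj, dict) or set(obj) != set(fields):
        return False
    for name, field_type in fields.items():
        if field_type.endswith("*"):
            items = obj[name]
            if not isinstance(items, (list, tuple)):
                return False
            if not all(conforms(i, field_type[:-1], seen) for i in items):
                return False
        elif not conforms(obj[name], field_type, seen):
            return False
    return True

josh = {"name": "josh", "age": 32, "knows": []}
marko = {"name": "marko", "age": 29, "knows": [josh]}
print(conforms(marko, "person"))
```

The check only looks at the shape of the map (field names and field types), never at a declared class, which is the sense of “structural” used here.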

As you can see, there is nothing “graph” here. No vertices, no edges… just a domain model. But with mm-ADT-bc, you can create processes over that domain model and thus traverse the “graph”:

[db][values,people]      // people is defined, I just don’t show it in this email
    [has,name,eq,marko]
    [values,knows]
    [value,age]
    [sum]

There is nothing pretty about mm-ADT-bc to a human user, but that is where DSLs would come in: languages that make it easy to write mm-ADT-bc.

If Gremlin were the higher-level language, the following traversal would create the above bytecode:
	g.V().has('person','name','marko').out('knows').values('age').sum()
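
For readers who want to see the pipeline above run end to end, here is a toy Python interpreter over an in-memory domain model. It is only a sketch under stated assumptions: the `run` function, the tuple encoding of instructions, the sample people, and the reading of `values` (flatten a multi-valued field) versus `value` (project a single field) are my own guesses for illustration, not the mm-ADT specification.

```python
# Hypothetical in-memory "db"; the people collection is assumed, as in the email.
josh = {"name": "josh", "age": 32, "knows": []}
vadas = {"name": "vadas", "age": 27, "knows": []}
marko = {"name": "marko", "age": 29, "knows": [vadas, josh]}
db = {"people": [marko, vadas, josh]}

def run(bytecode, start):
    """Interpret a list of instructions over a stream of objects."""
    stream = [start]
    for op, *args in bytecode:
        if op == "values":      # flatten a multi-valued field into the stream
            stream = [v for o in stream for v in o[args[0]]]
        elif op == "value":     # project a single-valued field
            stream = [o[args[0]] for o in stream]
        elif op == "has":       # only the "eq" relation is sketched here
            key, rel, expected = args
            if rel != "eq":
                raise ValueError(f"unsupported relation: {rel}")
            stream = [o for o in stream if o.get(key) == expected]
        elif op == "sum":
            stream = [sum(stream)]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stream

# [db][values,people][has,name,eq,marko][values,knows][value,age][sum]
bytecode = [
    ("values", "people"),
    ("has", "name", "eq", "marko"),
    ("values", "knows"),
    ("value", "age"),
    ("sum",),
]
print(run(bytecode, db))
```

With the made-up data, marko knows vadas (27) and josh (32), so the final stream holds the single value 59.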
    
How do you see this being used from your perspective?

Marko.

http://rredux.com




Re: mm-ADT to TinkerPop3

Posted by Stephen Mallette <sp...@gmail.com>.

I'm in favor of trying to bring concepts that will be used in TP4 back into
TP3 to align them better. It will make migration to TP4 much easier for
providers and users when that time eventually comes. It also creates a
chance to learn a bit about what this implementation takes, which can only
help make TP4 better. As for the items you outlined for the TP3 future:

>         1. Type system: mm-ADT has a nominal type system for the built-in
> types and a structural type system for all derived types. Bytecode
> instructions that CRUD on database data can be statically typed and
> reasoned about at compile time.
>

This one seems like a natural choice. We already have it in mind to better
tighten up the type system, so if that can be done with mm-ADT in mind,
then so much the better.


>         2. Strategies: mm-ADT has a completely different approach to query
> optimization than TP3. While there are compile-time strategies for
> manipulating a query into a semantically equivalent, though computationally
> more efficient form, the concept of “provider strategies” (indices) goes
> out the window in favor of reference graphs. The primary benefit of the
> mm-ADT model is that the implementation for providers will be much simpler,
> less error prone, doesn’t require custom instructions, and is able to
> naturally capitalize on other internal provider optimizations such as
> schemas, denormalizations, views, etc.
>

There are some really complex/robust TraversalStrategy implementations out
there. It will be interesting to see how they simplify and what gaps might
appear in trying to migrate to this style.


>         3. Instruction Set: mm-ADT’s instruction set is less ad hoc than
> TP3. Relational operators are polymorphic. Math operators are polymorphic.
> Container (collection) operators are polymorphic. Unlike TP3, a “vertex” is
> just a map like any other map. Thus, has(), value(), where(), select(),
> etc. operate across all such derivations. Moreover, mm-ADT’s instruction
> set greatly reduces the number of ways in which an expression can be
> represented, relying primarily on reference graphs (see #2 above) as the
> means of optimization. This should help limit the degrees of freedom in the
> Gremlin language and reduce its apparent complexity to newcomers.
>

+1


>         4. References: mm-ADT introduces references (pointers) as
> first-class citizens. References form one of the primary data types in
> mm-ADT with numerous usages including:
>                 * Query planning. (providers exposing secondary data
> access paths via reference graphs -- see #2 above)
>                 * Modeling complex objects. (will not come into play given
> TP3’s central focus on the property graph data type).
>                 * Bytecode arguments. (nested bytecode are dynamic
> references and every instruction’s arguments can take references (even the
> opcode itself!)).
>                 * Remote proxies. (TP3 detached vertices are awkward and
> limiting in comparison to mm-ADT proxy references).
>                 * Schemas. (will probably not come into play, but “person”
> vertices are possible in mm-ADT. Thus, if TP3 wants to introduce graph
> schemas, mm-ADT provides the functionality).


Seems like item 4 is a grab-bag of things we could individually try to
support or not in TP3. Cool.

I like 1, 2, and 3. Each has a major benefit to providers and end-users.
One thing I wonder at the moment which I don't think has come up in
relation to mm-ADT discussion yet is DSLs. By every account, people are
either using DSLs now or as soon as they learn about them, they immediately
see the value and start to organize their code around them. So, any
thoughts yet on how DSLs work under mm-ADT (in relation to TP3 and/or
future) or is the model largely the same as what we do now?

