Posted to dev@tinkerpop.apache.org by Marko Rodriguez <ok...@gmail.com> on 2015/10/09 20:07:19 UTC

[DISCUSS] A Process Based Graph Reasoner

Hi,

I'm working with Kendall Clark (cc'd) of Stardog fame on a blog post discussing how Gremlin can traverse ontologically implied edges in the Stardog 4 graph database. This got me thinking: it would be "easy" to add reasoning capabilities to Gremlin via something like a "ReasoningStrategy extends DecorationTraversalStrategy."

I wrote up the proposal in this ticket:
	https://issues.apache.org/jira/browse/TINKERPOP3-881

Thoughts?
Marko.

http://markorodriguez.com


Re: [DISCUSS] A Process Based Graph Reasoner

Posted by Marko Rodriguez <ok...@gmail.com>.
This is pretty cool: the Stardog rules syntax.

	http://docs.stardog.com/#_stardog_rules_syntax

Sort of like user-defined steps in Gremlin2.

Marko.

http://markorodriguez.com

On Oct 10, 2015, at 11:08 AM, Marko Rodriguez <ok...@gmail.com> wrote:

> […]


Re: [DISCUSS] A Process Based Graph Reasoner

Posted by Marko Rodriguez <ok...@gmail.com>.
Hello,

Your idea of embedding schema information in the graph structure is the pattern that RDF uses, where schema and data are all one data structure. Many years ago I had a client that was using TinkerGraph to hold their home-brewed schema ("the partitioned graph") and was traversing it much like an RDF reasoner to effect CRUD operations on the graph database. Thus, each read/write to the graph database also entailed various traversals against TinkerGraph. Since the schema didn't change much, it was just a GraphML file that each client loaded when it connected to the graph database. It worked well and they liked it.

In my experience with RDF, 90% of the benefit comes from a very tiny relational algebra (what AllegroGraph calls RDFS++), as OWL is, in most situations, an overdose. The question then is: what is the most efficient way to implement a reasoner? While it is sexy to store the schema in the graph structure itself, it's not practical (in my opinion). What about storing it in a TinkerGraph parallel to the graph? Well, I would say Gremlin is expensive relative to basic Set/List operations. The proposal in the JIRA is for a schema represented as in-memory Set/List data structures. For example, given the traversal:

	g.V.out('ancestor').name

The ReasonerStrategy would do this:

if (vertexStep.getEdgeLabels().contains("ancestor")) {
  // rewrite out('ancestor') into repeat(out('ancestor')).emit()
  TraversalHelper.insertTraversalAfter(vertexStep, __.repeat(vertexStep.clone()).emit(), traversal);
  traversal.removeStep(vertexStep);
}

Rippin' fast. Can an "RDFS++" reasoner be implemented with basic Set/List operations in Java? I bet so -- and thus the ReasonerStrategy.build()…create() pattern articulated in the JIRA, which:

	1. Doesn't in any way mutate graph data in the user's graph.
	2. Isn't made inconsistent if the user leverages the graph system provider's specific APIs/query language.
	3. Can easily be used or not used (it's simply a Strategy) and thus is not global to all operations on the graph.
	4. Can (I believe) provide typical *useful* SemanticWeb-community semantics in the PropertyGraph domain.
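To make that bet concrete, here is a minimal sketch in plain Java (class and method names are hypothetical, not part of any TinkerPop API) of resolving a transitive edge label with nothing but basic Set/List operations:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class TransitiveReasoner {
    // adjacency: vertex -> vertices reachable over one 'ancestor' edge
    static Set<String> reachable(Map<String, Set<String>> adj, String start) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> stack = new ArrayDeque<>(adj.getOrDefault(start, Set.of()));
        while (!stack.isEmpty()) {
            String v = stack.pop();
            if (seen.add(v))                       // expand each vertex once
                stack.addAll(adj.getOrDefault(v, Set.of()));
        }
        return seen;                               // the transitive closure from start
    }

    public static void main(String[] args) {
        Map<String, Set<String>> ancestor = Map.of("a", Set.of("b"), "b", Set.of("c"));
        // the implied edge a->c is materialized at query time, never written to the graph
        System.out.println(reachable(ancestor, "a")); // prints [b, c]
    }
}
```

This is the same effect that the repeat()/emit() rewrite produces inside a traversal; the point is only that the closure itself needs nothing richer than a Set and a stack.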

So, in conclusion: I concur, schema stuff is great and I would like to see how your tp3-contrib/ repository shakes out. Please do share links when you have them. Does the ticket about a "process reasoner" require a schema? No. The two concepts are related, but one is not foundational to the other. The process reasoner discussed only requires that property keys and edge labels exist, and by the Graph interface as it stands, we are sure that providers support this.

Thank you for your thoughts,
Marko.

http://markorodriguez.com

On Oct 9, 2015, at 4:18 PM, pieter-gmail <pi...@gmail.com> wrote:

> […]


Re: [DISCUSS] A Process Based Graph Reasoner

Posted by pieter-gmail <pi...@gmail.com>.
Oy, so much to say,

Ontology is the "study of the nature of being" (of the graph).

The traditional notion of schema is a subset of that rather infinite
understanding (Ontology), and I'd say for many it is the starting point of
any understanding.

I would surmise that a reasoning ontology would have to have some
knowledge of the nature (meta) of the graph. This would include which
labels are associated with which, plus multiplicity, uniqueness, order,
ownership, constraints... It might be easy, as you say, but it is
ubiquitous and the structural foundation of any ontology.

The problems you mention regarding different providers are something that,
with time, success and confidence, might become less of an issue.

I am of the opinion that much of the above-mentioned ontological stuff
is a mostly abstract concern for tp3: just an interface specification.
Specifying uniqueness or whatever is an ontological concern; indexes,
however, are an implementation concern of the provider. BTW, the same goes
for full-text search. Lucene or whatever technology's
features/limitations should not be the primary concern of tp3. Within
reason, of course; there is no point in specifying that which no one can implement.

In some ways tp3 (or I) is confused about tp3 being an implementation
versus a specification. This concerns me a lot when I need to optimize
tp3 steps: the more I optimize, the less tp3 code executes. Don't get me
wrong, however; without the default implementation I would never even
have started.

Another concern I have regarding all this is tp's agnosticism with
respect to typing. An ontology would surely need some knowledge
about the types it supports and reasons over.

My own idea for implementing a schema model for tp3 is far more
simplistic to start off with. I am toying with the idea of making it a
sort of tp3-contrib lib. That way, for any graph implementation, an
application higher up the stack will be able to access tp3 semantic
schema information in an implementation-agnostic manner.

The basic idea is to have a special partitioned graph with limited
schema information. The default implementation stays with current tp3
semantics except for capturing the Java type of any property. Basically
(I have not really thought about the details yet), a graph of
vertexLabel->edgeLabel->vertexLabel with their respective properties and
types.
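For what it's worth, a minimal sketch of what such a lazily captured schema graph might look like in plain Java (all names are hypothetical; as said, the details have not been thought through yet):

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class SchemaGraph {
    // vertexLabel -> (property key -> captured Java type)
    final Map<String, Map<String, Class<?>>> vertexProperties = new HashMap<>();
    // outVertexLabel -> (edgeLabel -> set of inVertexLabels)
    final Map<String, Map<String, Set<String>>> edges = new HashMap<>();

    // called lazily whenever data is written, so nothing is specified upfront
    void addVertexProperty(String vertexLabel, String key, Class<?> type) {
        vertexProperties.computeIfAbsent(vertexLabel, l -> new HashMap<>()).put(key, type);
    }

    void addEdgeLabel(String outLabel, String edgeLabel, String inLabel) {
        edges.computeIfAbsent(outLabel, l -> new HashMap<>())
             .computeIfAbsent(edgeLabel, l -> new HashSet<>()).add(inLabel);
    }

    public static void main(String[] args) {
        SchemaGraph schema = new SchemaGraph();
        schema.addVertexProperty("person", "name", String.class);
        schema.addEdgeLabel("person", "knows", "person");
        System.out.println(schema.edges.get("person").get("knows")); // prints [person]
    }
}
```

Providers would hook their own write paths into the two add methods; an application reading the schema never touches provider-specific APIs.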

Providers can then add custom features like adding 'in'/'out' properties
to add multiplicity, order, constraints, transitiveness and so on. The
stricter tp3 becomes in specifying the ontological nature of graphs,
the richer the standard partitioned schema graph will become. However, it
will always remain lazy and schemaless; there is no need to specify anything upfront.

In your 'process reasoner' you explicitly specify the features of an
edge; as far as I can see, this is no different from what you would have
to do with a 'structured reasoner'. In a default 'structured reasoner'
there is nothing to specify, unless you wish to say that some label is
'transitive' or whatever. The time/space cost to capture the schema and to
start up on an existing graph is in general minimal, as the schema is
very small compared to the actual data. Somewhere in the ether I have
heard that SAP has something like 50000 tables: a lot to understand, but
not much to load in space and time. The schema partition should also be
optional, probably even off by default.

To give you some indication of my own implementation issue with Sqlg:
Sqlg now supports the Java 8 java.time types
LocalDateTime/LocalDate/LocalTime/Duration/Period.
Durations, Periods and Integers are all stored as integers in the RDBMS;
however, without some schema information there is no way to know whether
some integer field represents a Duration, a Period or just an Integer.
vertex.value("duration") should return a java.time.Duration, but alas,
without additional schema support there is no way to know what the type
of the field is.
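To illustrate, a sketch in plain Java, assuming (purely hypothetically) that a Duration is stored as whole seconds and a Period as whole days; the point is only that the raw integer is ambiguous without the recorded type:

```java
import java.time.Duration;
import java.time.Period;
import java.util.Map;

public class TypedDecode {
    // decode a raw integer column using recorded schema type info;
    // the value 90 alone cannot tell us which type it represents
    static Object decode(long raw, Class<?> type) {
        if (type == Duration.class) return Duration.ofSeconds(raw);
        if (type == Period.class) return Period.ofDays((int) raw);
        return raw; // plain integer property
    }

    public static void main(String[] args) {
        Map<String, Class<?>> schema = Map.of("duration", Duration.class, "age", Long.class);
        System.out.println(decode(90L, schema.get("duration"))); // prints PT1M30S
        System.out.println(decode(90L, schema.get("age")));      // prints 90
    }
}
```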

If tp3 decides to have an opinion regarding typing, I'd say Java
primitives, arrays of primitives and java.time.* should be standard
without much discussion.

Thanks
Pieter



On 09/10/2015 21:35, Marko Rodriguez wrote:
> […]


Re: [DISCUSS] A Process Based Graph Reasoner

Posted by Marko Rodriguez <ok...@gmail.com>.
Hello,

So this ticket is more about a reasoning ontology than it is about a data validation/verification/constraint schema.

The former is "easy" to do, as it's a query-time model. The latter is more difficult, as we would have to expose some sort of Schema interface for graph system providers to expose schema constraints. Furthermore, each provider tends to do things differently (much like indices, and thus TinkerPop is agnostic to the concept of an index). For instance, Titan has a pretty rich schema model while Neo4j (I believe) only supports things like a UNIQUE constraint on a property (e.g. name).

You could argue that a Schema system could be developed at the TraversalStrategy level, but then it starts to get hairy when people use "Blueprints" to write to the graph or the native interfaces of the underlying provider (e.g. using Cypher to write data). Now TinkerPop will think the data is in one format, but it's in another…

Can you say more as to how you see a validation/verification/constraint model being specified/implemented in a provider-agnostic way for TinkerPop3?

Thanks,
Marko.

http://markorodriguez.com

On Oct 9, 2015, at 1:28 PM, pieter-gmail <pi...@gmail.com> wrote:

> […]


Re: [DISCUSS] A Process Based Graph Reasoner

Posted by pieter-gmail <pi...@gmail.com>.
Hi,

Perhaps I am missing exactly what you are saying, but it seems to me
Gremlin might become schema-aware.

This is something I consider crucial to understanding any data set.
Perhaps it's from my background, but I generally fail to see how the
NoSQL/NoSchema/Document crowd understand their data by looking at rows
or documents or vertices without a picture of the schema.

The schema may be lazily created, but nonetheless all systems, I'd say,
have an implicit schema which, imho, should be the starting point of any
analysis.

This is true even if it's some random key put into a Redis instance.

While I am on the topic, even the tp3 modern graph, trivial as it may
be, would be easier for me to 'get' if it were illustrated with a schema
diagram before the graph itself was illustrated.

Cheers
Pieter

On 09/10/2015 20:07, Marko Rodriguez wrote:
> […]