Posted to dev@tinkerpop.apache.org by "Marko A. Rodriguez (JIRA)" <ji...@apache.org> on 2015/08/27 16:55:46 UTC

[jira] [Updated] (TINKERPOP3-571) [Proposal] Provide a way to process arbitrary objects with GraphComputer

     [ https://issues.apache.org/jira/browse/TINKERPOP3-571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Marko A. Rodriguez updated TINKERPOP3-571:
------------------------------------------
    Description: 
I want to be able to do this in OLAP:

{code}
__(12,21,3,4,75).is(gt,10).sum()
{code} 

An idea is this... For any {{Iterator<Object>}}, we need to be able to generate a Graph where:

{code}
g.addVertex(id,12)
g.addVertex(id,21)
g.addVertex(id,3)
g.addVertex(id,4)
g.addVertex(id,75)
{code}

...and then the start of the OLAP traversal has a "hidden prefix" of {{g.V.id}}. Thus, behind the scenes, what is being executed is:

{code}
g.V.id.is(gt,10).sum()
{code}
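
To make the intended semantics concrete, here is a rough sketch in plain Java. It does not use TinkerPop at all: {{ObjectOlapSketch}}, {{ObjectVertex}}, {{toGraph}} and {{sumGreaterThan}} are hypothetical names made up for illustration, and a real implementation would go through GraphComputer rather than a local stream. It only shows the shape of the idea: wrap each object as an edgeless vertex whose id is the object, then let the "hidden prefix" recover the objects before filtering and reducing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch; none of these names are TinkerPop APIs.
class ObjectOlapSketch {

    // Stand-in for a vertex whose id is the wrapped object.
    static final class ObjectVertex {
        final Object id;
        ObjectVertex(Object id) { this.id = id; }
    }

    // Mirrors the g.addVertex(id, x) calls: one edgeless vertex per object.
    static List<ObjectVertex> toGraph(Iterator<?> objects) {
        List<ObjectVertex> vertices = new ArrayList<>();
        while (objects.hasNext()) {
            vertices.add(new ObjectVertex(objects.next()));
        }
        return vertices;
    }

    // Mirrors g.V.id.is(gt,10).sum() for numeric objects.
    static long sumGreaterThan(List<ObjectVertex> graph, long threshold) {
        return graph.stream()
                .map(v -> v.id)                              // hidden prefix: g.V.id
                .mapToLong(id -> ((Number) id).longValue())  // assumes every object is a Number
                .filter(n -> n > threshold)                  // is(gt, threshold)
                .sum();                                      // sum()
    }

    public static void main(String[] args) {
        List<ObjectVertex> graph = toGraph(Arrays.asList(12, 21, 3, 4, 75).iterator());
        System.out.println(sumGreaterThan(graph, 10)); // 12 + 21 + 75 = 108
    }
}
```

The user never sees the vertices; they only exist so that the existing vertex-centric machinery has something to iterate over.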

If we can make OLAP object processing just as natural as graph processing, then I don't see why Gremlin is not the craziest functional language to date:

  * single machine or distributed
  * supports non-terminal sideEffects (i.e. {{groupCount()}} is not the "end" (sideEffect is key S-x->S))
  * has natural branching constructs -- not just serial stream processing.
  * supports an execution structure that is a graph, not a DAG (e.g. {{repeat()}}, {{back()}}, ...)
      * fundamentally an arbitrary execution graph with {{jump()}} (low-level, but there)
      * this provides Turing completeness, as we have a non-serial "program counter" (the step ID is analogous to an instruction's location in RAM, and a traverser can jump to any step ID it wants). Moreover, we have two random access memories ({{sack}} (local) and {{sideEffect}} (global)).
  * can be easily expressed in any host language, as it's represented using a fundamental concept in all programming languages -- fluent function chaining (monoid state modulation)
    * for JVM languages, simply {{import Traversal}} and use the syntax of the importing language (e.g. Jython, Rhino, Groovy, Scala, Clojure, etc.)
  * can be used as a database query language (and/)or an arbitrary data flow language (given this proposal).
  * has numerous execution engines (Giraph,MapReduce,Spark,Fulgora,TinkerGraph,...) with different time/space complexities.
  * has remote execution functionality via GremlinServer (with monitoring).
  * ...and, native support for the most complex data structure there is, the graph (e.g. {{outE}}, {{in}}, etc.)
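
The "program counter" point above can be illustrated with a toy interpreter in plain Java. This is only a sketch under stated assumptions: {{StepJumpSketch}}, {{Traverser}}, {{run}} and {{doubleUntil}} are hypothetical names, not TinkerPop classes, and only the local memory ({{sack}}) is modeled, not the global {{sideEffect}}. Each step returns the ID of the step the traverser should visit next, so a backward jump gives a loop and the execution structure is a graph rather than a DAG.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical sketch; Traverser and the step list are illustrative, not TinkerPop classes.
class StepJumpSketch {

    static final int HALT = -1;

    // A traverser carries a local value (akin to a sack) and the id of its current step.
    static final class Traverser {
        long sack;
        int stepId = 0;
        Traverser(long sack) { this.sack = sack; }
    }

    // Runs the traverser until a step returns HALT; each step returns the next step id.
    static long run(List<ToIntFunction<Traverser>> steps, Traverser t) {
        while (t.stepId != HALT) {
            t.stepId = steps.get(t.stepId).applyAsInt(t);
        }
        return t.sack;
    }

    // Doubles the sack until it reaches at least `limit`, looping via a backward jump.
    static long doubleUntil(long start, long limit) {
        if (start <= 0) {
            throw new IllegalArgumentException("start must be positive, or the loop never ends");
        }
        List<ToIntFunction<Traverser>> steps = new ArrayList<>();
        steps.add(t -> t.sack >= limit ? 2 : 1);    // step 0: branch on the sack
        steps.add(t -> { t.sack *= 2; return 0; }); // step 1: update the sack, jump back to step 0
        steps.add(t -> HALT);                       // step 2: done
        return run(steps, new Traverser(start));
    }

    public static void main(String[] args) {
        System.out.println(doubleUntil(3, 20)); // 3 -> 6 -> 12 -> 24
    }
}
```

The step list plays the role of instruction memory and {{stepId}} the role of the program counter; nothing forces the traverser to move forward through the steps.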
 
It's mind-boggling, actually... I can't think of anything else like this. 

[~mbroecheler] @dkuppitz [~spmallette] @joshsh

  was:
I want to be able to do this in OLAP:

{code:groovy}
__(12,21,3,4,75).is(gt,10).sum()
{code} 

An idea is this... For any {{Iterator<Object>}}, we need to be able to generate a Graph where:

{code:groovy}
g.addVertex(id,12)
g.addVertex(id,21)
g.addVertex(id,3)
g.addVertex(id,4)
g.addVertex(id,75)
{code}

...and then the start of the OLAP traversal has a "hidden prefix" of {{g.V.id}}. Thus, behind the scenes, what is being executed is:

{code:groovy}
g.V.id.is(gt,10).sum()
{code}

If we can make OLAP object processing just as natural as graph processing, then I don't see why Gremlin is not the craziest functional language to date:

  * single machine or distributed
  * supports non-terminal sideEffects (i.e. {{groupCount()}} is not the "end" (sideEffect is key S-x->S))
  * has natural branching constructs -- not just serial stream processing.
  * supports an execution structure that is a graph, not a DAG (e.g. {{repeat()}}, {{back()}}, ...)
      * fundamentally an arbitrary execution graph with {{jump()}} (low-level, but there)
      * this provides Turing completeness, as we have a non-serial "program counter" (the step ID is analogous to an instruction's location in RAM, and a traverser can jump to any step ID it wants). Moreover, we have two random access memories ({{sack}} (local) and {{sideEffect}} (global)).
  * can be easily expressed in any host language, as it's represented using a fundamental concept in all programming languages -- fluent function chaining (monoid state modulation)
    * for JVM languages, simply {{import Traversal}} and use the syntax of the importing language (e.g. Jython, Rhino, Groovy, Scala, Clojure, etc.)
  * can be used as a database query language (and/)or an arbitrary data flow language (given this proposal).
  * has numerous execution engines (Giraph,MapReduce,Spark,Fulgora,TinkerGraph,...) with different time/space complexities.
  * has remote execution functionality via GremlinServer (with monitoring).
  * ...and, native support for the most complex data structure there is, the graph (e.g. {{outE}}, {{in}}, etc.)
 
It's mind-boggling, actually... I can't think of anything else like this. 

[~mbroecheler] @dkuppitz [~spmallette] @joshsh


> [Proposal] Provide a way to process arbitrary objects with GraphComputer
> ------------------------------------------------------------------------
>
>                 Key: TINKERPOP3-571
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP3-571
>             Project: TinkerPop 3
>          Issue Type: Improvement
>          Components: process
>    Affects Versions: 3.0.0-incubating
>            Reporter: Marko A. Rodriguez
>            Assignee: Marko A. Rodriguez



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)