Posted to commits@tinkerpop.apache.org by sp...@apache.org on 2015/11/09 14:01:28 UTC

[5/8] incubator-tinkerpop git commit: Added "the next ten minutes" to the tutorial.

Added "the next ten minutes" to the tutorial.

Dropped the idea of a "tutorial" book and made the "getting started" standalone as an "article".


Project: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/commit/f7b070de
Tree: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/tree/f7b070de
Diff: http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/diff/f7b070de

Branch: refs/heads/master
Commit: f7b070dec1915a8fc9400f34e02369d026e14528
Parents: dde1dca
Author: Stephen Mallette <sp...@genoprime.com>
Authored: Thu Nov 5 16:30:07 2015 -0500
Committer: Stephen Mallette <sp...@genoprime.com>
Committed: Thu Nov 5 16:30:07 2015 -0500

----------------------------------------------------------------------
 docs/src/tutorials-getting-started.asciidoc     | 284 +++++++++++++++++--
 docs/src/tutorials.asciidoc                     |  27 --
 .../images/modern-edge-1-to-3-1-gremlin.png     | Bin 0 -> 11607 bytes
 .../images/modern-edge-1-to-3-2-gremlin.png     | Bin 0 -> 15248 bytes
 .../images/modern-edge-1-to-3-3-gremlin.png     | Bin 0 -> 18565 bytes
 pom.xml                                         |   7 +-
 6 files changed, 266 insertions(+), 52 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/docs/src/tutorials-getting-started.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/tutorials-getting-started.asciidoc b/docs/src/tutorials-getting-started.asciidoc
index 6cd99a5..b95a59f 100644
--- a/docs/src/tutorials-getting-started.asciidoc
+++ b/docs/src/tutorials-getting-started.asciidoc
@@ -14,22 +14,28 @@ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 See the License for the specific language governing permissions and
 limitations under the License.
 ////
+
 Getting Started
 ===============
 
-Apache TinkerPop is an open source Graph Computing Framework.  Within itself, TinkerPop represents a large collection
-of capabilities and technologies and in its wider ecosystem an additionally extended world of
-link:http://tinkerpop.incubator.apache.org/#graph-systems[third-party contributed] graph libraries and systems.
-TinkerPop's ecosystem can appear complex to newcomers of all experience, especially when glancing at the
+link:http://tinkerpop.com[Apache TinkerPop] is an open source Graph Computing Framework.  Within itself, TinkerPop
+represents a large collection of capabilities and technologies and in its wider ecosystem an additionally extended
+world of link:http://tinkerpop.incubator.apache.org/#graph-systems[third-party contributed] graph libraries and
+systems. TinkerPop's ecosystem can appear complex to newcomers of all experience, especially when glancing at the
 link:http://tinkerpop.incubator.apache.org/docs/x.y.z/index.html[reference documentation] for the first time.
 
-So, where do you get started with TinkerPop?
+So, where do you get started with TinkerPop? How do you dive in quickly and get productive?  Well - Gremlin, the
+most recognizable citizen of The TinkerPop, is here to help with this thirty-minute tutorial.  That's right - in just
+thirty short minutes, you too can be fit to start building graph applications with TinkerPop.  Welcome to _The Gremlin
+Workout - by Gremlin_!
+
+image::gremlin-gym.png[]
 
-In Five Minutes
----------------
+The First Five Minutes
+----------------------
 
-It is quite possible to learn a lot in just five minutes with TinkerPop, but before doing so, introductions are in
-order.  Meet Gremlin, the most recognizable citizen of The TinkerPop!
+It is quite possible to learn a lot in just five minutes with TinkerPop, but before doing so, a proper introduction of
+your trainer is in order.  Meet Gremlin!
 
 image:gremlin-standing.png[width=125,align=center]
 
@@ -77,7 +83,7 @@ trying out, working with a static graph that doesn't change much, unit tests and
 can fit in memory.
 
 TIP: Resist the temptation to "get started" with more complex databases like link:http://thinkaurelius.github.io/titan/[Titan]
-or worrying how to get link:http://tinkerpop.incubator.apache.org/docs/x.y.zg/#gremlin-server[Gremlin Server]
+or to delve into how to get link:http://tinkerpop.incubator.apache.org/docs/x.y.zg/#gremlin-server[Gremlin Server]
 working properly.  Focusing on the basics builds a good foundation for all the other things TinkerPop offers.
 
 To make your process even easier, start with one of TinkerPop's toy graphs.  These are "small" graphs designed to
@@ -131,34 +137,272 @@ some traversals and hopefully learned something about TinkerPop in general.  You
 what there is to know, but those accomplishments will help enable understanding of the more detailed tutorials to
 come.
 
-In Ten More Minutes
--------------------
+The Next Ten Minutes
+--------------------
 
 In the first five minutes of getting started with TinkerPop, you learned some basics for telling Gremlin how to
 traverse a graph.  Of course, there wasn't much discussion about what a graph is.  A graph is a collection of
-vertices (i.e. nodes, dots, etc.) and edges (i.e. relationships, lines, etc.), where a vertex is an entity which
+vertices (i.e. nodes, dots) and edges (i.e. relationships, lines), where a vertex is an entity which
 represents some domain object (e.g. a person, a place, etc.) and an edge represents the relationship between two
 vertices.
 
-image:modern-edge-1-to-3-1.png[width=300,align=center]
+image:modern-edge-1-to-3-1.png[width=300]
 
 The above example shows a graph with two vertices, one with a unique identifier of "1" and another with a unique
 identifier of "3".  There is an edge connecting the two with a unique identifier of "9". It is important to consider
-that the edge has a direction which goes out from vertex "1" and in to vertex "3'.
+that the edge has a direction which goes _out_ from vertex "1" and _in_ to vertex "3".
 
-IMPORTANT: Most TinkerPop implementations do not allow for identifier assignment.  They will rather assign identifiers
-and ignore assigned identifiers you attempt to assign to them.
+IMPORTANT: Most TinkerPop implementations do not allow for identifier assignment.  Instead, they assign
+their own identifiers and ignore any identifiers that you attempt to supply.
 
 A graph with elements that just have identifiers does not make for much of a database.  To give some meaning to
 this basic structure, vertices and edges can each be given labels to categorize them.
 
-image:modern-edge-1-to-3-2.png[width=300,align=center]
+image:modern-edge-1-to-3-2.png[width=300]
 
 You can now see that vertex "1" is a "person" vertex and vertex "3" is a "software" vertex.  They are joined by a "created"
 edge which allows you to see that a "person created software".  The "label" and the "id" are reserved attributes of
 vertices and edges, but you can add your own arbitrary properties as well:
 
-image:modern-edge-1-to-3-3.png[width=300,align=center]
+image:modern-edge-1-to-3-3.png[width=325]
+
+This model is referred to as a _property graph_ and it provides a flexible and intuitive way in which to model your
+data.
+
+Creating a Graph
+^^^^^^^^^^^^^^^^
+
+As intuitive as it is to you, it is perhaps more intuitive to Gremlin himself, as vertices, edges and properties make
+up the very elements of his existence. It is indeed helpful to think of our friend, Gremlin, moving about a graph when
+developing traversals, as picturing his position as the link:http://tinkerpop.incubator.apache.org/docs/3.0.2-incubating/#_the_traverser[traverser]
+helps orient where you need him to go next.  Let's use the two-vertex, one-edge graph we've been discussing above
+as an example.  First, you need to create this graph:
+
+[gremlin-groovy]
+----
+graph = TinkerGraph.open()
+g = graph.traversal()
+v1 = g.addV(id, 1, label, "person", "name", "marko", "age", 29).next()
+v2 = g.addV(id, 3, label, "software", "name", "lop", "lang", "java").next()
+v1.addEdge("created", v2, id, 9, "weight", 0.4)
+----
+
+There are a number of important things to consider in the above code.  First, why didn't we use `graph.addVertex()`?
+Couldn't that just as easily have performed the same function?  Yes - it could have.  However, TinkerPop encourages
+end-users to utilize the methods of the `TraversalSource` rather than `Graph`.  The `Graph` methods are considered
+"low-level" and for use by providers developing TinkerPop implementations.  In addition, using `Graph` methods bypasses
+features you may find important as you learn more about TinkerPop, such as
+link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#traversalstrategy[traversal strategies].
+
+Second, recall that `id` and `label` are "reserved" for special usage in TinkerPop.  Those "keys" supplied to the
+creation method are statically imported to the console.  You would normally refer to them as `T.id` and `T.label`.
+
+NOTE: The fully qualified name for `T` is `org.apache.tinkerpop.gremlin.structure.T`.
+
+Third, don't forget that you are working with TinkerGraph and so identifier assignment is allowed.  That is _not_ the
+case with most graph databases (don't bother to try with Neo4j).
+
+Finally, the label for an `Edge` is required and is thus part of the method signature of `addEdge()`.  It is the first
+parameter supplied, followed by the `Vertex` to which `v1` should be connected.  Therefore, this usage of `addEdge` is
+creating an edge that goes _out_ of `v1` and into `v2` with a label of "created".
+
+Graph Traversal - Staying Simple
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Now that Gremlin knows where the graph data is, you can ask him to get you some data from it by doing a traversal,
+which you can think of as executing some link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#_the_graph_process[process]
+over the structure of the graph. We can form our question in English and then translate it to Gremlin. For this
+initial example, let's ask Gremlin: "What software has Marko created?"
+
+To answer this question, we would want Gremlin to:
+
+. Find "marko" in the graph
+. Walk along the "created" edges to "software" vertices
+. Select the "name" property of the "software" vertices
+
+The English-based steps above largely translate to Gremlin's position in the graph and to the steps we need to take
+to ask him to answer our question. By stringing these steps together, we form a `Traversal`, or the sequence of programmatic
+link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#graph-traversal-steps[steps] Gremlin needs to perform
+in order to get you an answer.
+
+Let's start with finding "marko".  This operation is a filtering step as it searches the full set of vertices to match
+those that have the "name" property value of "marko". This can be done with the
+link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#has-step[has()] step as follows:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko')
+----
+
+We can picture this traversal in our little graph with Gremlin sitting on vertex "1".
+
+image:modern-edge-1-to-3-1-gremlin.png[width=325]
+
+When Gremlin is on a vertex or an edge, he has access to all the properties that are available to that element.
+
+IMPORTANT: The above query iterates all the vertices in the graph to get its answer. That's fine for our little example,
+but for multi-million or billion edge graphs that is a big problem. To solve this problem, you should look to use
+indices.  TinkerPop does not provide an abstraction for index management.  You should consult the documentation of the
+graph you have chosen and utilize its native API to create indices that will speed up these types of lookups. Your
+traversals will remain unchanged, however, as the indices will be used transparently at execution time.
+
+Now that Gremlin has found "marko", he can consider the next step in the traversal where we ask him to "walk"
+along "created" edges to "software" vertices. As described earlier, edges have direction, so we have to tell Gremlin
+what direction to follow.  In this case, we want him to traverse on outgoing edges from the "marko" vertex.  For this,
+we use the link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#vertex-steps[outE] step.
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').outE('created')
+----
+
+At this point, you can picture Gremlin moving from the "marko" vertex to the "created" edge.
+
+image:modern-edge-1-to-3-2-gremlin.png[width=325]
+
+To get to the vertex on the other end of the edge, you need to tell Gremlin to move from the edge to the incoming
+vertex with `inV()`.
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').outE('created').inV()
+----
+
+You can now picture Gremlin on the "software" vertex as follows:
+
+image:modern-edge-1-to-3-3-gremlin.png[width=325]
+
+As you are not asking Gremlin to do anything with the properties of the "created" edge, you can simplify the
+statement above with:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').out('created')
+----
+
+Finally, now that Gremlin has reached the "software that Marko created", he has access to the properties of the
+"software" vertex and you can therefore ask Gremlin to extract the "name" property as follows:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').out('created').values('name')
+----
+
+You should now be able to see the connection Gremlin has to the structure of the graph and how Gremlin maneuvers from
+vertices to edges and so on.  Your ability to string together steps to ask Gremlin to do more complex things depends
+on your understanding of these basic concepts.
+
+Graph Traversal - Increasing Complexity
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Armed with the knowledge from the previous section, let's ask Gremlin to do some more complex things. There's not much
+more that can be done with the "baby" graph we had, so let's return to the "modern" toy graph from "The First Five
+Minutes" section.  Recall that you can create this `Graph` and establish a `TraversalSource` with:
+
+[gremlin-groovy]
+----
+graph = TinkerFactory.createModern()
+g = graph.traversal()
+----
+
+Earlier we'd used the `has()` step to tell Gremlin how to find the "marko" vertex. Let's look at some other ways to
+use `has()`.  What if we wanted Gremlin to find the "age" values of both "vadas" and "marko"?  In this case we could
+use the `within` comparator with `has()` as follows:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name',within('vadas','marko')).values('age')
+----
+
+It is worth noting that `within` is a method of `P` that is statically imported to the Gremlin Console (much like
+`T` is, as described earlier).
+
+NOTE: The fully qualified name for `P` is `org.apache.tinkerpop.gremlin.process.traversal.P`.
+
+If we wanted to ask Gremlin the average age of "vadas" and "marko" we could use the
+link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#mean-step[mean()] step as follows:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name',within('vadas','marko')).values('age').mean()
+----
+
+Another method of filtering is seen in the use of the link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#where-step[where]
+step.  We know how to find the "software" that "marko" created:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').out('created')
+----
+
+Let's build on that to try to learn who "marko" collaborates with. To do that, we should first picture Gremlin
+standing on the "software" vertex.  To find out who "created" that "software" we need to have Gremlin traverse back
+_in_ along the "created" edges to find the "person" vertices tied to it.
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').out('created').in('created').values('name')
+----
+
+So that's nice, we can see that "peter", "josh" and "marko" are all responsible for creating "lop".  Of course, we already
+know about the involvement of "marko" and it seems strange to say that "marko" collaborates with himself, so excluding
+"marko" from the results seems logical.  The following traversal handles that exclusion:
+
+[gremlin-groovy,modern]
+----
+g.V().has('name','marko').as('exclude').out('created').in('created').where(neq('exclude')).values('name')
+----
+
+We made two additions to the traversal to make it exclude "marko" from the results.  First, we added the
+link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#as-step[as()] step.  The `as()` step is not really a "step",
+but a "step modulator" - something that adds features to a step or the traversal.  Here, the `as('exclude')` labels
+the `has()` step with "exclude" and all values that pass through that step are held in that "label" for later use.  In
+this case, the "marko" vertex is the only vertex to pass through that point, so it is held in "exclude".
+
+The other addition that was made was the `where()` step which is a filter step like `has()`.  The `where()` is
+positioned after the `in()` step that has "person" vertices, which means that the `where()` filter is occurring
+on the list of "marko" collaborators.  The `where()` specifies that the "person" vertices passing through it should
+not equal (i.e. `neq()`) the contents of the "exclude" label.  As it just contains the "marko" vertex, the `where()`
+filters out the "marko" that we get when we traverse back _in_ on the "created" edges.
+
+You will find many uses of `as()`.  Here it is in combination with link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#select-step[select]:
+
+[gremlin-groovy,modern]
+----
+g.V().as('a').out().as('b').out().as('c').select('a','b','c')
+----
+
+In the above example, we tell Gremlin to iterate through all vertices and traverse _out_ twice from each.  Gremlin
+will label each vertex in that path with "a", "b" and "c", respectively.  We can then use `select` to extract the
+contents of those labels.
+
+Another common but important step is the link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#group-step[group()]
+step and its related step modulator called link:http://tinkerpop.incubator.apache.org/docs/x.y.z/#by-step[by()]. If
+we wanted to ask Gremlin to group all the vertices in the graph by their vertex label we could do:
+
+[gremlin-groovy,modern]
+----
+g.V().group().by(label)
+----
+
+The use of `by()` here provides the mechanism by which to do the grouping.  In this case, we've asked Gremlin to
+use the `label` (which, again, is an automatic static import from `T` in the console). We can't really tell much
+about our distribution though because we just have vertex unique identifiers as output.  To make that nicer we
+could ask Gremlin to get us the value of the "name" property from those vertices, by supplying another `by()`
+modulator to `group()` to transform the values.
+
+[gremlin-groovy,modern]
+----
+g.V().group().by(label).by('name')
+----
+
+In this section, you have learned a bit more about what property graphs are and how Gremlin interacts with them.
+You also learned how to envision Gremlin moving about a graph and how to use some of the more complex but commonly
+utilized traversal steps. You are now ready to think about TinkerPop in terms of its wider applicability to
+graph computing and application development.
+
+The Final Fifteen Minutes
+-------------------------
+
 
-This model is referred to as a property graph and it provides a flexible and intuitive way in which to model your data.
 

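Editorial sketch for the tutorial text in the diff above: the console examples rely on static imports (`label`, `within`, `neq`) that are not available automatically outside the Gremlin Console. The following is a minimal sketch, not part of the commit, of the same traversals as a standalone Groovy script with explicit `T` and `P` references. It assumes the `tinkergraph-gremlin` artifact is on the classpath; the variable names (`ages`, `collaborators`, `namesByLabel`) are illustrative only.

```groovy
// Assumption: tinkergraph-gremlin (and its gremlin-core dependency) is on the classpath.
import org.apache.tinkerpop.gremlin.process.traversal.P
import org.apache.tinkerpop.gremlin.structure.T
import org.apache.tinkerpop.gremlin.tinkergraph.structure.TinkerFactory

// Build the "modern" toy graph and a TraversalSource, as in the tutorial.
def graph = TinkerFactory.createModern()
def g = graph.traversal()

// Console form: has('name', within('vadas','marko'))
def ages = g.V().has('name', P.within('vadas', 'marko')).values('age').toList()
println "ages: ${ages}"

// Console form: where(neq('exclude'))
def collaborators = g.V().has('name', 'marko').as('exclude').
        out('created').in('created').
        where(P.neq('exclude')).values('name').toList()
println "collaborators: ${collaborators}"

// Console form: group().by(label).by('name')
def namesByLabel = g.V().group().by(T.label).by('name').next()
println "names by label: ${namesByLabel}"
```

Spelling out `P.within`, `P.neq` and `T.label` makes it visible which class each console shorthand comes from, which is the point of the two NOTE lines in the tutorial about the fully qualified names of `T` and `P`.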
http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/docs/src/tutorials.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/tutorials.asciidoc b/docs/src/tutorials.asciidoc
deleted file mode 100644
index 754b9e9..0000000
--- a/docs/src/tutorials.asciidoc
+++ /dev/null
@@ -1,27 +0,0 @@
-////
-Licensed to the Apache Software Foundation (ASF) under one or more
-contributor license agreements.  See the NOTICE file distributed with
-this work for additional information regarding copyright ownership.
-The ASF licenses this file to You under the Apache License, Version 2.0
-(the "License"); you may not use this file except in compliance with
-the License.  You may obtain a copy of the License at
-
-  http://www.apache.org/licenses/LICENSE-2.0
-
-Unless required by applicable law or agreed to in writing, software
-distributed under the License is distributed on an "AS IS" BASIS,
-WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-See the License for the specific language governing permissions and
-limitations under the License.
-////
-image::apache-tinkerpop-logo.png[width=500]
-
-:toc-position: left
-
-Tutorials
-=========
-
-Tutorials are a companion set of documentation to TinkerPop's standard link:http://tinkerpop.incubator.apache.org/docs/x.y.z/index.html[reference documentation].
-The tutorials provide more context and example for specific topics in a way that should be more approachable to users.
-
-include::tutorials-getting-started.asciidoc[]

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/docs/static/images/modern-edge-1-to-3-1-gremlin.png
----------------------------------------------------------------------
diff --git a/docs/static/images/modern-edge-1-to-3-1-gremlin.png b/docs/static/images/modern-edge-1-to-3-1-gremlin.png
new file mode 100755
index 0000000..19b4d3f
Binary files /dev/null and b/docs/static/images/modern-edge-1-to-3-1-gremlin.png differ

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/docs/static/images/modern-edge-1-to-3-2-gremlin.png
----------------------------------------------------------------------
diff --git a/docs/static/images/modern-edge-1-to-3-2-gremlin.png b/docs/static/images/modern-edge-1-to-3-2-gremlin.png
new file mode 100755
index 0000000..0df3ef2
Binary files /dev/null and b/docs/static/images/modern-edge-1-to-3-2-gremlin.png differ

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/docs/static/images/modern-edge-1-to-3-3-gremlin.png
----------------------------------------------------------------------
diff --git a/docs/static/images/modern-edge-1-to-3-3-gremlin.png b/docs/static/images/modern-edge-1-to-3-3-gremlin.png
new file mode 100755
index 0000000..4513489
Binary files /dev/null and b/docs/static/images/modern-edge-1-to-3-3-gremlin.png differ

http://git-wip-us.apache.org/repos/asf/incubator-tinkerpop/blob/f7b070de/pom.xml
----------------------------------------------------------------------
diff --git a/pom.xml b/pom.xml
index 84e0fba..f569cef 100644
--- a/pom.xml
+++ b/pom.xml
@@ -769,16 +769,13 @@ limitations under the License.
                                 </goals>
                                 <configuration>
                                     <sourceDirectory>${asciidoc.input.dir}</sourceDirectory>
-                                    <sourceDocumentName>tutorials.asciidoc</sourceDocumentName>
+                                    <sourceDocumentName>tutorials-getting-started.asciidoc</sourceDocumentName>
                                     <outputDirectory>${htmlsingle.output.dir}</outputDirectory>
                                     <backend>html5</backend>
-                                    <doctype>book</doctype>
+                                    <doctype>article</doctype>
                                     <attributes>
                                         <imagesdir>images</imagesdir>
                                         <encoding>UTF-8</encoding>
-                                        <toc>true</toc>
-                                        <toclevels>3</toclevels>
-                                        <toc-position>left</toc-position>
                                         <!--<iconsdir>images/icons</iconsdir>-->
                                         <!-- AsciiDoctor CSS3-based theme configuration -->
                                         <stylesdir>${asciidoctor.style.dir}</stylesdir>