You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@tinkerpop.apache.org by sp...@apache.org on 2016/10/12 17:42:06 UTC

tinkerpop git commit: Added a new recipe for pagination. CTR

Repository: tinkerpop
Updated Branches:
  refs/heads/master 72ed7951f -> 91b89430a


Added a new recipe for pagination. CTR


Project: http://git-wip-us.apache.org/repos/asf/tinkerpop/repo
Commit: http://git-wip-us.apache.org/repos/asf/tinkerpop/commit/91b89430
Tree: http://git-wip-us.apache.org/repos/asf/tinkerpop/tree/91b89430
Diff: http://git-wip-us.apache.org/repos/asf/tinkerpop/diff/91b89430

Branch: refs/heads/master
Commit: 91b89430a421ca9a9d8a79a43d1af7c0e262ad64
Parents: 72ed795
Author: Stephen Mallette <sp...@genoprime.com>
Authored: Wed Oct 12 13:41:47 2016 -0400
Committer: Stephen Mallette <sp...@genoprime.com>
Committed: Wed Oct 12 13:41:47 2016 -0400

----------------------------------------------------------------------
 CHANGELOG.asciidoc                    |   1 +
 docs/src/recipes/index.asciidoc       |   8 +--
 docs/src/recipes/pagination.asciidoc  |  77 +++++++++++++++++++++++++++++
 docs/static/images/gremlin-paging.png | Bin 0 -> 248956 bytes
 4 files changed, 83 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/91b89430/CHANGELOG.asciidoc
----------------------------------------------------------------------
diff --git a/CHANGELOG.asciidoc b/CHANGELOG.asciidoc
index ca52015..b62bb24 100644
--- a/CHANGELOG.asciidoc
+++ b/CHANGELOG.asciidoc
@@ -120,6 +120,7 @@ This release also includes changes from <<release-3-1-4, 3.1.4>>.
 * Changed scope of log4j dependencies so that they would only be used in tests and the binary distributions of Gremlin Console and Server.
 * Deprecated `Io.Builder.registry()` in favor of the newly introduced `Io.Builder.onMapper()`.
 * Added new recipe for "Traversal Induced Values".
+* Added new recipe for "Pagination".
 * Fixed a potential leak of a `ReferenceCounted` resource in Gremlin Server.
 * Added class registrations for `Map.Entry` implementations to `GryoMapper`.
 * Added methods to retrieve `Cluster` settings in `gremlin-driver`.

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/91b89430/docs/src/recipes/index.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/index.asciidoc b/docs/src/recipes/index.asciidoc
index 3225264..0cd1ce1 100644
--- a/docs/src/recipes/index.asciidoc
+++ b/docs/src/recipes/index.asciidoc
@@ -38,13 +38,15 @@ Traversal Recipes
 
 include::between-vertices.asciidoc[]
 
-include::shortest-path.asciidoc[]
+include::centrality.asciidoc[]
+
+include::cycle-detection.asciidoc[]
 
 include::if-then-based-grouping.asciidoc[]
 
-include::cycle-detection.asciidoc[]
+include::pagination.asciidoc[]
 
-include::centrality.asciidoc[]
+include::shortest-path.asciidoc[]
 
 include::traversal-induced-values.asciidoc[]
 

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/91b89430/docs/src/recipes/pagination.asciidoc
----------------------------------------------------------------------
diff --git a/docs/src/recipes/pagination.asciidoc b/docs/src/recipes/pagination.asciidoc
new file mode 100644
index 0000000..85f55c8
--- /dev/null
+++ b/docs/src/recipes/pagination.asciidoc
@@ -0,0 +1,77 @@
+////
+Licensed to the Apache Software Foundation (ASF) under one or more
+contributor license agreements.  See the NOTICE file distributed with
+this work for additional information regarding copyright ownership.
+The ASF licenses this file to You under the Apache License, Version 2.0
+(the "License"); you may not use this file except in compliance with
+the License.  You may obtain a copy of the License at
+
+  http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+////
+[[pagination]]
+Pagination
+----------
+
+image:gremlin-paging.png[float=left,width=330]In most database applications, it is oftentimes desireable to return
+discrete blocks of data for a query rather than all of the data that the total results would contain. This approach to
+returning data is referred to as "pagination" and typically involves a situation where the client executing the query
+can specify the start position and end position (or the amount of data to return in lieu of the end position)
+representing the block of data to return. In this way, one could return the first ten records of one hundred, then the
+second ten records and so on, until potentially all one hundred were returned.
+
+In Gremlin, a basic approach to paging would look something like the following:
+
+[gremlin-groovy,modern]
+----
+g.V().hasLabel('person').fold()                            <1>
+g.V().hasLabel('person').fold().as('persons','count').
+               select('persons','count').
+                 by(range(local, 0, 2)).
+                 by(count(local))                          <2>
+g.V().hasLabel('person').fold().as('persons','count').
+               select('persons','count').
+                 by(range(local, 2, 4)).
+                 by(count(local))                          <3>
+----
+
+<1> Gets all the "person" vertices.
+<2> Gets the first two "person" vertices and includes the total number of vertices so that the client knows how many
+it has to page through.
+<3> Gets the final two "person" vertices.
+
+From a functional perspective, the above example shows a fairly standard paging model. Unfortunately, there is a
+problem. To get the total number of vertices, the traversal must first `fold()` them, which iterates out
+the traversal bringing them all into memory. If the number of "person" vertices is large, that step could lead to a
+long running traversal and perhaps one that would simply run out of memory prior to completion. There is no shortcut
+to getting a total count without doing a full iteration of the traversal. If the requirement for a total count is
+removed then the traversals become more simple:
+
+[gremlin-groovy,modern]
+----
+g.V().hasLabel('person').range(0,2)
+g.V().hasLabel('person').range(2,4)
+----
+
+NOTE: The first traversal above could also be written as `g.V().hasLabel('person').limit(2)`.
+
+In this case, there is no way to know the total count so the only way to know if the end of the results have been
+reached is to count the results from each paged result to see if there's less than the number expected or simply zero
+results. In that case, further requests for additional pages would be uncessary. Of course, this approach is not
+free of problems either. Most graph databases will not optimize the `range()` step, meaning that the second traversal
+will repeat the iteration of the first two vertices to get to the second set of two vertices. In other words, for the
+second traversal, the graph will still read four vertices even though there was only a request for two.
+
+The only way to completely avoid that problem is to re-use the same traversal instance:
+
+[gremlin-groovy,modern]
+----
+t = g.V().hasLabel('person');[]
+t.next(2)
+t.next(2)
+----
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/tinkerpop/blob/91b89430/docs/static/images/gremlin-paging.png
----------------------------------------------------------------------
diff --git a/docs/static/images/gremlin-paging.png b/docs/static/images/gremlin-paging.png
new file mode 100644
index 0000000..9bae33f
Binary files /dev/null and b/docs/static/images/gremlin-paging.png differ