Posted to commits@tinkerpop.apache.org by sp...@apache.org on 2020/03/05 19:48:18 UTC

[tinkerpop] 02/04: Merge branch '3.3-dev' into 3.4-dev

This is an automated email from the ASF dual-hosted git repository.

spmallette pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/tinkerpop.git

commit 36ac7c3dbe5d66bbe7ed0269c93f54678b2ec4bf
Merge: 0f1edda 3116801
Author: Stephen Mallette <sp...@genoprime.com>
AuthorDate: Thu Mar 5 14:41:01 2020 -0500

    Merge branch '3.3-dev' into 3.4-dev

 docs/src/reference/gremlin-variants.asciidoc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --cc docs/src/reference/gremlin-variants.asciidoc
index ea1ee16,780d69e..00a3623c
--- a/docs/src/reference/gremlin-variants.asciidoc
+++ b/docs/src/reference/gremlin-variants.asciidoc
@@@ -77,240 -39,9 +77,240 @@@ anchor:connecting-via-java[
  [[gremlin-java]]
  == Gremlin-Java
  
 -image:gremlin-java-drawing.png[width=130,float=right] Apache TinkerPop's Gremlin-Java implements Gremlin within the Java8
 -language and can be used by any Java8 compliant virtual machine. Gremlin-Java is considered the canonical, reference
 +image:gremlin-java-drawing.png[width=130,float=right] Apache TinkerPop's Gremlin-Java implements Gremlin within the
 +Java language and can be used by any Java Virtual Machine. Gremlin-Java is considered the canonical, reference
  implementation of Gremlin and serves as the foundation that all other Gremlin language variants should emulate.
 +As the Gremlin Traversal Machine that processes Gremlin queries is also written in Java, it can be used in all three
 +connection methods described in the <<connecting-gremlin,Connecting Gremlin>> Section.
 +
 +[source,xml]
 +----
 +<dependency>
 +   <groupId>org.apache.tinkerpop</groupId>
 +   <artifactId>gremlin-core</artifactId>
 +   <version>x.y.z</version>
 +</dependency>
 +
 +<!-- when using Gremlin Server or Remote Gremlin Provider a driver is required -->
 +<dependency>
 +   <groupId>org.apache.tinkerpop</groupId>
 +   <artifactId>gremlin-driver</artifactId>
 +   <version>x.y.z</version>
 +</dependency>
 +----
 +
 +=== Connecting
 +
 +The pattern for connecting is described in <<connecting-gremlin,Connecting Gremlin>> and it basically distills down
 +to creating a `GraphTraversalSource`. For <<connecting-embedded,embedded>> mode, this involves first creating a
 +`Graph` and then spawning the `GraphTraversalSource`:
 +
 +[source,java]
 +----
 +Graph graph = ...;
 +GraphTraversalSource g = graph.traversal();
 +----
 +
 +Using "g" it is then possible to start writing Gremlin. The "g" allows for the setting of many configuration options
 +which affect traversal execution. The <<traversal, Traversal>> Section describes some of these options and some are
 +only suitable with <<connecting-embedded,embedded>> style usage. For remote options however there are some added
 +configurations to consider and this section looks to address those.
 +
 +When connecting to <<connecting-gremlin-server,Gremlin Server>> or <<connecting-rgp,Remote Gremlin Providers>>  it
 +is possible to configure the `DriverRemoteConnection` manually as shown in earlier examples where the host and port
 +are provided as follows:
 +
 +[source,java]
 +----
 +GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using("localhost",8182,"g"));
 +----
 +
 +It is also possible to create it from a configuration. The most basic way to do so involves the following line of code:
 +
 +[source,java]
 +----
 +GraphTraversalSource g = traversal().withRemote("conf/remote-graph.properties");
 +----
 +
 +The `remote-graph.properties` file simply provides connection information to the `GraphTraversalSource` which is used
 +to configure a `RemoteConnection`. That file looks like this:
 +
 +[source,text]
 +----
 +gremlin.remote.remoteConnectionClass=org.apache.tinkerpop.gremlin.driver.remote.DriverRemoteConnection
 +gremlin.remote.driver.clusterFile=conf/remote-objects.yaml
 +gremlin.remote.driver.sourceName=g
 +----
 +
 +The `RemoteConnection` is an interface that provides the transport mechanism for "g" and makes it possible for
 +that mechanism to be altered (typically by graph providers who have their own protocols). TinkerPop provides one such
 +implementation called the `DriverRemoteConnection` which enables transport over Gremlin Server protocols using the
 +TinkerPop driver. The driver is configured by the specified `gremlin.remote.driver.clusterFile` and the local "g" is
 +bound to the `GraphTraversalSource` on the remote end with `gremlin.remote.driver.sourceName` which in this case is
 +also "g".
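 +
 +As a rough illustration, and not necessarily the file shipped with any particular distribution, a minimal
 +`conf/remote-objects.yaml` might look like the following sketch. The keys match those listed in the driver
 +configuration table, while the host, port and serializer values are assumptions chosen for a local Gremlin Server:
 +
 +[source,yaml]
 +----
 +# hypothetical values for a Gremlin Server running on the local machine
 +hosts: [localhost]
 +port: 8182
 +serializer: { className: org.apache.tinkerpop.gremlin.driver.ser.GryoMessageSerializerV3d0 }
 +----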
 +
 +There are other ways to configure the traversal using `withRemote()` as it has other overloads. It can take an
 +Apache Commons `Configuration` object which would have keys similar to those shown in the properties file and it
 +can also take a `RemoteConnection` instance directly. The latter is interesting in that it means it is possible to
 +programmatically construct all aspects of the `RemoteConnection`. For TinkerPop usage, that might mean directly
 +constructing the `DriverRemoteConnection` and the driver instance that supplies the transport mechanism. For example,
 +the command shown above could be re-written using programmatic construction as follows:
 +
 +[source,java]
 +----
 +Cluster cluster = Cluster.open();
 +GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(cluster, "g"));
 +----
 +
 +Please consider the following example:
 +
 +[gremlin-groovy]
 +----
 +g = traversal().withRemote('conf/remote-graph.properties')
 +g.V().elementMap()
 +g.close()
 +----
 +
 +[source,java]
 +----
 +GraphTraversalSource g = traversal().withRemote("conf/remote-graph.properties");
 +List<Map<Object,Object>> list = g.V().elementMap().toList();
 +g.close();
 +----
 +
 +Note the call to `close()` above. The call to `withRemote()` internally instantiates a connection via the driver that
 +can only be released by "closing" the `GraphTraversalSource`. It is important to call `close()` so that the resources
 +created by `withRemote()` are released.
 +
 +If working with multiple remote `TraversalSource` instances it is more efficient to construct `Cluster` and `Client`
 +objects and then re-use them.
 +
 +[gremlin-groovy]
 +----
 +cluster = Cluster.open('conf/remote-objects.yaml')
 +client = cluster.connect()
 +g = traversal().withRemote(DriverRemoteConnection.using(client, "g"))
 +g.V().elementMap()
 +g.close()
 +client.close()
 +cluster.close()
 +----
 +
 +If the `Client` instance is supplied externally, as is shown above, then it is not closed implicitly by the close of
 +"g".  Closing "g" will have no effect on "client" or "cluster". When supplying them externally, the `Client` and
 +`Cluster` objects must also be closed explicitly. It's worth noting that the close of a `Cluster` will close all
 +`Client` instances spawned by the `Cluster`.
 +
 +IMPORTANT: Bytecode-based traversals use the `TraversalOpProcessor` in Gremlin Server which requires a cache to enable
 +the retrieval of side-effects (if the `Traversal` produces any). That cache can be configured (e.g. controlling
 +eviction times and sizing) in the Gremlin Server configuration file as described <<traversalopprocessor, here>>.
 +
 +Some connection options can also be set on individual requests made through the Java driver using the `with()` step
 +on the `TraversalSource`. For instance, to set the request timeout to 500 milliseconds:
 +
 +[source,java]
 +----
 +GraphTraversalSource g = traversal().withRemote(conf);
 +List<Vertex> vertices = g.with(Tokens.ARGS_EVAL_TIMEOUT, 500L).V().out("knows").toList();
 +----
 +
 +[[java-imports]]
 +=== Common Imports
 +
 +There are a number of classes, functions and tokens that are typically used with Gremlin. The following imports
 +provide most of the common functionality required to use Gremlin:
 +
 +[source,java]
 +----
 +import org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.GraphTraversalSource;
 +import org.apache.tinkerpop.gremlin.process.traversal.IO;
 +import static org.apache.tinkerpop.gremlin.process.traversal.AnonymousTraversalSource.traversal;
 +import static org.apache.tinkerpop.gremlin.process.traversal.Operator.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.Order.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.P.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.Pop.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.SackFunctions.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.Scope.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.TextP.*;
 +import static org.apache.tinkerpop.gremlin.structure.Column.*;
 +import static org.apache.tinkerpop.gremlin.structure.Direction.*;
 +import static org.apache.tinkerpop.gremlin.structure.T.*;
 +import static org.apache.tinkerpop.gremlin.process.traversal.dsl.graph.__.*;
 +----
 +
 +=== Configuration
 +
 +The following table describes the various configuration options for the Gremlin Driver:
 +
 +[width="100%",cols="3,10,^2",options="header"]
 +|=========================================================
 +|Key |Description |Default
 +|connectionPool.channelizer |The fully qualified classname of the client `Channelizer` that defines how to connect to the server. |`Channelizer.WebSocketChannelizer`
 +|connectionPool.enableSsl |Determines if SSL should be enabled or not. If enabled on the server then it must be enabled on the client. |false
 +|connectionPool.keepAliveInterval |Length of time in milliseconds to wait on an idle connection before sending a keep-alive request. Set to zero to disable this feature. |180000
 +|connectionPool.keyStore |The private key in JKS or PKCS#12 format. |_none_
 +|connectionPool.keyStorePassword |The password of the `keyStore` if it is password-protected. |_none_
 +|connectionPool.keyStoreType |`JKS` (Java 8 default) or `PKCS12` (Java 9+ default)|_none_
 +|connectionPool.maxContentLength |The maximum length in bytes of a message that can be sent to the server. This number can be no greater than the setting of the same name in the server configuration. |65536
 +|connectionPool.maxInProcessPerConnection |The maximum number of in-flight requests that can occur on a connection. |4
 +|connectionPool.maxSimultaneousUsagePerConnection |The maximum number of times that a connection can be borrowed from the pool simultaneously. |16
 +|connectionPool.maxSize |The maximum size of a connection pool for a host. |8
 +|connectionPool.maxWaitForConnection |The amount of time in milliseconds to wait for a new connection before timing out. |3000
- |connectionPool.maxWaitForSessionClose |The amount of time in milliseconds to wait for a session to close before timing out (does not apply to sessionless connections). |3000
++|connectionPool.maxWaitForClose |The amount of time in milliseconds to wait for pending messages to be returned from the server before closing the connection. |3000
 +|connectionPool.minInProcessPerConnection |The minimum number of in-flight requests that can occur on a connection. |1
 +|connectionPool.minSimultaneousUsagePerConnection |The minimum number of times that a connection can be borrowed from the pool simultaneously. |8
 +|connectionPool.minSize |The minimum size of a connection pool for a host. |2
 +|connectionPool.reconnectInterval |The amount of time in milliseconds to wait before trying to reconnect to a dead host. |1000
 +|connectionPool.resultIterationBatchSize |The override value for the size of the result batches to be returned from the server. |64
 +|connectionPool.sslCipherSuites |The list of JSSE ciphers to support for SSL connections. If specified, only the ciphers that are listed and supported will be enabled. If not specified, the JVM default is used.  |_none_
 +|connectionPool.sslEnabledProtocols |The list of SSL protocols to support for SSL connections. If specified, only the protocols that are listed and supported will be enabled. If not specified, the JVM default is used.  |_none_
 +|connectionPool.sslSkipCertValidation |Configures the `TrustManager` to trust all certs without any validation. Should not be used in production.|false
 +|connectionPool.trustStore |File location for a SSL Certificate Chain to use when SSL is enabled. If this value is not provided and SSL is enabled, the default `TrustManager` will be used. |_none_
 +|connectionPool.trustStorePassword |The password of the `trustStore` if it is password-protected |_none_
 +|connectionPool.validationRequest |A script that is used to test server connectivity. A good script to use is one that evaluates quickly and returns no data. The default simply returns an empty string, but if a graph is required by a particular provider, a good traversal might be `g.inject()`. |_''_
 +|hosts |The list of hosts that the driver will connect to. |localhost
 +|jaasEntry |Sets the `AuthProperties.Property.JAAS_ENTRY` properties for authentication to Gremlin Server. |_none_
 +|nioPoolSize |Size of the pool for handling request/response operations. |available processors
 +|password |The password to submit on requests that require authentication. |_none_
 +|path |The URL path to the Gremlin Server. |_/gremlin_
 +|port |The port of the Gremlin Server to connect to. The same port will be applied for all hosts. |8182
 +|protocol |Sets the `AuthProperties.Property.PROTOCOL` properties for authentication to Gremlin Server. |_none_
 +|serializer.className |The fully qualified class name of the `MessageSerializer` that will be used to communicate with the server. Note that the serializer configured on the client should be supported by the server configuration. |_none_
 +|serializer.config |A `Map` of configuration settings for the serializer. |_none_
 +|username |The username to submit on requests that require authentication. |_none_
 +|workerPoolSize |Size of the pool for handling background work. |available processors * 2
 +|=========================================================
 +
 +Please see the link:https://tinkerpop.apache.org/javadocs/x.y.z/core/org/apache/tinkerpop/gremlin/driver/Cluster.Builder.html[Cluster.Builder javadoc] to get more information on these settings.
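 +
 +When these options are supplied through a YAML file rather than through `Cluster.Builder`, the `connectionPool.*` keys
 +are expected to nest under a `connectionPool` map. The following is only a sketch under that assumption, with
 +illustrative values that restate defaults from the table above:
 +
 +[source,yaml]
 +----
 +# illustrative values only; each one restates a default listed in the table
 +hosts: [localhost]
 +connectionPool:
 +  enableSsl: false
 +  minSize: 2
 +  maxSize: 8
 +  maxWaitForConnection: 3000
 +  reconnectInterval: 1000
 +----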
 +
 +=== Serialization
 +
 +Remote systems like Gremlin Server and Remote Gremlin Providers accept requests made in a particular serialization
 +format and respond by serializing results to some format to be interpreted by the client. For JVM-based languages,
 +there are three options for serialization: Gryo, GraphSON and GraphBinary. When using Gryo serialization (the default
 +serializer for the Java driver), it is important that the client and server have the same serializers configured or
 +else one or the other will experience serialization exceptions and fail to communicate reliably. Discrepancy in
 +serializer registration between client and server can happen fairly easily as graphs will automatically include
 +serializers on the server-side, thus leaving the client to be configured manually. That configuration can be done as
 +follows:
 +
 +[source,java]
 +----
 +IoRegistry registry = ...; // an IoRegistry instance exposed by a specific graph provider
 +GryoMapper kryo = GryoMapper.build().addRegistry(registry).create();
 +MessageSerializer serializer = new GryoMessageSerializerV3d0(kryo);
 +Cluster cluster = Cluster.build().
 +                          serializer(serializer).
 +                          create();
 +Client client = cluster.connect();
 +GraphTraversalSource g = traversal().withRemote(DriverRemoteConnection.using(client, "g"));
 +----
 +
 +The `IoRegistry` tells the serializer what classes from the graph provider to auto-register during serialization.
 +Gremlin Server roughly uses this same approach when it configures its serializers, so using this same model will
 +ensure compatibility when making requests. Obviously, it is possible to switch to GraphSON or GraphBinary by building
 +the appropriate `MessageSerializer` (`GraphSONMessageSerializerV3d0` or `GraphBinaryMessageSerializerV1` respectively)
 +in the same way and building that into the `Cluster` object.
  
  === The Lambda Solution