Posted to dev@tinkerpop.apache.org by "Kelvin R. Lawrence (Jira)" <ji...@apache.org> on 2022/03/07 21:14:00 UTC

[jira] [Closed] (TINKERPOP-2679) Update JavaScript driver to support processing messages as a stream

     [ https://issues.apache.org/jira/browse/TINKERPOP-2679?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kelvin R. Lawrence closed TINKERPOP-2679.
-----------------------------------------
    Fix Version/s: 3.6.0
                   3.5.3
         Assignee: Jorge Bay
       Resolution: Fixed

> Update JavaScript driver to support processing messages as a stream
> -------------------------------------------------------------------
>
>                 Key: TINKERPOP-2679
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2679
>             Project: TinkerPop
>          Issue Type: Improvement
>          Components: javascript
>    Affects Versions: 3.5.1
>            Reporter: Tom Kolanko
>            Assignee: Jorge Bay
>            Priority: Minor
>             Fix For: 3.6.0, 3.5.3
>
>
> The JavaScript driver's [_handleMessage|https://github.com/apache/tinkerpop/blob/d4bd5cc5a228fc22442101ccb6a9751653900d32/gremlin-javascript/src/main/javascript/gremlin-javascript/lib/driver/connection.js#L249] receives messages from Gremlin Server and stores each message in an object associated with the handler for the specific request. Currently, the driver waits until all of the data has arrived from Gremlin Server before allowing any further processing of it.
> However, this can lead to cases where a lot of memory is required to hold the results before any processing can take place. If we had the ability to process results as they come in from Gremlin Server, we could reduce memory usage in some cases.
> If you are open to it, I would like to submit a PR where {{submit}} can take an optional callback that is run on each batch of data returned from Gremlin Server, rather than waiting for the entire result set.
> The following examples assume that you have 100 vertices in your graph.
> Current behaviour:
> {code:javascript}
> const result = await client.submit("g.V()")
> console.log(result.toArray().length) // 100 - all the vertices in your graph
> {code}
> Proposed addition:
> {code:javascript}
> await client.submit("g.V()", {}, { batchSize: 25 }, (data) => {
>   console.log(data.toArray().length) // 25 - this callback will be called 4 times (100 / 25 = 4)
> })
> {code}
> If the optional callback is not provided, the default behaviour is unchanged.
> I have the changes running locally and overall performance is unchanged; queries run in about the same time as before. However, for some specific queries, memory usage has dropped considerably.
> With the process-on-message strategy, memory usage is related to the {{batchSize}} rather than to the size of the final result set. Using the default {{batchSize}} of 64 and testing some specific cases we have, I was able to get memory usage down from 1.2 GB to 10 MB.
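> As a further sketch of the intended use (again assuming the 100-vertex graph from the examples above; the callback argument is the proposed addition, not an existing API), an aggregation would only ever hold one batch in memory at a time:
> {code:javascript}
> // Hypothetical usage of the proposed per-batch callback: count the
> // vertices without ever materialising the full result set.
> let total = 0;
> await client.submit("g.V()", {}, { batchSize: 64 }, (data) => {
>   total += data.toArray().length; // only this batch is held in memory
> });
> console.log(total) // 100
> {code}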



--
This message was sent by Atlassian Jira
(v8.20.1#820001)