Posted to dev@tinkerpop.apache.org by "Avner Levy (Jira)" <ji...@apache.org> on 2022/06/17 15:13:00 UTC

[jira] [Comment Edited] (TINKERPOP-2708) unhandledRejection upon connection failure

    [ https://issues.apache.org/jira/browse/TINKERPOP-2708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555654#comment-17555654 ] 

Avner Levy edited comment on TINKERPOP-2708 at 6/17/22 3:12 PM:
----------------------------------------------------------------

I have a problem with the JavaScript client: if the server restarts, the client hangs forever (this originally happened with AWS Neptune, but it is easy to reproduce with TinkerPop as well).
I've written this small program to demonstrate the issue:

import gremlin from 'gremlin';

// Keeps the event loop alive so the process does not exit while the query hangs.
const holdMainloop = setInterval(() => console.log('holding mainloop'), 60000);

const main = async () => {
  try {
    console.log('hello gremlin');
    const traversal = gremlin.process.AnonymousTraversalSource.traversal;
    const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
    const ID = gremlin.process.t.id;
    const _ = gremlin.process.statics;
    const __ = gremlin.process.P;

    const driver = new DriverRemoteConnection('ws://localhost:8182/gremlin', {});

    // Returns a listener that logs a timestamp, a header and the event arguments.
    const log = (header) => (...args) => console.log(new Date(), header, JSON.stringify(args));
    const LABEL = 'Test';
    driver.addListener('log', log('log:'));
    driver.addListener('close', log('close:'));
    driver.addListener('socketError', log('socketError:'));

    // Build a two-vertex cycle so that repeat(out()) can run for many iterations.
    const g = traversal().withRemote(driver);
    await g.V().hasLabel(LABEL).drop().next();
    await g.addV(LABEL).property(ID, '1').next();
    await g.addV(LABEL).property(ID, '2').next();
    await g.addE(LABEL).from_(_.V('1')).to(_.V('2')).property(ID, 'e1').next();
    await g.addE(LABEL).from_(_.V('2')).to(_.V('1')).property(ID, 'e2').next();

    // Query in a loop; kill the server while this is running.
    while (true) {
      try {
        const start = Date.now();
        console.log(new Date(), 'before query');
        await g.with_('evaluationTimeout', 1000).V('1').repeat(_.out()).times(1500).next();
        console.log(new Date(), `after query, took ${Date.now() - start} ms`);
      } catch (e) {
        console.log('Failed query: ', e);
      }
    }
    await driver.close();
  } catch (e) {
    console.log('uncaught exception (exit):', e);
  }
};

try {
  await main();
} catch (e) {
  console.log('exception in main: ', e);
  clearInterval(holdMainloop);
}

I run the TinkerPop server in a container and kill it after the program above has been running for a few seconds.

2022-06-17T15:10:25.602Z before query
2022-06-17T15:10:26.247Z after query, took 645 ms
2022-06-17T15:10:26.247Z before query
2022-06-17T15:10:26.804Z after query, took 557 ms
2022-06-17T15:10:26.804Z before query
2022-06-17T15:10:27.458Z log: ["ws close code=1006 message="]
2022-06-17T15:10:27.459Z close: [1006,""]
2022-06-17T15:11:23.411Z holding mainloop
2022-06-17T15:12:23.414Z holding mainloop
...
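
The log above shows nothing more from the query loop after the close event, which is consistent with the pending query promise never settling. As a stopgap on the caller's side, each query could be raced against a timer so that the loop at least sees a rejection. This is a minimal sketch, not a fix for the driver: `withTimeout` is a hypothetical helper that is not part of the gremlin package, and it does not cancel the underlying request.

```javascript
// Hypothetical helper (not part of the gremlin package): rejects if the given
// promise has not settled within `ms` milliseconds.
function withTimeout(promise, ms, label = 'operation') {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms} ms`)),
      ms
    );
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Demonstration with a promise that never settles, standing in for the hung query.
const neverSettles = new Promise(() => {});
withTimeout(neverSettles, 100, 'query').catch((e) => console.log(e.message));
// prints "query timed out after 100 ms"
```

In the reproduction program, the `await g.with_(...)...next()` expression could be wrapped as `await withTimeout(<that expression>, 5000, 'query')`, so that the existing `catch` block logs the failure. The caller would probably still need to close and recreate the connection afterwards; that part is an assumption and is not shown here.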

I'm using 3.5.2.
package.json:

{
  "name": "playground",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "type": "module",
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "gremlin": "^3.5.2"
  }
}


was (Author: JIRAUSER291147):
I have a problem with the JavaScript client: if the server restarts, the client hangs forever (this originally happened with AWS Neptune, but it is easy to reproduce with TinkerPop as well).
I've written this small program to demonstrate the issue:

import gremlin from 'gremlin';
const holdMainloop = setInterval(()=>console.log('holding mainloop'), 60000);
const main = async () => {
try {
console.log('hello gremlin');
const traversal = gremlin.process.AnonymousTraversalSource.traversal;
const DriverRemoteConnection = gremlin.driver.DriverRemoteConnection;
const ID = gremlin.process.t.id;
const _ = gremlin.process.statics;
const __ = gremlin.process.P;

const driver = new DriverRemoteConnection('ws://localhost:8182/gremlin', {});
const log = (header)=>(...args)=>console.log(header,JSON.stringify(args));
const LABEL = 'Test';
driver.addListener('log', log(new Date(), 'log:'));
driver.addListener('close', log(new Date(), 'close:'));
driver.addListener('socketError', log(new Date(), 'sockerError:'));

const g = traversal().withRemote(driver);
await g.V().hasLabel(LABEL).drop().next();
await g.addV(LABEL).property(ID,'1').next();
await g.addV(LABEL).property(ID,'2').next();
await g.addE(LABEL).from_(_.V('1')).to(_.V('2')).property(ID,'e1').next();
await g.addE(LABEL).from_(_.V('2')).to(_.V('1')).property(ID,'e2').next();

while (true) {
try {
const start = Date.now();
console.log(new Date(), 'before query');
await g.with_('evaluationTimeout', 1000).V('1').repeat(_.out()).times(1500).next();
console.log(new Date(), `after query, took ${Date.now() - start} ms`);
} catch (e) {
console.log('Failed query: ', e);
}
}
await driver.close();
} catch (e) {
console.log('uncaught exception (exit):', e);
}
};
try {
await main();
} catch (e) {
console.log('exception in main: ', e);
clearInterval(holdMainloop);
}

I run the TinkerPop server in a container and kill it after the program above has been running for a few seconds.

2022-06-17T15:01:32.324Z before query
2022-06-17T15:01:32.661Z after query, took 337 ms
2022-06-17T15:01:32.661Z before query
2022-06-17T15:01:32.944Z after query, took 283 ms
2022-06-17T15:01:32.944Z before query
2022-06-17T15:01:27.927Z ["ws close code=1006 message="]
2022-06-17T15:01:27.927Z [1006,""]
2022-06-17T15:02:27.910Z holding mainloop
2022-06-17T15:03:27.922Z holding mainloop
2022-06-17T15:04:27.928Z holding mainloop
2022-06-17T15:05:27.937Z holding mainloop

I'm using 3.5.2.
package.json:

{
"name": "playground",
"version": "1.0.0",
"description": "",
"main": "index.js",
"type": "module",
"keywords": [],
"author": "",
"license": "ISC",
"dependencies": {
"gremlin": "^3.5.2"
}
}

> unhandledRejection upon connection failure
> ------------------------------------------
>
>                 Key: TINKERPOP-2708
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2708
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: javascript
>    Affects Versions: 3.5.2
>            Reporter: Jon Brede Skaug
>            Priority: Major
>
> If the JavaScript driver is unable to connect to the graph database for whatever reason, an unhandledRejection warning occurs.
> I have tested this with `new DriverRemoteConnection`
> This is a silent error, and the caller cannot catch it because of the way it is handled.
> I've tracked it down to this line:
> [https://github.com/apache/tinkerpop/blob/c22c0141bb7a00f366f929d0e5d3c6379d1004e0/gremlin-javascript/src/main/javascript/gremlin-javascript/lib/driver/connection.js#L156]
> h2. *Solution suggestion*
> A fairly quick solution (but possibly a breaking change) is to not open the connection to the database in the constructor [(line reference L105)|https://github.com/apache/tinkerpop/blob/c22c0141bb7a00f366f929d0e5d3c6379d1004e0/gremlin-javascript/src/main/javascript/gremlin-javascript/lib/driver/connection.js#L105] and instead require the user to call `DriverRemoteConnection.open()` after the object has been constructed. `DriverRemoteConnection.open()` returns a promise, which makes more sense and is more intuitive. The current error message reports a DNS error, which is incredibly confusing without deep-diving into the Gremlin driver code and navigating through 3 classes to find the culprit. It is also an error that seems more harmless than it actually is. 
> It's also possible to set option.connectOnStartup to "false" by default; this, however, requires the user to be aware of the possible failure when setting it to true. I believe forcing the user to run .open() after initializing the class may be more robust. 
> This way the user can handle the error raised by DriverRemoteConnection.open() with promise.catch() or with await inside an async function. An example using promise.catch():
> {code:java}
> this.drc.open().catch(err => {
>     console.log("Unable to open connection to database", err);
> });{code}
> h2. *{{Temporary workaround example}}*
>  
> {code:java}
> // Using promises
> const drc = new DriverRemoteConnection(url, {connectOnStartup: false}); 
> drc.open().catch(err => {
>    // Handle the error from open(), e.g. with retry and backoff logic, by notifying an alarm system, or by setting a global variable to reject requests.
> });{code}
>  
> h2. *The issue with not handling the error properly:*
> Not handling the error properly means that if you pass in an invalid URL or the Gremlin-compatible database is down, the application cannot handle the connection error before a transaction is attempted.
> In the future, a Node.js unhandledRejection will terminate the Node.js process. This can cause critical failures of processes at boot and may even cause DDoS-like situations in which processes flood the Gremlin-compatible database with connection attempts as they fail and are restarted over and over by a process monitor.
>  
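
The workaround in the issue description mentions retry and backoff logic for a failed `open()`. One way this might look is sketched below, under stated assumptions: `openWithRetry` and `sleep` are hypothetical helpers that are not part of the gremlin package, and the retry count and delays are arbitrary illustrative values.

```javascript
// Hypothetical helpers (not part of the gremlin package).
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls the async function `openFn` until it succeeds or the retries are
// exhausted, doubling the delay after every failed attempt.
async function openWithRetry(openFn, { retries = 5, baseDelayMs = 200 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await openFn();
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // no attempts left
      const delay = baseDelayMs * 2 ** attempt;
      console.log(`open failed (attempt ${attempt + 1}), retrying in ${delay} ms`);
      await sleep(delay);
    }
  }
  // Surface the last failure so the caller can handle it explicitly.
  throw lastError;
}
```

With a connection created using `connectOnStartup: false`, as in the workaround quoted above, this could be called as `openWithRetry(() => drc.open())`, and the returned promise handled with `.catch()` or `await`. Bounding the number of retries and spacing them out is meant to avoid the flood of reconnection attempts that the issue description warns about.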



--
This message was sent by Atlassian Jira
(v8.20.7#820007)