Posted to dev@tinkerpop.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/09/02 14:56:00 UTC

[jira] [Commented] (TINKERPOP-2405) gremlinpython: traversal hangs when the connection is established but the servers stops responding later

    [ https://issues.apache.org/jira/browse/TINKERPOP-2405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17189290#comment-17189290 ] 

ASF GitHub Bot commented on TINKERPOP-2405:
-------------------------------------------

spmallette merged pull request #1316:
URL: https://github.com/apache/tinkerpop/pull/1316


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> gremlinpython: traversal hangs when the connection is established but the servers stops responding later
> --------------------------------------------------------------------------------------------------------
>
>                 Key: TINKERPOP-2405
>                 URL: https://issues.apache.org/jira/browse/TINKERPOP-2405
>             Project: TinkerPop
>          Issue Type: Bug
>          Components: python
>    Affects Versions: 3.4.6
>         Environment:  Ubuntu 18.04, Flask 1.1.1, python 3.8.1, Amazon Neptune, Gremlin Server
>            Reporter: Guilherme Quentel Melo
>            Assignee: Stephen Mallette
>            Priority: Major
>
> On an HTTP server that connects to Amazon Neptune, I've seen situations where a request hangs and never returns a response. While investigating, I found that it hangs at the point where it queries Neptune.
> The problem is that if the connection to Gremlin/Neptune is established and the server then stops responding, the gremlin connection never times out, so the process/thread waits forever for a response that will never come.
> h1. How to reproduce
> # Start a local gremlin server on the default port 8182
> # On a terminal, run {{nc}} to listen on port 8183 with {{nc -lk 8183}}
> # Run the following python code to connect to the *8183* port:
> {code:python}
> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
> from gremlin_python.process.anonymous_traversal import traversal
> remote_connection = DriverRemoteConnection("ws://127.0.0.1:8183/gremlin", "g")
> g = traversal().withRemote(remote_connection)
> g.V().limit(1).toList()
> {code}
> # You will see the connection request in the {{nc}} output. The first time, do nothing: the client will time out, reporting that the connection couldn't be established.
> # Now repeat the steps, but make {{nc}} respond so that the connection is established. The quickest way I found is to manually relay the message to the real gremlin server:
> ## Copy the whole request from the {{nc -l}} output
> ## On another terminal, open a connection to the gremlin server with {{nc 127.0.0.1 8182}}
> ## Paste the request you copied before into the {{nc 127.0.0.1 8182}} terminal
> ## Copy the gremlin server response and paste it into the {{nc -l}} terminal
> ## The connection will be established and the {{nc -l}} process will receive some unprintable chars corresponding to {{g.V().limit(1).toList()}}
> ## Now, if there is no response from the {{nc -l}} process, the python code will hang forever.
> h1. Possible solution
> As I looked into it, the problem seems to be that the {{TornadoTransport}} implementation does not pass any timeout when reading (or writing) messages. Passing a timeout to {{self._loop.run_sync}} would address this, at least by raising an exception when the server does not respond.
> If I change the example above:
> {code:python}
> from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
> from gremlin_python.driver.tornado.transport import TornadoTransport
> from gremlin_python.process.anonymous_traversal import traversal
>
> class CustomTornadoTransport(TornadoTransport):
>     def read(self):
>         # Give up if no message arrives within 5 seconds instead of blocking forever.
>         return self._loop.run_sync(lambda: self._ws.read_message(), timeout=5)
>
> remote_connection = DriverRemoteConnection("ws://127.0.0.1:8183/gremlin", "g", transport_factory=CustomTornadoTransport)
> g = traversal().withRemote(remote_connection)
> g.V().limit(1).toList()
> {code}
> and repeat the same steps, {{g.V().limit(1).toList()}} times out after not getting any response from the server for 5 seconds.
> I'm not sure if there should be any timeout for writing, but it seems it should definitely be set for read operations.
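
For readers without a Gremlin Server at hand, the failure mode described above (a peer that accepts the connection and then goes silent) and the effect of a read timeout can be sketched with the standard library alone. This is only an illustration of the general idea, not the gremlinpython or Tornado code path: it uses plain asyncio in place of TornadoTransport, and the names silent_server, read_with_timeout and demo, as well as the "ping" payload, are hypothetical.

```python
import asyncio


async def silent_server(reader, writer):
    """Accept a connection and never reply (stand-in for an idle `nc -l`)."""
    try:
        # Consume whatever the client sends until it disconnects; send nothing.
        await reader.read()
    finally:
        writer.close()


async def read_with_timeout(host, port, timeout):
    """Send a request and wait at most `timeout` seconds for a reply."""
    reader, writer = await asyncio.open_connection(host, port)
    try:
        writer.write(b"ping\n")
        await writer.drain()
        # Without wait_for, this await would block forever against a silent peer.
        return await asyncio.wait_for(reader.readline(), timeout=timeout)
    finally:
        writer.close()


async def demo(timeout=0.5):
    # Port 0 asks the OS for any free port, so the sketch does not need 8183.
    server = await asyncio.start_server(silent_server, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    try:
        await read_with_timeout("127.0.0.1", port, timeout)
        return "got a reply"
    except asyncio.TimeoutError:
        return "timed out"
    finally:
        server.close()


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Since the server never answers, the read is abandoned once the timeout elapses and the caller gets an exception to handle, which is the behaviour the reporter is asking for on read operations.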



--
This message was sent by Atlassian Jira
(v8.3.4#803005)