You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@knox.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2019/09/06 08:30:00 UTC

[jira] [Work logged] (KNOX-1997) For websocket connections, seeing java.lang.NullPointerException when a data frame arrives on websocket while connection is still being setup

     [ https://issues.apache.org/jira/browse/KNOX-1997?focusedWorklogId=307700&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-307700 ]

ASF GitHub Bot logged work on KNOX-1997:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 06/Sep/19 08:29
            Start Date: 06/Sep/19 08:29
    Worklog Time Spent: 10m 
      Work Description: goelrajat commented on pull request #143: KNOX-1997: Adding changes to buffer messages from backend in onMessag…
URL: https://github.com/apache/knox/pull/143
 
 
   ## What changes were proposed in this pull request?
   
   Adding changes to buffer messages from backend in onMessageText() if frontend session related data structures have not been setup i.e. remote is null. Message buffer will be flushed when remote is set by other thread executing onWebSocketConnect() API. To synchronize reading/flushing buffer, added a lock
   
   ## How was this patch tested?
   
   Race condition was easily reproducible with our service UI and backedn when integrated with Knox. After this change, I no longer see the NullPointerException. Since this is a race condition, its hard to add a unit test case for it. In my case, the race condition was getting hit when service backend used to send a data frame immediately after opening a websocket connection but Knox hadn't setup the connection related data structures. So I added a test case for scenario where send a message to frontend while opening a websocket.
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 307700)
    Remaining Estimate: 0h
            Time Spent: 10m

> For websocket connections, seeing java.lang.NullPointerException when a data frame arrives on websocket while connection is still being setup
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KNOX-1997
>                 URL: https://issues.apache.org/jira/browse/KNOX-1997
>             Project: Apache Knox
>          Issue Type: Bug
>          Components: Server
>    Affects Versions: 1.0.0, 1.1.0, 1.2.0, 1.3.0
>         Environment: HDP 3.1.0
>            Reporter: Rajat Goel
>            Assignee: Rajat Goel
>            Priority: Major
>             Fix For: 1.4.0
>
>         Attachments: KNOX-1997.patch, gateway.log_26Aug.gz
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> I am seeing following exception in Knox gateway logs:
> {code:java}
> 2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection)_
>  java.lang.NullPointerException
>  at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260)
>  at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87)
>  at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78)
>  at org.eclipse.jetty.websocket.jsr356.messages.TextWholeMessage.messageComplete(TextWholeMessage.java:56)
>  at org.eclipse.jetty.websocket.jsr356.endpoints.JsrEndpointEventDriver.onTextFrame(JsrEndpointEventDriver.java:218)
>  at org.eclipse.jetty.websocket.common.events.AbstractEventDriver.incomingFrame(AbstractEventDriver.java:162)
>  at org.eclipse.jetty.websocket.common.WebSocketSession.incomingFrame(WebSocketSession.java:476)
>  at org.eclipse.jetty.websocket.common.extensions.ExtensionStack.incomingFrame(ExtensionStack.java:220)
>  at org.eclipse.jetty.websocket.common.Parser.notifyFrame(Parser.java:219)
>  at org.eclipse.jetty.websocket.common.Parser.parse(Parser.java:244)
>  at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onConnectionStateChange(AbstractWebSocketConnection.java:337)
>  at org.eclipse.jetty.websocket.common.io.IOState.notifyStateListeners(IOState.java:184)
>  at org.eclipse.jetty.websocket.common.io.IOState.onOpened(IOState.java:447)
>  at org.eclipse.jetty.websocket.common.WebSocketSession.open(WebSocketSession.java:619)
>  at org.eclipse.jetty.websocket.common.WebSocketSession.onOpened(WebSocketSession.java:544)
>  at org.eclipse.jetty.io.AbstractConnection.onOpened(AbstractConnection.java:209)
>  at org.eclipse.jetty.io.AbstractConnection.onOpen(AbstractConnection.java:202)
>  at org.eclipse.jetty.websocket.common.io.AbstractWebSocketConnection.onOpen(AbstractWebSocketConnection.java:446)
>  at org.eclipse.jetty.io.AbstractEndPoint.upgrade(AbstractEndPoint.java:440)
>  at org.eclipse.jetty.websocket.client.WebSocketUpgradeRequest.upgrade(WebSocketUpgradeRequest.java:624)
>  at org.eclipse.jetty.client.http.HttpChannelOverHTTP.exchangeTerminating(HttpChannelOverHTTP.java:118)
>  at org.eclipse.jetty.client.HttpReceiver.terminateResponse(HttpReceiver.java:462)
>  at org.eclipse.jetty.client.HttpReceiver.responseSuccess(HttpReceiver.java:416)  {code}
> I added more debug logs in Knox to analyze the issue and observed following was happening:
>  * Knox generates proper URL for backend websocket connection
> {code:java}
> 2019-08-23 12:43:46,068 DEBUG gateway.websockets (GatewayWebsocketHandler.java:createWebSocket(148)) - Message: Rajat .. backednUrl = ws://rafd001-mst-01.cloud.in.guavus.com:11011/_sock/411/anq2kfzp/websocket{code}
>  * Websocket upgrade successful at HTTP protocol layer. Knox starts setting up its session management internal data structures for this connection 
> {code:java}
> 2019-08-23 12:43:46,313 DEBUG client.HttpReceiver (HttpReceiver.java:responseSuccess(403)) - Response success HttpResponse[HTTP/1.1 101 Switching Protocols]@708f1c60
>  _2019-08-23 12:43:46,324 DEBUG common.WebSocketSession (WebSocketSession.java:doStart(248)) - starting - WebSocketSession[websocket=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],behavior=CLIENT,connection=WebSocketClientConnection@1528d316::SocketChannelEndPoint@10d45033
> {[rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011|http://rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011]<->/[192.168.133.69:50702|http://192.168.133.69:50702/],OPEN,fill=-,flush=-,to=23/300000}
> {io=0/0,kio=0,kro=1} > r:rafd001-mst-01.cloud.in.guavus.com/192.168.133.69:11011,closed=false)=>HttpChannelOverHTTP@26de2594(exchange=HttpExchange@7e8b6268 req=TERMINATED/null@null res=TERMINATED/null@null)[send=HttpSenderOverHTTP@73b692d4(req=QUEUED,snd=COMPLETED,failure=null)[HttpGenerator@4e3bb764\{s=START}],recv=HttpReceiverOverHTTP@7d5a172b(rsp=IDLE,failure=null)[HttpParser\{s=START,0 of1}]],remote=null,incoming=JsrEndpointEventDriver[org.apache.knox.gateway.websockets.ProxyInboundClient],outgoing=ExtensionStack[queueSize=0,extensions=[],incoming=org.eclipse.jetty.websocket.jsr356.JsrSession,outgoing=org.eclipse.jetty.websocket.client.io.WebSocketClientConnection]]{code}
>  * A data frame arrives on the connection from CDAP backend and Knox starts processing it in a separate thread but internal data structures have not yet been established fully:
> {code:java}
> 2019-08-23 12:43:46,346 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onMessageText(257)) - Message: Rajat .. In onMessageText .. remote is nulll
>  2019-08-23 12:43:46,347 WARN websockets.ProxyInboundClient (AbstractEventDriver.java:unhandled(251)) - Unhandled Error (closing connection)
>  java.lang.NullPointerException
>  at org.apache.knox.gateway.websockets.ProxyWebSocketAdapter$3.onMessageText(ProxyWebSocketAdapter.java:260)
>  at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:87)
>  at org.apache.knox.gateway.websockets.ProxyInboundClient$2.onMessage(ProxyInboundClient.java:78){code}
>  * As a result of this exception, Knox starts closing this connection as well as connection with front end.
> {code:java}
> 2019-08-23 12:43:46,347 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onError(246)) - Message: Rajat .. inside onError
>  2019-08-23 12:43:46,348 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(183)) - Message: Rajat .. inside cleanupOnError
>  2019-08-23 12:43:46,348 ERROR gateway.websockets (ProxyWebSocketAdapter.java:cleanupOnError(185)) - Error: java.lang.NullPointerException
>  2019-08-23 12:43:46,349 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:closeQuietly(294)) - Message: Rajat .. in closeQuietly{code}
>  * Internal data structures are now setup in the other thread
> {code:java}
> 2019-08-23 12:43:46,356 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(113)) - Message: Rajat ... frontEndSession = WebSocketSession[websocket=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],behavior=SERVER,connection=WebSocketServerConnection@151585ba::DecryptedEndPoint@4463419a {/[192.168.110.254:61755|http://192.168.110.254:61755/]<>/[192.168.133.69:8443|http://192.168.133.69:8443/],OPEN,fill=,flush=,to=423/300000}
> ,remote=WebSocketRemoteEndpoint@66632a26[batching=true],incoming=JettyListenerEventDriver[org.apache.knox.gateway.websockets.ProxyWebSocketAdapter],outgoing=ExtensionStack[queueSize=0,extensions=[permessage-deflate],incoming=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension,outgoing=org.eclipse.jetty.websocket.common.extensions.compress.PerMessageDeflateExtension]]
>  2019-08-23 12:43:46,358 DEBUG gateway.websockets (ProxyWebSocketAdapter.java:onWebSocketConnect(119)) - Message: Rajat ...front remote = WebSocketRemoteEndpoint@66632a26[batching=true]{code}
>  * Fronted UI sees connection closed and send a new connection request.
> Looks like there is a race condition/improper synchronization between two Knox threads: one thread opening a websocket connection, setting up the connection session and the other thread processing a message/data packet sent by UI service backend.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)