You are viewing a plain text version of this content. The canonical link for it is here.
Posted to notifications@superset.apache.org by GitBox <gi...@apache.org> on 2020/02/21 23:57:43 UTC

[GitHub] [incubator-superset] robdiciuccio opened a new issue #9190: [SIP-39] Global Async Query Support

robdiciuccio opened a new issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190
 
 
   ## [SIP-39] Global Async Query Support
   
   ### Motivation
   
   - Loading time on dashboards with many charts is excessive due to the number of concurrent connections to the server.
   - Synchronous connections left open waiting for result data to be returned block other requests to the server.
   - Browsers limit the number of concurrent connections to the same host to six, causing bottlenecks.
   - Some of this bottleneck is currently mitigated via the dashboard tabs feature, which defers loading of charts in background tabs. This lazy-loading could also be applied to charts outside of the viewport.
   - [There is a proposal for SQL Lab to move away from tabs in the UI](https://github.com/apache/incubator-superset/issues/8655), altering the current async query polling mechanisms.
   - Provide a standardized query interface for dashboards, charts and SQL Lab queries.
   
   ### Proposed Change
   
   Provide a configuration setting to enable async data loading for charts (dashboards, Explore) and SQL Lab. Instead of multiple synchronous requests to the server to load dashboard charts, we issue async requests to the server which enqueue a background job and return a job ID. A single persistent websocket connection to the server is opened to listen for results in a realtime fashion.
   
   ## Websockets via a sidecar application
   
   **Pros**
   
   - Excellent websocket browser support
   - Realtime synchronous communication channel
   - Browser connections limits not an issue ([255 in Chrome](https://bugs.chromium.org/p/chromium/issues/detail?id=429419))
   - Supports bi-directional communication (possible future use cases)
   - Authentication/authorization via initial HTTP request that is then upgraded to a WSS connection.
   
   **Cons**
   
   - Minor modifications to proxies, load balancers to support websocket connections
       - Sticky sessions not required at the load balancer if the reconnection story is sound
   - Requires persistent connection to the server for *each tab*
       - [SharedWorker](https://developer.mozilla.org/en-US/docs/Web/API/SharedWorker) also evaluated, but support is lacking
   - AWS "Classic" ELB [does not support Websockets](https://aws.amazon.com/elasticloadbalancing/features/). Requires use of NLB (TCP) or ALB in AWS environments.
   
   ![Superset async query arch - Websockets v2 (2)](https://user-images.githubusercontent.com/296227/75080027-9f84f900-54bf-11ea-8494-ebd531e33450.png)
   
   ### Approach
   
   Each open tab of Superset would create a unique "channel" ID to subscribe to. A websocket connection is established with the standalone websocket server as an HTTP request that is then upgraded to `wss://` if authentication with the main Flask app is successful. Requests for charts or queries in SQL Lab are sent via HTTP to the Flask web app, including the tab's channel ID. The server enqueues a celery job and returns a job ID to the caller. When the job completes, a notification is sent over the WSS connection, which triggers another HTTP request to fetch the cached data and display the results.
   
   **Why a separate application?**
   
   The current Flask application is not well suited for persistent websocket connections. We have evaluated several Python and Flask-based solutions, including [flask-SocketIO](https://flask-socketio.readthedocs.io/en/latest/) and [others](https://pgjones.gitlab.io/quart/), and have found the architectural changes to the Flask web app to be overly invasive. For that reason, we propose that the websocket service be a standalone application, decoupled from the main Flask application, with minimal integration points. Superset's extensive use of Javascript and the mature Node.js [websocket libraries](https://www.npmjs.com/package/ws) make Node.js and TypeScript ([SIP-36](https://github.com/apache/incubator-superset/issues/9101)) an obvious choice for implementing the sidecar application.
   
   **Reconnection**
   
   The nature of persistent connections is that they will, at some point, disconnect. The system should be able to reconnect and "catch up" on any missed events. We evaluated several PubSub solutions (mainly in Redis) that could enable this durable reconnection story, and have determined that [Redis Streams](https://redis.io/topics/streams-intro) (Redis ≥ 5.0) fits this use case well. By storing a last received event ID, the client can pass that ID when reconnecting to fetch all messages in the channel from that point forward. For security reasons, we should periodically force the client to reconnect to revalidate authentication status.
   
   **Why not send result data over the socket connection?**
   
   While it is possible to send result data over the websocket connection, keeping the scope of the standalone service to event notifications will reduce the security footprint of the sidecar application. Fetching (potentially sensitive) data will still require necessary authentication and authorization checks at load time by routing through the Flask web app. Sending large datasets over the websocket protocol introduces potential unknown performance and consistency issues of its own. Websockets are not streams, and "[the client will only be notified about a new message once all of the frames have been received and the original message payload has been reconstructed.](https://codeburst.io/polling-vs-sse-vs-websocket-how-to-choose-the-right-one-1859e4e13bd9)"
   
   **Query Cancellation**
   
   Queries may be "cancelled" by calling the `/superset/stop_query` endpoint in SQL Lab, which simply sets `query.status = QueryStatus.STOPPED` for the running query. This cancellation logic is currently implemented only for queries running against [hive](https://github.com/apache/incubator-superset/blob/607cfd1f29736590fbba397c4f8a04526be66aff/superset/db_engine_specs/hive.py#L264) and [presto](https://github.com/apache/incubator-superset/blob/607cfd1f29736590fbba397c4f8a04526be66aff/superset/db_engine_specs/presto.py#L730) databases. Queries that have been enqueued could be cancelled prior to executing the query by adding a check in the celery worker logic. It is also possible to [revoke a Celery task](https://docs.celeryproject.org/en/stable/userguide/workers.html#revoke-revoking-tasks), which will skip execution of the task, but it won’t terminate an already executing task. Due to the limited query cancellation support in DB-API drivers, some of which is [discussed here](https://github.com/apache/incubator-superset/issues/747), comprehensive query cancellation functionality should be explored in a separate SIP. That said, query cancellation requests may still be issued to the existing (or similar) endpoint when users intentionally navigate away from a dashboard or Explore view with charts in a loading state.
   
   **Query deduplication**
   
   The `query` table in the Superset metadata database currently includes only queries run via SQL Lab. Adapting this for use with dashboards and charts may have a larger impact than we're willing to accept at this time. Instead, a separate key-value store (e.g. Redis) may be used for tracking and preventing duplicate queries. Using a fast KV store also allows us to check for duplicate queries more efficiently in the web request cycle.
   
   Each query issued to the backend can be fingerprinted using a hashing algorithm (SHA-256 or similar) to generate a unique key based on the following:
   
   - Normalized SQL query text
   - Database ID
   - Database schema
   
   Prior to executing a query, a hash (key) is generated and checked against a key-value store. If the key does not exist, it is stored with a configured TTL, containing an object value with the following properties:
   
   - Query state (running, success, error)
   - Query type (chart, sql_lab)
   - Result key (cache key)
   
   Another key is created to track the Channels and Job IDs that should be notified when this query completes (e.g. `<hash>_jobs`→ `List[ChannelId:JobId]`). If a duplicate query is issued while currently running, the Job ID is pushed onto the list, and all relevant channels are notified via websocket when the query completes. If a query is issued and a cache key exists with `state == "success"`, a notification is triggered immediately via websocket to the client. If queries are "force" refreshed, query deduplication is performed only for currently running queries.
   
   ### New or Changed Public Interfaces
   
   - New configuration options
   - Additional API endpoints (see Migration Plan, below)
   
   ### New dependencies (optional)
   
   - Redis ≥ 5.0 (pubsub, Redis Streams) or alternative
   - Node.js runtime for websocket application
   
   ### Migration Plan and Compatibility
   
   Asynchronous query operations are currently an optional enhancement to Superset, and should remain that way. Configuring and running Celery workers should not be required for basic Superset operation. As such, this proposed websocket approach to async query operations should be an optional enhancement to Superset, available via a configuration flag.
   
   For users who opt to run Superset in full async mode, the following requirements will apply under the current proposal:
   
   - Chart cache backend
   - SQL Lab results backend
   - Pubsub backend (current recommendation: Redis Streams)
   - Running the sidecar Node.js websocket application
   
   [Browsers that do not support websockets](https://caniuse.com/#feat=websockets) (very few) should fallback to synchronous operation or short polling.
   
   **Migration plan:**
   
   - Introduce new configuration options to enable asynchronous operation in Superset. Pass these values to the frontend as feature flags (as appropriate)
   - Introduce new API endpoints in Superset's Flask web application for async query/data loading for charts and queries
   - Introduce new API endpoints in Superset's Flask web application required for interaction with the sidecar application
   - Build and deploy the Node.js websocket server application
   - Build and deploy async frontend code to consume websocket events, under feature flag
   
   ### Rejected Alternatives
   
   ### SSE (EventSource) over HTTP/2
   
   [Server-Sent Events](https://developer.mozilla.org/en-US/docs/Web/API/Server-sent_events/Using_server-sent_events) (SSE) streams data over a multiplexed HTTP/2 connection. SSE is a native HTML5 feature that allows the server to keep the HTTP connection open and push data to the client. Thanks to the multiplexing feature of the HTTP/2 protocol, the number of concurrent requests per domain is not limited to 6-8, but it is virtually unlimited.
   
   **Pros**
   
   - Single, multiplexed connection to the server (via HTTP/2)
       - Connections can be limited via `SETTINGS_MAX_CONCURRENT_STREAMS`, but this [should be no smaller than 100](https://http2.github.io/http2-spec/)
   - Events pushed to the client from the server in realtime
   - Supports streaming data and traditional HTTP authentication methods
   - Minor adjustments to proxy, load balancers
   - Can use async [generators](https://medium.com/code-zen/python-generator-and-html-server-sent-events-3cdf14140e56) with Flask & gevent/eventlet (untested)
   
   **Cons**
   
   - HTTP/2 not supported in [some browsers](https://caniuse.com/#feat=http2) (IE)
   - SSE requires [polyfill](https://github.com/Yaffle/EventSource) for [some browsers](https://caniuse.com/#feat=eventsource)
   - Requires persistent connection to the server for *each tab*
   - HTTP/2 requires TLS (not really a con, per se)
   - If SSE is used without HTTP/2, persistent SSE connections may saturate browser connection limit with multiple tabs
   - AWS "Classic" ELB [does not support HTTP/2](https://aws.amazon.com/elasticloadbalancing/features/). ALB supports HTTP/2, but apparently converts it to HTTP/1.1 upstream (AWS-specific)
   - Does not support upstream messaging (from the client to the server)
   
   _NOTE: HTTP/2 multiplexing could still potentially be valuable alongside the websocket features, and should be investigated further._
   
   ### Long polling (aka Comet)
   
   **Pros**
   
   - Excellent browser support (a standard HTTP request)
   
   **Cons**
   
   - Requires long-lived connections
   - Browser limit on concurrent connections to the same domain (6-8) results in connection blocking (HTTP/1.1)
   
   ### Short polling
   
   **Pros**
   
   - Excellent browser support (a standard HTTP request)
   - It's the current solution for async queries in SQL Lab
   
   **Cons**
   
   - Browser limit on concurrent connections to the same domain (6-8) results in connection blocking (HTTP/1.1)
   - Header overhead: every poll request and response contains a full set of HTTP headers
   - Auth overhead: each polling request must be authenticated and authorized on the server
   
   ---
   Thanks to @etr2460 @suddjian @nytai @willbarrett @craig-rueda for feedback and review.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] etr2460 commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
etr2460 commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-590599208
 
 
   We've seen significant load on both webservers and the metadata database managing the SQL Lab short polling requests. We're quite concerned that if we add short polling for every chart in a dashboard it would have a bunch of performance implications.
   
   One option might be to do a bulk short poll on dashboards, making a single short poll to track all the in flight chart requests.
   
   Also, we've seen issues around the short polling in SQL Lab where if any short poll fails, then the entire query is marked as failed on the front end. I wonder if the reliability of short polling makes it less ideal here than using websockets that reconnect when dropped

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] rmgpinto commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
rmgpinto commented on issue #9190:
URL: https://github.com/apache/superset/issues/9190#issuecomment-783453652


   When I enable `GLOBAL_ASYNC_QUERIES`, all native filters on dashboards yield `No Results`


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] villebro commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
villebro commented on issue #9190:
URL: https://github.com/apache/superset/issues/9190#issuecomment-783499569


   Thanks for reporting @rmgpinto - we'll put this on the list of issues to fix. Ping @junlincc 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-596832229
 
 
   @DiggidyDave Socket.io requires sticky sessions at the load balancer in order to properly perform long-polling: https://socket.io/docs/using-multiple-nodes/. Not sure if this is possible in your environment?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591235866
 
 
   I'm sure you've seen all of this @robdiciuccio but leaving it here for the general audience, it touches on a lot of the tradeoffs discussed here: https://moduscreate.com/blog/fast-polling-vs-websockets-2/
   
   I had forgotten that the polling transport used long polling. :-/ I think there are some options you can configure on the transport, but I'm less confident now that short-polling with socket.io is an option. When I used it previously, we were using websockets (not long polling) on `gunicorn` and `gevent` but we were not using any datasources so I can't speak to that specifically. But we did have to actually write code on the server side to manage some concurrency issues (like explicitly sleeping to yield the processor in certain places to keep things moving) so it might make sense that some db drivers are not out-of-the-box compatible if running in the same process as the socket server.
   
   I know I said I had said my 2 cents, but a couple more questions came to mind (sorry!):
   - are "many tabs" really a problem? do we need to be polling/streaming events constantly for inactive tabs? If only the 1-3 (for power users) active tabs a user will have are the ones we care about, could we only actively short-poll only on those active tabs (using page visibility apis)? upon switching tabs, a quick single, first poll request will see if results have come in for the new tab and cause them to be immediately fetched
   - if the above is true: if polling is active only for active tabs, and batched for dashboards, what are the remaining problems that require websockets?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-595963073
 
 
   We're going to take another look at using Socket.io in order to support environments where websockets are not an option.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-596877134
 
 
   Interesting, I'm not actually sure about that, I'd have to dig into it. :-/ The main thing I think we care about is _any_ client-side interface between the business logic and raw native websockets, to reserve that option to swap out the impl with a short-polling approach.  (socket.io is just a super popular wrapper that happens to have the transport abstraction built-in, as is https://github.com/sockjs/sockjs-client and others)
   
   There are other options that are not coupled with a server architecture, like this one that is a small client-side wrapper: https://github.com/lukeed/sockette
   
   Or this one that is a promise-based wrapper (obvs, a polling impl could fulfill promises just as easily as far as client code is concerned): https://github.com/vitalets/websocket-as-promised
   
   Or anything else. As long as websockets are behind an interface of some kind there will be a path to unblock environments that can't use ws.
   
   Something as simple as this wrapping ws would be perfectly fine:
   ```
   abstract class ServerClientEventBus {
       abstract postMessage(...): void;
       abstract receiveMessage(...): void;
   }
   
   class WSSEventBus : extends ServerClientEventBus {
      // etc...
   }
   ```
   Anyway, I appreciate you looking at that.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591168407
 
 
   Socket.io defaults to long-polling, upgrading to websockets if available. Using Socket.io without websockets would almost certainly saturate the browser's HTTP connection limit with just a few tabs open (without HTTP/2). This was the main reason behind recommending vanilla websockets vs Socket.io.
   
   I'm interested in hearing more about your experience with `flask-socketio`. Are you running this with `gunicorn` and `gevent`? This setup is [the current recommendation in the Superset docs](https://superset.incubator.apache.org/installation.html#a-proper-wsgi-http-server), but there are apparent incompatibilities with [database](https://www.psycopg.org/docs/advanced.html#green-support) [drivers](https://github.com/PyMySQL/mysqlclient-python/pull/285) and green threads. This lack of clarity around green thread support in Superset is another reason for recommending handling persistent connections in a separate application. I'd also be interested in hearing if others are using `gevent` or `eventlet` successfully in production.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-590504096
 
 
   WebSockets are pretty heavy-handed here, in my opinion. Short polling works well and is very load balancer friendly.  The "con" listed above due to connection limits needn't be an insurmountable problem since these are quick polling calls (or am I missing something?).
   
   I'm pretty concerned about the practical implications of this deploying in corporate environments. Routing HTTP is a well-understood, easy-to-scale problem that is most-likely to play nicely with any environment folks are deploying in and won't require infra changes to LBs etc which are not always under the control of the Superset "owners". I'm just not seeing the ROI on this vs keeping it simple.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594242375
 
 
   @DiggidyDave The use cases for long-running queries in SQL Lab and loading dashboards in a performant manner feel like very different use cases. Polling could (and currently does) satisfy the SQL Lab use case, though there is significant room for improvement there. The latency introduced by polling in dashboards would be a much larger impact to user experience.
   
   Agreed that this discussion is healthy! Architecture proposals should be actively debated and scrutinized so we all benefit from better informed decision-making.
   
   @williaster `@superset-ui` will need to be upgraded to also support [WebSocket](https://developer.mozilla.org/en-US/docs/Web/API/WebSocket) connections. This upgrade should be backwards compatible and configured via feature flags. [CORS does not apply to WebSockets](https://blog.securityevaluators.com/websockets-not-bound-by-cors-does-this-mean-2e7819374acc), but securing the WS connections by validating origin domain is recommended, so we should keep the cross-domain use case in mind during implementation of these security restrictions.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594117410
 
 
   Feedback on the alternate solution:
   
   - Sub-second polling falls down when latency between client and server increases. The polling implementation gets increasingly complex as we try to meet a similar level of service as what we would get from websockets.
   - Transitioning polling between tabs and tracking state for all of the tabs adds substantial complexity to the front-end which is unnecessary with a single websocket connection per tab.
   - By using short-polling, we may have slightly decreased the complexity of deployments, but conversely we have increased the complexity of the front-end. I would prefer to see more complexity server-side where issues are more visible vs. on the front-end, where issues can be harder to track.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591076741
 
 
   Short polling is an option, but it adds considerable load to the metadata database at scale to authenticate and check permissions for each pending object in the polling request(s). If the user has a number of open tabs, we then have each client potentially hammering the metadata DB pretty hard. Contrast this with the websocket option (or SSE+HTTP/2), which requires a single authentication action upon connect, and a single authorization for each result set only when fetched.
   
   Websockets are more work to setup at the infrastructure level, though there are similar concerns with enabling HTTP/2 in many load balancers, without which, SSE is not a viable option. My initial inclination when drafting this SIP was to recommend SSE+HTTP/2 rather than websockets, but the Flask app is not well suited for persistent connections, making a sidecar app more feasible. Websockets are also arguably more ubiquitous for realtime communication at the client at this point. With regard to scaling, the fact that sticky sessions are not required due to the reconnection strategy allows for flexible horizontal scaling of the websocket sidecar app, and there are several patterns for load balancing websocket servers. Are there specific scaling or implementation concerns that we should address?
   
   Async query support is currently optional in Superset, and should remain so, IMO. The async solution we agree upon should be a balance of performance and feasibility, but we should consider short polling for a fallback if websockets are not available for whatever reason.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591095450
 
 
   For the metadata/auth load problem, could we just issue short-lived tokens for the session with cache to remove the need to hammer that database? (Decomposition into simple problems with simple solutions may be preferable)
   
   Last pushback on the bigger idea 😁: Websockets have their uses--for example, I've found them very useful in the past for things like streaming 60hz point cloud updates and telemetry data--but ultimately they are an RTC protocol and that is really not the situation we have here, and to me this feels like overkill. I concede it will get the job done, but I fear it will be unnecessarily "expensive", it will introduce complexity, it will introduce a host of new bugs and other fallout, and I think the infra/ops requirements may lead to reduced adoption in environments where there is friction to messing with LBs etc.
   
   I think I've said my 2 cents on that... **looking forward**, if the community agrees to go with WebSockets i would ask that the SIP be modified to account for these concerns and keep it enterprise friendly. A couple of requests for consideration:
   - **consider separating the dependency on websockets' interface from the transport layer** using something like socket.io, which has the nice property of being configurable to use the polling transport layer and avoiding the use of actual websockets
   - **reconsider using flask-socketio to avoid need for sidecar**: the SIP doesn't elaborate on why it is too invasive, but in my experience using `flask-socketio` it is very clean and is as simple as adding a route in flask to expose a websocket endpoint.  If architected thoughtfully (with some very light configuration on top), having flask endpoints exposing the websockets on top of socket.io would give users the flexibility to use a sidecar or not, and to use websockets or polling at their discretion.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594125686
 
 
   To the first bullet: I don't disagree at all that websockets have the potential to outperform polling w.r.t. frontend-to-backend latency (assuming the backend--which is, of course, polling something, somewhere--is written efficiently). What I am questioning is **whether reducing the extra 500-1000ms max of latency for long-lived queries (and 250-500ms for short-lived ones) on non-active tabs is actually a requirement** (I obviously think it is not) and whether it is worth complicating and destabilizing Superset over it.
   
   OK, I genuinely think I've said my peace now :-) Thanks for the good faith back-and-forth here. I think its healthy.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave edited a comment on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave edited a comment on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591235866
 
 
   I'm sure you've seen all of this @robdiciuccio but leaving it here for the general audience, it touches on a lot of the tradeoffs discussed here: https://moduscreate.com/blog/fast-polling-vs-websockets-2/
   
   I had forgotten that the polling transport used long polling. :-/ I think there are some options you can configure on the transport, but I'm less confident now that short-polling with socket.io is an option. When I used it previously, we were using websockets (not long polling) on `gunicorn` and `gevent` but we were not using any datasources so I can't speak to that specifically. But we did have to actually write code on the server side to manage some concurrency issues (like explicitly sleeping to yield the processor in certain places to keep things moving) so it might make sense that some db drivers are not out-of-the-box compatible if running in the same process as the socket server.
   
   I know I said I had said my 2 cents, but a couple more questions came to mind (sorry!):
   - are "many tabs" really a problem? do we need to be polling/streaming events constantly for inactive tabs? If only the 1-3 (for power users with big desktops with side by side windows) active tabs a user will have are the ones we care about, could we only actively short-poll on those active tabs (using page visibility apis)? upon switching tabs, a quick single, first poll request will see if results have come in for the new tab and cause them to be immediately fetched
   - if the above is true: if polling is active only for active tabs, and batched for dashboards, what are the remaining problems that require websockets?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett edited a comment on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
willbarrett edited a comment on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594097266
 
 
   Hi @DiggidyDave I want to chime in here. I don't think short-polling is a good option architecturally. Short-polling necessarily introduces latency as the system waits between polling intervals. The lower the latency desired in the system, the greater the load placed on the server to answer short-poll requests. Maintaining an open websocket connection solves this problem.
   
   To respond to your question on tabs, yes, I think "many tabs" is a problem. Our user research indicates that a small number of power-users create the majority of content inside of organizations. These users tend to have job titles like "business analyst", and for them a large number of open tabs is the norm. For this reason I would say we want non-active tabs to be actively updated. Think of a situation where you fire a query against a data warehouse in one tab, then switch away to work on something else. Ideally, there would be some manner of notification on update to let the user know that the query has finished. This is possible with active websockets without increasing server overhead. With short-polling, we can DDOS ourselves from inactive tabs but if we disable short-polling on inactive tabs we're going to negatively impact the user experience.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] williaster edited a comment on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
williaster edited a comment on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594176339
 
 
   > New dependencies (optional) Node.js runtime for websocket application
   
   What are the expected `node` versions to be supported?
   
   I also don't see any discussion about the migration plan for the `@superset-ui` client which is used throughout the app, and only supports `http` `fetch`ing at the moment, can you provide details for the plan there? How does a `http` => `wss` upgrade work with `fetch`? Will the `wss` client be added to `@superset-ui`?
   
   Additionally, we rely heavily on the `@superset-ui` client for embedding charts in other applications which currently relies on `CORS`. Can you discuss the implications/interactions between `CORS` and `wss`? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-597354093
 
 
   @DiggidyDave the abstraction approach sounds good, as it appears that Socket.io is not really appropriate for this use case. I've updated the body of the SIP with notes on the client-side abstraction.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] williaster commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
williaster commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594176339
 
 
   > New dependencies (optional) Node.js runtime for websocket application
   What are the expected `node` versions to be supported?
   
   I also don't see any discussion about the migration plan for the `@superset-ui` client which is used throughout the app, and only supports `http` `fetch`ing at the moment, can you provide details for the plan there? How does a `http` => `wss` upgrade work with `fetch`? Will the `wss` client be added to `@superset-ui`?
   
   Additionally, we rely heavily on the `@superset-ui` client for embedding charts in other applications which currently relies on `CORS`. Can you discuss the implications/interactions between `CORS` and `wss`? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-590609298
 
 
   The bulk polling could make sense to reduce load, but my bigger point here is that scaling a fleet of HTTP servers (or even routing a particular endpoint to a distinct dedicated fleet) is a pretty simple problem that can be accomodated in most environments, whereas websockets and sidecars have a lot of interesting problems in that area and may not even be feasible in some environments.
   
   For the last point, that sounds like a front-end bug, not a problem inherent with small HTTP GET requests for short polling. Polling is much simpler in terms of the tech involved (and can be stateless), so I am skeptical that small front-end bugs will cease to be a problem if we adopt a more complicated technology to replace the foundation.
   
   And to be clear, I'm not meaning to dismiss the idea of websockets, I just want to make sure that the simpler alternative is discussed and the trade-offs.
   
   It also might be worth discussing the relative strength of the "cons" for SSE.  "Some browsers" don't support it, which means the subset of the IE11 users (1.55% globally) that is not on Windows 10, as well as Opera.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] robdiciuccio commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio commented on issue #9190:
URL: https://github.com/apache/superset/issues/9190#issuecomment-794281725


   The vote for this SIP PASSED with 5 binding +1, 3 non binding +1 and 0 -1 votes on 3/21/2020
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-595965008
 
 
   Much appreciated. FWIW here is a bit about bypassing the intial/default logpolling connection (which it usually establishes first as a fallback): https://socket.io/docs/client-api/#With-websocket-transport-only

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [superset] robdiciuccio closed issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
robdiciuccio closed issue #9190:
URL: https://github.com/apache/superset/issues/9190


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594104055
 
 
   Those are valid points, but quick alternative solutions:
   - The latency tolerances we are talking about here are "fluid".  Polling can initially be sub-second and backoff as execution time increases... someone waiting for a 10 minute query isn't going to notice 1-2 seconds of poll interval.
   - inactive tabs can be rolled into batch polling with ephemeral (not structural) `localStorage` to let only the active tab poll for a batch of active queries... polling operations should be very lightweight on the server side and lean heavily on cache.
   
   Like I said, I agree websockets will work. I'm just afraid it will cause a lot of complexity, instability, and will cause reduced adoption, in exchange for little or no benefit over simpler approaches.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] williaster edited a comment on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
williaster edited a comment on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594176339
 
 
   > New dependencies (optional) Node.js runtime for websocket application
   
   What are the expected `node` versions to be supported?
   
   I also don't see any discussion about the migration plan/implications for the `@superset-ui` client which is used throughout the app, and only supports `http` `fetch`ing at the moment, can you provide details for the plan there? How does a `http` => `wss` upgrade work with `fetch`? Will the `wss` client be added to `@superset-ui`?
   
   Additionally, we rely heavily on the `@superset-ui` client for embedding charts in other applications which currently relies on `CORS`. Can you discuss the implications/interactions between `CORS` and `wss`? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] willbarrett commented on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
willbarrett commented on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-594097266
 
 
   Hi @DiggidyDave I want to chime in here. I don't think short-polling is a good option architecturally. Short-polling necessarily introduces latency as the system waits between polling intervals. The lower the latency desired in the system, the greater the load placed on the server to answer short-poll requests. Maintaining an open websocket connection solves this problem.
   
   To respond to your question on tabs, yes, I think "many tabs" is a problem. Our user research indicates that a small number of power-users create the majority of content inside of organizations. These users tend to have job titles like "business analyst", and for them a large number of open tabs is the norm. For this reason I would say we want non-active tabs to be actively updated. Think of a situation where you fire a query against a data warehouse in one tab, then switch away to work on something else. Ideally, there would be some manner of notification on update to let the user know that the query has finished. This is possible with active websockets without increasing server overhead. With short-polling, we can DDOS ourselves, but if we disable short-polling on inactive tabs, we're going to negatively impact the user experience.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org


[GitHub] [incubator-superset] DiggidyDave edited a comment on issue #9190: [SIP-39] Global Async Query Support

Posted by GitBox <gi...@apache.org>.
DiggidyDave edited a comment on issue #9190: [SIP-39] Global Async Query Support
URL: https://github.com/apache/incubator-superset/issues/9190#issuecomment-591235866
 
 
   I'm sure you've seen all of this @robdiciuccio but leaving it here for the general audience, it touches on a lot of the tradeoffs discussed here: https://moduscreate.com/blog/fast-polling-vs-websockets-2/
   
   I had forgotten that the polling transport used long polling. :-/ I think there are some options you can configure on the transport, but I'm less confident now that short-polling with socket.io is an option. When I used it previously, we were using websockets (not long polling) on `gunicorn` and `gevent` but we were not using any datasources so I can't speak to that specifically. But we did have to actually write code on the server side to manage some concurrency issues (like explicitly sleeping to yield the processor in certain places to keep things moving) so it might make sense that some db drivers are not out-of-the-box compatible if running in the same process as the socket server.
   
   I know I said I had said my 2 cents, but a couple more questions came to mind (sorry!):
   - are "many tabs" really a problem? do we need to be polling/streaming events constantly for inactive tabs? If only the 1-3 (for power users with big desktops with side by side windows) active tabs a user will have are the ones we care about, could we only actively short-poll only on those active tabs (using page visibility apis)? upon switching tabs, a quick single, first poll request will see if results have come in for the new tab and cause them to be immediately fetched
   - if the above is true: if polling is active only for active tabs, and batched for dashboards, what are the remaining problems that require websockets?
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@superset.apache.org
For additional commands, e-mail: notifications-help@superset.apache.org