Posted to dev@phoenix.apache.org by "Josh Elser (JIRA)" <ji...@apache.org> on 2016/01/28 01:23:39 UTC

[jira] [Commented] (PHOENIX-2634) Dynamic service discovery for QueryServer

    [ https://issues.apache.org/jira/browse/PHOENIX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120495#comment-15120495 ] 

Josh Elser commented on PHOENIX-2634:
-------------------------------------

Hi [~warwithin]. This would be neat to play with some more.

I've experimented with putting multiple PQS instances behind a "dumb" load balancer (haproxy, specifically) with success. This has some edge cases (which I've talked with [~jamestaylor] about somewhere previously), notably automatically resuming failed queries (assuming a static dataset). These are the same sorts of problems you'd have to address to implement something like pagination/cursors.

I've also added a [new attribute|http://calcite.apache.org/docs/avatica_protobuf_reference.html#rpcmetadata] that is returned by PQS at the wire-level for every request. This would let you implement your own client-routing decisions so that you could have full control over how a client "routes" its requests. This is just a hammer though, not a house.
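
To illustrate the kind of client-side routing this attribute enables, here is a minimal sketch of "sticky" routing: once a response reveals which PQS instance served a connection, later requests for that connection go back to the same instance. The {{StickyRouter}} class is hypothetical, and the dictionary keys {{rpcMetadata}} and {{serverAddress}} are assumptions about how the decoded response looks; adjust them to match whatever your client actually parses off the wire.

```python
from typing import Dict, List


class StickyRouter:
    """Hypothetical helper: pins each connection to the server that answered it."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._next = 0                      # round-robin cursor for unpinned connections
        self._pinned: Dict[str, str] = {}   # connection id -> server address

    def pick(self, connection_id: str) -> str:
        """Return the pinned server if known, otherwise the next one round-robin."""
        pinned = self._pinned.get(connection_id)
        if pinned is not None:
            return pinned
        server = self._servers[self._next % len(self._servers)]
        self._next += 1
        return server

    def record(self, connection_id: str, response: dict) -> None:
        """Remember the server named in the response metadata, if present.

        Assumption: the decoded response is a dict carrying
        {"rpcMetadata": {"serverAddress": "host:port"}}.
        """
        metadata = response.get("rpcMetadata") or {}
        address = metadata.get("serverAddress")
        if address:
            self._pinned[connection_id] = address

    def forget(self, connection_id: str) -> None:
        """Drop the pin, e.g. after the connection is closed or the server fails."""
        self._pinned.pop(connection_id, None)
```

A client would call {{pick()}} before each request and {{record()}} after each response. What to do when the pinned server dies is exactly the "resuming failed queries" edge case mentioned above, and this sketch does not attempt to solve it.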

When you start getting into load balancing and HA, service discovery also becomes an important piece (how do your clients actually find *where* your service is). YARN-913 introduced a "registry" which currently has a ZooKeeper-backed solution for service discovery. I believe there is some work on a DNS frontend for this, but I'm not sure of the state of it or where it's being tracked. There are many other systems out there which could be leveraged for this aspect.

So, this is a long-winded way to say: what do you think should actually be done? PQS is designed to scale horizontally already (as its REST-iness would imply), so what do you think the next step would be? Personally, I think we should try to improve the edge cases in running behind a "dumb" load balancer and then look into recommendations on how DNS could be put in front of that.

Clients can then use a single name to refer to some "farm" of PQS instances, with the load balancer handling the routing logic. This would provide HA, service discovery and load balancing.
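
As a rough sketch of what "a single name" looks like from the client side, the snippet below resolves one hostname and lists every endpoint it maps to (whether those are load balancers or PQS instances). The function name {{resolve_farm}}, the hostname {{pqs.example.com}} and the plain-HTTP URL shape are illustrative assumptions; 8765 is the usual PQS default port. The resolver is injectable so the logic can be exercised without touching real DNS.

```python
import socket
from typing import Callable, List


def resolve_farm(hostname: str, port: int,
                 resolver: Callable = socket.getaddrinfo) -> List[str]:
    """Return the distinct http://host:port/ endpoints behind one DNS name."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    endpoints: List[str] = []
    for family, _socktype, _proto, _canonname, sockaddr in infos:
        host = sockaddr[0]
        if family == socket.AF_INET6:
            host = f"[{host}]"          # IPv6 literals need brackets in URLs
        url = f"http://{host}:{port}/"
        if url not in endpoints:        # preserve resolver order, drop duplicates
            endpoints.append(url)
    return endpoints


if __name__ == "__main__":
    # Hypothetical name; replace with whatever name fronts your PQS farm.
    for endpoint in resolve_farm("pqs.example.com", 8765):
        print(endpoint)
```

With a load balancer in place the client would normally just connect to the name and never enumerate addresses itself; the enumeration here is only meant to show what sits behind the name.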

One of these days, I'll also try to write up some goodness to deploy PQS on top of Apache Slider to get some auto-magic scaling across a YARN instance. Not sure if my long-term vision would hinge on Slider or just be a deployment option.

> Dynamic service discovery for QueryServer
> -----------------------------------------
>
>                 Key: PHOENIX-2634
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2634
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: YoungWoo Kim
>
> It would be nice if Phoenix QueryServer supports a feature like HIVE-7935 for HA and load balancing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)