Posted to dev@calcite.apache.org by Gian Merlino <gi...@imply.io> on 2017/05/25 19:02:36 UTC

Avatica and load balancers

Is anyone out there using Avatica with servers (that don't share connection
state) behind load balancers? Is that a workable configuration? I'm
guessing it might be if sticky sessions are enabled on the load balancer.
What does the client do when the session switches to a new backend server?

I found a blog post that talks about some of these issues in the context of
Phoenix:
https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html

It seems to suggest that the client will retry queries and skip to the most
recently read offset. Is that behavior on by default? This sounds like it
won't work for a database that is accepting new data -- the query results
aren't generally going to be exact matches from run to run just due to new
rows being added. In that case, I'm struggling to think of any better
approach than failing the query and expecting the user to retry if they
want to.

Gian

Re: Avatica and load balancers

Posted by Josh Elser <jo...@gmail.com>.
Hi,

(Author of that post, here) -- That article heavily assumes two things 
for any degree of usability:

* Avatica servers generally "stay running"
* Once a user is routed to a backend server, it keeps getting routed to 
that server.
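
A minimal sketch of the second assumption, for illustration only: one way to keep a client on one server is to derive the backend from something stable about the client, here a connection id. The host names, the connection id value and the `pick_backend` helper are made up for this example; it is not a description of how haproxy or Avatica work internally.

```python
import hashlib

def pick_backend(connection_id, backends):
    """Map a connection id to one backend deterministically (sticky routing)."""
    digest = hashlib.sha256(connection_id.encode("utf-8")).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]

backends = ["pqs-1:8765", "pqs-2:8765", "pqs-3:8765"]  # hypothetical hosts
conn = "3f1c2a9e-hypothetical-connection-id"

first = pick_backend(conn, backends)
# Every request carrying the same connection id lands on the same server.
assert all(pick_backend(conn, backends) == first for _ in range(100))

# If that server goes away, the connection is re-routed to a server that
# holds none of its state, which is the failure case discussed in this thread.
survivors = [b for b in backends if b != first]
rerouted = pick_backend(conn, survivors)
assert rerouted != first
```

Stickiness only holds while the backend list is stable, which is why the first assumption (servers "stay running") matters as much as the second.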

I've done some single-node testing and this actually worked fairly well
for the level of effort I put in (download+run haproxy). It definitely
falls over in the case of long-running queries, but, admittedly, Avatica
doesn't do a great job in that case anyways (both when fetching the next
batch of results takes a long time and when fetching the total result
set takes a long time).

If either of the two assumptions above is violated often, clients will
definitely see increased (potentially significantly increased) latency.

On 5/25/17 3:02 PM, Gian Merlino wrote:
> Is anyone out there using Avatica with servers (that don't share connection
> state) behind load balancers? Is that a workable configuration? I'm
> guessing it might be if sticky sessions are enabled on the load balancer.
> What does the client do when the session switches to a new backend server?
> 
> I found a blog post that talks about some of these issues in the context of
> Phoenix:
> https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html
> 
> It seems to suggest that the client will retry queries and skip to the most
> recently read offset. Is that behavior on by default? This sounds like it
> won't work for a database that is accepting new data -- the query results
> aren't generally going to be exact matches from run to run just due to new
> rows being added. In that case, I'm struggling to think of any better
> approach than failing the query and expecting the user to retry if they
> want to.
> 
> Gian
> 

Re: Avatica and load balancers

Posted by Gian Merlino <gi...@imply.io>.
There isn't anything exposed to "normal" clients that would let them get
read consistency from call to call. Generally we expect people to retry
their operation from the beginning.

Gian

On Fri, May 26, 2017 at 10:05 AM, Julian Hyde <jh...@apache.org> wrote:

> If you want read consistency in Druid, is there a transaction id
> (snapshot id) you could use? Then you'll get the same results even if
> the result set fails over.

Re: Avatica and load balancers

Posted by Julian Hyde <jh...@apache.org>.
If you want read consistency in Druid, is there a transaction id
(snapshot id) you could use? Then you'll get the same results even if
the result set fails over.
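
To illustrate why a snapshot id would help, here is a toy sketch under stated assumptions. `VersionedTable` and its `snapshot_id` parameter are hypothetical and exist only for this example; they are not an API that Druid or Avatica is known to expose. The point is only that a query pinned to a snapshot returns the same rows when re-run, so resuming at an offset after failover is safe.

```python
class VersionedTable:
    """Toy append-only table; each row remembers the version that added it."""

    def __init__(self):
        self.version = 0
        self.rows = []  # list of (inserted_at_version, value)

    def insert(self, value):
        self.version += 1
        self.rows.append((self.version, value))

    def query(self, snapshot_id):
        """Return the sorted values visible as of snapshot_id."""
        return sorted(v for ver, v in self.rows if ver <= snapshot_id)

table = VersionedTable()
for value in (10, 20, 30, 40, 50):
    table.insert(value)

snapshot_id = table.version            # pin the snapshot when the query starts
seen = table.query(snapshot_id)[:3]    # client reads three rows, then fails over

table.insert(15)                       # new data arrives in the meantime

retry = table.query(snapshot_id)       # same snapshot, so the same result set
resumed = seen + retry[len(seen):]
assert resumed == [10, 20, 30, 40, 50]
```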

On Fri, May 26, 2017 at 8:37 AM, Josh Elser <jo...@gmail.com> wrote:
> The Phoenix thin client is 99.9% Avatica :)
>
> The only real configuration that comes to mind is specifying the
> "jdbc:phoenix:thin" jdbc url prefix

Re: Avatica and load balancers

Posted by Josh Elser <jo...@gmail.com>.
The Phoenix thin client is 99.9% Avatica :)

The only real configuration that comes to mind is specifying the
"jdbc:phoenix:thin" JDBC URL prefix.
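
For illustration, a small sketch of what a URL with that prefix can look like. Only the "jdbc:phoenix:thin" prefix comes from this message; the `url` and `serialization` properties, the host and the port are assumptions based on the usual Avatica-style connection string, and `thin_client_url` is a made-up helper. Check the exact properties against the version you run.

```python
def thin_client_url(host, port, serialization="PROTOBUF"):
    """Build a JDBC URL that uses the "jdbc:phoenix:thin" prefix.

    The url= and serialization= properties are assumed here, not taken
    from this thread.
    """
    return f"jdbc:phoenix:thin:url=http://{host}:{port};serialization={serialization}"

print(thin_client_url("localhost", 8765))
```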

On 5/25/17 7:25 PM, Gian Merlino wrote:
> I was thinking that client retries are probably a bad thing, though,
> insofar as they are based on offsets into the resultset. At least for
> Druid, it's not safe to assume a static dataset (new data could be
> continuously being inserted) and so transparent simple
> retry-and-skip-to-offset would lead to incorrect results.
> 
> In Phoenix do you use the Avatica client or did you write your own?
> 
> Gian

Re: Avatica and load balancers

Posted by Gian Merlino <gi...@imply.io>.
I was thinking that client retries are probably a bad thing, though,
insofar as they are based on offsets into the resultset. At least for
Druid, it's not safe to assume a static dataset (new data could be
continuously being inserted) and so transparent simple
retry-and-skip-to-offset would lead to incorrect results.
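
To make the concern concrete, here is a minimal simulation. It is a sketch, not Avatica's actual retry code: `run_query` is a stand-in for any query with a stable sort order, and the table is just a Python list.

```python
def run_query(table):
    """Stand-in for a query: return all rows in sorted order."""
    return sorted(table)

table = [10, 20, 30, 40, 50]

# First attempt: the client reads three rows, then the backend disappears.
seen = run_query(table)[:3]

# Meanwhile new data arrives that sorts before the client's current offset.
table.append(15)

# Transparent retry: re-run the query and skip the rows already read.
retry = run_query(table)
resumed = seen + retry[len(seen):]

# The stitched result matches neither the old nor the new state of the table:
# row 30 shows up twice and the new row 15 is never seen.
assert resumed != run_query([10, 20, 30, 40, 50])
assert resumed != run_query(table)
print(resumed)
```

The offset only identifies a position, not a row, so any insert that lands before the offset shifts everything after it.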

In Phoenix do you use the Avatica client or did you write your own?

Gian

On Thu, May 25, 2017 at 3:28 PM, James Taylor <ja...@apache.org>
wrote:

> Current plan is to keep the load balancer with Phoenix, but that could
> change if there's general interest. In the first cut, we weren't planning
> on doing any client retries, but that's probably a good idea to add.
>
>     James

Re: Avatica and load balancers

Posted by James Taylor <ja...@apache.org>.
Current plan is to keep the load balancer with Phoenix, but that could
change if there's general interest. In the first cut, we weren't planning
on doing any client retries, but that's probably a good idea to add.

    James

On Thu, May 25, 2017 at 1:28 PM, Gian Merlino <gi...@imply.io> wrote:

> I'm asking in the context of Druid, which also uses Avatica. Although it
> sounds like PHOENIX-3654 means Phoenix is intending to go in the direction
> of client-side load balancing. Are you planning to contribute the work to
> Avatica or keep it unique to Phoenix? We might want something similar in
> Druid; client side load balancing will probably work better than using a
> proxy, given how the servers don't share state at all.
>
> If a backend server goes down, would a PHOENIX-3654 style client
> transparently retry the query on a different server? Or would it fail and
> expect the user to retry?
>
> Gian

Re: Avatica and load balancers

Posted by Gian Merlino <gi...@imply.io>.
I'm asking in the context of Druid, which also uses Avatica. Although it
sounds like PHOENIX-3654 means Phoenix is intending to go in the direction
of client-side load balancing. Are you planning to contribute the work to
Avatica or keep it unique to Phoenix? We might want something similar in
Druid; client side load balancing will probably work better than using a
proxy, given how the servers don't share state at all.

If a backend server goes down, would a PHOENIX-3654 style client
transparently retry the query on a different server? Or would it fail and
expect the user to retry?
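
The two behaviors in that question can be sketched as follows. This is a hypothetical illustration, not the PHOENIX-3654 design: `execute_with_failover`, `BackendDown`, `fake_run` and the server names are all invented for the example. Note that the retry variant re-runs the whole query from the beginning on another server; it does not try to resume a partially read result set.

```python
class BackendDown(Exception):
    """Raised by `run` when a server cannot be reached."""

def execute_with_failover(servers, run, retry_on_other_server):
    """Try servers in order; `run(server)` returns rows or raises BackendDown."""
    if not servers:
        raise ValueError("no servers configured")
    last_error = None
    for server in servers:
        try:
            return server, run(server)
        except BackendDown as err:
            last_error = err
            if not retry_on_other_server:
                break  # fail fast and let the caller decide
    raise last_error

def fake_run(server):
    # Pretend the first server is down and the second one is healthy.
    if server == "broker-1:8082":
        raise BackendDown(server)
    return ["row-a", "row-b"]

servers = ["broker-1:8082", "broker-2:8082"]  # hypothetical hosts

server, rows = execute_with_failover(servers, fake_run, retry_on_other_server=True)
assert server == "broker-2:8082"

try:
    execute_with_failover(servers, fake_run, retry_on_other_server=False)
except BackendDown:
    print("query failed; the caller decides whether to retry")
```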

Gian

On Thu, May 25, 2017 at 12:55 PM, James Taylor <ja...@apache.org>
wrote:

> If you're using Avatica in the context of Phoenix, you might be interested
> in PHOENIX-3654 which is about adding a load balancer to the Phoenix Query
> Server.
>
> Thanks,
> James

Re: Avatica and load balancers

Posted by James Taylor <ja...@apache.org>.
If you're using Avatica in the context of Phoenix, you might be interested
in PHOENIX-3654 which is about adding a load balancer to the Phoenix Query
Server.

Thanks,
James

On Thu, May 25, 2017 at 12:02 PM, Gian Merlino <gi...@imply.io> wrote:

> Is anyone out there using Avatica with servers (that don't share connection
> state) behind load balancers? Is that a workable configuration? I'm
> guessing it might be if sticky sessions are enabled on the load balancer.
> What does the client do when the session switches to a new backend server?
>
> I found a blog post that talks about some of these issues in the context of
> Phoenix:
> https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html
>
> It seems to suggest that the client will retry queries and skip to the most
> recently read offset. Is that behavior on by default? This sounds like it
> won't work for a database that is accepting new data -- the query results
> aren't generally going to be exact matches from run to run just due to new
> rows being added. In that case, I'm struggling to think of any better
> approach than failing the query and expecting the user to retry if they
> want to.
>
> Gian
>