Posted to user@cassandra.apache.org by Mark <st...@gmail.com> on 2010/08/28 19:28:38 UTC

Cassandra & HAProxy

I will be load balancing between nodes using HAProxy. Is this recommended?

Also, is there some sort of ping/health check URI available?

Thanks

Re: Cassandra & HAProxy

Posted by Joe Stump <jo...@joestump.net>.
On Aug 28, 2010, at 12:29 PM, Mark wrote:

> Also, what would be a good way of monitoring the health of the cluster?

We use Ganglia. I believe failover is usually built into clients, though I'm not sure why HAProxy or LVS wouldn't also be a good option; I used to use HAProxy with MySQL slaves with much success.

--Joe


Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
Munin is the simplest thing.  There are numerous JMX stats of interest.

Cassandra is a symmetric distributed system, so you should not expect
to monitor it the way you would a web server.  Intelligent clients use
connection pools and react to current node behavior when choosing
where to send requests, including calling describe_ring to discover
nodes and open new connections as needed.
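
For illustration, that discovery step looks roughly like this against the
Thrift API from Python (a minimal sketch: the seed host, port, and
'MyKeyspace' are placeholders, and the import path of the generated
bindings varies by Cassandra version):

  from thrift.transport import TSocket, TTransport
  from thrift.protocol import TBinaryProtocol
  from cassandra import Cassandra  # Thrift-generated bindings

  # Connect to any one live node as a seed for ring discovery.
  socket = TSocket.TSocket('10.0.0.1', 9160)
  transport = TTransport.TBufferedTransport(socket)
  client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
  transport.open()

  # describe_ring returns one TokenRange per range in the ring; each
  # names the endpoints (node addresses) that own that range.
  endpoints = set()
  for token_range in client.describe_ring('MyKeyspace'):
      endpoints.update(token_range.endpoints)
  transport.close()

A client can then open direct connections to every address in endpoints
and refresh that view periodically or after errors.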

On Sat, Aug 28, 2010 at 11:29 AM, Mark <st...@gmail.com> wrote:
> [snip]
> Also, what would be a good way of monitoring the health of the cluster?
>

Re: Cassandra & HAProxy

Posted by Mark <st...@gmail.com>.
  On 8/28/10 11:20 AM, Benjamin Black wrote:
> no and no.
>
> On Sat, Aug 28, 2010 at 10:28 AM, Mark <st...@gmail.com> wrote:
>> I will be load balancing between nodes using HAProxy. Is this recommended?
>>
>> Also, is there some sort of ping/health check URI available?
>>
>> Thanks
>>
Also, what would be a good way of monitoring the health of the cluster?

Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
Because you create a bottleneck at the HAProxy and because the
presence of the proxy precludes clients properly backing off from
nodes returning errors.  The proper approach is to have clients
maintain connection pools with connections to multiple nodes in the
cluster, and then to spread requests across those connections.  Should
a node begin returning errors (for example, because it is overloaded),
clients can remove it from rotation.
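
A minimal sketch of that client-side approach in Python (all names here
are illustrative; a real client also needs reconnects, timeouts, and a
way to return recovered nodes to rotation):

  import random

  class NodePool:
      def __init__(self, nodes, max_errors=3):
          self.nodes = list(nodes)       # e.g. ['10.0.0.1:9160', ...]
          self.error_counts = dict((n, 0) for n in self.nodes)
          self.max_errors = max_errors

      def pick(self):
          # Spread requests across every node still in rotation.
          return random.choice(self.nodes)

      def report_error(self, node):
          # Back off from a node that keeps returning errors.
          self.error_counts[node] += 1
          if self.error_counts[node] >= self.max_errors and len(self.nodes) > 1:
              self.nodes.remove(node)    # overloaded or failing: drop it

A proxy in the middle hides exactly the signal report_error depends on:
it sees connections, not errors.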

On Sat, Aug 28, 2010 at 11:27 AM, Mark <st...@gmail.com> wrote:
> [snip]
> Any reason why load balancing client connections using HAProxy isn't
> recommended?
>

Re: Cassandra & HAProxy

Posted by Mark <st...@gmail.com>.
  On 8/28/10 2:44 PM, Benjamin Black wrote:
> No, I took the question to be about client connections.
> [snip]
Yes, it was in reference to client connections. Instead of clients 
sending requests to individual nodes, they would send them to HAProxy. 
FYI, we are using Ruby, and our client is the Cassandra gem, which I 
think you may know about :)

Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
<an...@alumni.caltech.edu> wrote:
> If one machine is misbehaving, it tends to fail pretty quickly, at which
> point all the haproxies drop it (we have an haproxy on every client node,
> so it acts like a connection pooling mechanism for the client).

Cool.  Except this is not at all how most people use HAProxy (and I'd
be very surprised if the OP had this configuration in mind).  As you
say, you are using it per client as a connection pool (which I do
advocate, along with using languages that don't require this sort of
hack), rather than as a service proxy on the Cassandra side (which I
don't advocate).


b

Re: Cassandra & HAProxy

Posted by Ming Fang <mi...@mac.com>.

Sent from my iPhone

On Aug 29, 2010, at 3:20 PM, Benjamin Black <b...@b3k.us> wrote:

> [snip]

Re: Cassandra & HAProxy

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Aug 30, 2010 at 1:02 PM, Dave Viner <da...@pobox.com> wrote:
> Hi Edward,
> By "down hard", I assume you mean that the machine is no longer responding
> on the cassandra thrift port. [snip]

Correct. I see two basic approaches for this. One is for your proxy to
know how to speak cassandra+thrift and have some intelligence such as
"I got an exception" or "the request took too long", marking the node
as failed.

The other is to have something external mark nodes as dead. This is
something that eddie (http://eddie.sourceforge.net/lbdns.html) does:
if (bad node) { remove from dns }.

In our deployment, we have added some extra intelligence to hector to
dodge compacting nodes, etc.
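
For the first approach, the probe itself can be small. Here is a sketch
in Python against the Thrift-generated bindings (the 500 ms budget, host,
and port are assumptions, and the import path varies by Cassandra version;
describe_version is just a cheap call that exercises the full Thrift path):

  import time
  from thrift.transport import TSocket, TTransport
  from thrift.protocol import TBinaryProtocol
  from cassandra import Cassandra  # Thrift-generated bindings

  def node_is_healthy(host, port=9160, max_latency=0.5):
      # A bare TCP connect is not enough; issue a real Thrift call and
      # treat errors *or* slowness as failure.
      try:
          socket = TSocket.TSocket(host, port)
          socket.setTimeout(int(max_latency * 1000))  # milliseconds
          transport = TTransport.TBufferedTransport(socket)
          client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
          transport.open()
          start = time.time()
          client.describe_version()   # cheap request/response round trip
          elapsed = time.time() - start
          transport.close()
          return elapsed <= max_latency
      except Exception:
          return False  # connect failure, Thrift error, or timeout

Something like this could equally feed the second approach: a cron job or
eddie-style DNS updater calls it and pulls unhealthy nodes out of whatever
rotation clients consume.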

Re: Cassandra & HAProxy

Posted by Dave Viner <da...@pobox.com>.
Hi Edward,

By "down hard", I assume you mean that the machine is no longer responding
on the cassandra thrift port.  That makes sense (and in fact is what I'm
doing currently).  But, it seems like the real improvement is something that
would allow for a simple monitor that goes beyond the simple "machine not
reachable" issue and covers more common scenarios that temporarily impact
service time, but aren't so drastic as to cause machine outage.

Dave Viner


On Mon, Aug 30, 2010 at 9:52 AM, Edward Capriolo <ed...@gmail.com> wrote:

> [snip]
> Any proxy with a TCP health check should be able to determine whether
> the Cassandra service is down hard. The problem for tools that are not
> cassandra-protocol-aware is detecting slowness or other anomalies like
> TimedOut exceptions.
>
> If you are seeing GC storms during compactions, you might have rows
> that are too big: when compaction hits them, memory spikes. I lowered
> the compaction priority (and added more nodes), which has helped
> compaction back off, leaving some IO for requests.
>

Re: Cassandra & HAProxy

Posted by Edward Capriolo <ed...@gmail.com>.
On Mon, Aug 30, 2010 at 12:40 PM, Dave Viner <da...@pobox.com> wrote:
> FWIW - we've been using HAProxy in front of a cassandra cluster in
> production and haven't run into any problems yet. [snip]

Any proxy with a TCP health check should be able to determine whether the
Cassandra service is down hard. The problem for tools that are not
cassandra-protocol-aware is detecting slowness or other anomalies like
TimedOut exceptions.

If you are seeing GC storms during compactions, you might have rows that
are too big: when compaction hits them, memory spikes. I lowered the
compaction priority (and added more nodes), which has helped compaction
back off, leaving some IO for requests.

Re: Cassandra & HAProxy

Posted by Dave Viner <da...@pobox.com>.
FWIW - we've been using HAProxy in front of a cassandra cluster in
production and haven't run into any problems yet.  It sounds like our
cluster is tiny in comparison to Anthony M's, but I just wanted to
mention that others out there are doing the same.

One thing in this thread that I thought was interesting is Ben's initial
comment that "the presence of the proxy precludes clients properly backing
off from nodes returning errors."  I think it would be very cool if someone
implemented a mechanism for haproxy to detect the error nodes and drop
them from the rotation.  I'd be happy to help with this, as I know how it
works with haproxy and standard web servers or other tcp servers.  But I'm
not sure how to make it work with Cassandra since, as Ben points out, it
can return valid tcp responses (that say "error-condition") on the
standard port.

Dave Viner


On Sun, Aug 29, 2010 at 4:48 PM, Anthony Molinaro
<anthonym@alumni.caltech.edu> wrote:

> [snip]

Re: Cassandra & HAProxy

Posted by Anthony Molinaro <an...@alumni.caltech.edu>.
On Sun, Aug 29, 2010 at 12:20:10PM -0700, Benjamin Black wrote:
> On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
> <an...@alumni.caltech.edu> wrote:
> >
> >
> > I don't know; it seems to tax our setup of 39 extra-large EC2 nodes. It's
> > also closer to 24,000 reqs/sec at peak, since there are different tables
> > (2 tables for each read and 2 for each write).
> >
> 
> Could you clarify what you mean here?  On the face of it, this
> performance seems really poor given the number and size of nodes.

As you say, I would expect much better performance given the node size,
but if you go back and look through some of the issues we've seen over
time, you'll find we've been hit with nodes being too small, having too
few nodes to deal with request volume, OOMs, bad sstables, the ring
appearing different to different nodes, and several other problems.

Many of the I/O problems presented themselves as MessageDeserializer pool
backups (although we stopped having those since Jonathan was by and
suggested a row cache of about 1 GB, thanks Riptano!).  We currently have
mystery OOMs which are probably caused by GC storms during compactions
(although usually the nodes restart and compact fine, so who knows).  I
also regularly watch nodes go away for 30 seconds or so (logs show a node
going dead, then coming back to life a few seconds later).

I've sort of given up worrying about these, as we are in the process of
moving this cluster to our own machines in a colo, so I figure I should
wait until they are moved and see how the new machines do before I worry
more about performance.

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <an...@alumni.caltech.edu>

Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
On Sun, Aug 29, 2010 at 11:04 AM, Anthony Molinaro
<an...@alumni.caltech.edu> wrote:
>
>
> I don't know; it seems to tax our setup of 39 extra-large EC2 nodes. It's
> also closer to 24,000 reqs/sec at peak, since there are different tables
> (2 tables for each read and 2 for each write).
>

Could you clarify what you mean here?  On the face of it, this
performance seems really poor given the number and size of nodes.


b

Re: Cassandra & HAProxy

Posted by Anthony Molinaro <an...@alumni.caltech.edu>.
On Sat, Aug 28, 2010 at 02:44:41PM -0700, Benjamin Black wrote:
> On Sat, Aug 28, 2010 at 2:34 PM, Anthony Molinaro
> <an...@alumni.caltech.edu> wrote:
> > I think maybe he thought you meant putting a layer in the middle of
> > cassandra's internal communication.
> 
> No, I took the question to be about client connections.

Sorry, I didn't mean to put words in your mouth.

> > There's no problem balancing client connections with
> > haproxy, we've been pushing several billion requests per month through
> > haproxy to cassandra.
> >
> 
> Can it be done: yes.  Is it best practice: no.  Even 10 billion
> requests/month is an average of less than 4000 reqs/sec.   Just not
> that many for a distributed database like Cassandra.

I don't know; it seems to tax our setup of 39 extra-large EC2 nodes. It's
also closer to 24,000 reqs/sec at peak, since there are different tables
(2 tables for each read and 2 for each write).

> Cassandra can, and does, fail in ways that do not stop it from
> answering TCP connection requests. [snip]


The haproxy does seem sufficient for us.  We've been running cassandra in
production since 0.3.0 and have seen just about every possible failure.
For the most part it has worked.  I'm not saying it's the most efficient,
just that it will work for most people's usage.  All the writes to this
cluster are via php, which creates a connection for each request, so a
connection check works fine in this case.  We attempt to pool connections
via java for reads, but those clients reconnect whenever they receive an
error.

If one machine is misbehaving, it tends to fail pretty quickly, at which
point all the haproxies drop it (we have an haproxy on every client node,
so it acts like a connection pooling mechanism for the client).
describe_ring is a newish call; it didn't exist when we wrote our systems,
and we have not had a chance to revisit.  So while, yes, there are problems
with using an haproxy, they are not insurmountable, and it would probably
work for many use cases.  But like everything, YMMV.
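
For what it's worth, the client side of that local-proxy pattern is just a
connection to 127.0.0.1 (a sketch; the port matches the haproxy config
quoted earlier, everything else is illustrative):

  from thrift.transport import TSocket, TTransport
  from thrift.protocol import TBinaryProtocol
  from cassandra import Cassandra  # Thrift-generated bindings

  # Each client box runs its own haproxy on 127.0.0.1:12350; the proxy
  # spreads these connections across the cluster and drops dead nodes.
  socket = TSocket.TSocket('127.0.0.1', 12350)
  transport = TTransport.TBufferedTransport(socket)
  client = Cassandra.Client(TBinaryProtocol.TBinaryProtocol(transport))
  transport.open()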

-Anthony

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <an...@alumni.caltech.edu>

Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
On Sat, Aug 28, 2010 at 2:34 PM, Anthony Molinaro
<an...@alumni.caltech.edu> wrote:
> I think maybe he thought you meant putting a layer in the middle of
> cassandra's internal communication.

No, I took the question to be about client connections.

> There's no problem balancing client connections with
> haproxy, we've been pushing several billion requests per month through
> haproxy to cassandra.
>

Can it be done: yes.  Is it best practice: no.  Even 10 billion
requests/month averages out to under 4,000 reqs/sec (10^10 requests over
the roughly 2.6 million seconds in a month).  Just not that many for a
distributed database like Cassandra.

> we use
>
>  mode tcp
>  balance leastconn
>  server local 127.0.0.1:12350 check
>
> so basically just a connect based check, and it works fine
>

Cassandra can, and does, fail in ways that do not stop it from
answering TCP connection requests.  Are you saying it works fine
because you have seen numerous types of node failures and this was
sufficient?  I would be quite surprised if that were so.  Using an LB
for service discovery is a fine thing (connect to a VIP, call
describe_ring, open direct connections to cluster nodes).  Relying on
an LB to do the right thing when it is totally ignorant of what is
going across those client connections (as is implied by simply
checking for connectivity) is asking for trouble.  Doubly so when you
use a leastconn policy (a failing node can spit out an error and close
a connection with impressive speed, sucking all the traffic to itself;
common problem with HTTP servers giving back errors).


b

Re: Cassandra & HAProxy

Posted by Anthony Molinaro <an...@alumni.caltech.edu>.
I think maybe he thought you meant putting a layer in the middle of
cassandra's internal communication.  There's no problem balancing client
connections with haproxy; we've been pushing several billion requests per
month through haproxy to cassandra.

we use

  mode tcp
  balance leastconn
  server local 127.0.0.1:12350 check

so it's basically just a connect-based check, and it works fine.

-Anthony

On Sat, Aug 28, 2010 at 11:27:26AM -0700, Mark wrote:
> [snip]
> Any reason why load balancing client connections using HAProxy isn't
> recommended?

-- 
------------------------------------------------------------------------
Anthony Molinaro                           <an...@alumni.caltech.edu>

Re: Cassandra & HAProxy

Posted by Mark <st...@gmail.com>.
  On 8/28/10 11:20 AM, Benjamin Black wrote:
> no and no.
>
> On Sat, Aug 28, 2010 at 10:28 AM, Mark <st...@gmail.com> wrote:
>> I will be load balancing between nodes using HAProxy. Is this recommended?
>>
>> Also, is there some sort of ping/health check URI available?
>>
>> Thanks
>>
Any reason why load balancing client connections using HAProxy isn't
recommended?

Re: Cassandra & HAProxy

Posted by Benjamin Black <b...@b3k.us>.
no and no.

On Sat, Aug 28, 2010 at 10:28 AM, Mark <st...@gmail.com> wrote:
> I will be load balancing between nodes using HAProxy. Is this recommended?
>
> Also, is there some sort of ping/health check URI available?
>
> Thanks
>