Posted to user@cassandra.apache.org by Jason Harvey <al...@gmail.com> on 2011/03/19 02:55:59 UTC

Optimizing a few nodes to handle all client connections?

Hola everyone,

I have been considering making a few nodes only manage 1 token and
entirely dedicating them to talking to clients. My reasoning behind
this is I don't like the idea of a node having a dual-duty of handling
data, and talking to all of the client stuff.

Is there any merit to this thought?

Cheers,
Jason

Re: Optimizing a few nodes to handle all client connections?

Posted by Edward Capriolo <ed...@gmail.com>.
On Fri, Mar 18, 2011 at 9:55 PM, Jason Harvey <al...@gmail.com> wrote:
> Hola everyone,
>
> I have been considering making a few nodes only manage 1 token and
> entirely dedicating them to talking to clients. My reasoning behind
> this is I don't like the idea of a node having a dual-duty of handling
> data, and talking to all of the client stuff.
>
> Is there any merit to this thought?
>
> Cheers,
> Jason
>

Technically possible but not recommended. Besides making this node a
single point of failure, you assuredly add more latency to every
request. Also, each request has memory overhead, and one node would carry
the sum of that overhead for all requests, which is not scalable. This
node can also become a bandwidth bottleneck.

One of the reasons to choose Cassandra is that it does NOT have a
master/queen node that all requests are proxied through.

Re: Optimizing a few nodes to handle all client connections?

Posted by aaron morton <aa...@thelastpickle.com>.
ah, my flippant comments at the end. 

Instead of "Single point of failure" perhaps I should have said "specialist nodes are a bad idea as they may reduce the overall availability of the cluster to the availability of any one subgroup." E.g. a cluster of 10 nodes, where 8 are data nodes and 2 are connection nodes, may be down for 100% of the keys after the loss of 2 nodes if they happen to be the connection nodes.
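
To put a number on that example, here is a minimal Python sketch (not from the thread). It enumerates every way 2 of the 10 nodes can fail and counts the cases where both connection nodes are gone. The node names and the helper `outage_pairs` are hypothetical, and it assumes clients can only reach the cluster through the connection nodes.

```python
from itertools import combinations

def outage_pairs(data_nodes, connection_nodes, failures=2):
    """Count failure sets that take out every connection node.

    Returns (blocking, total): how many failure sets leave clients with
    no node to connect to, out of all possible failure sets of that size.
    """
    nodes = data_nodes + connection_nodes
    total = 0
    blocking = 0
    for failed in combinations(nodes, failures):
        total += 1
        # Clients are cut off only if all connection nodes are in the failed set.
        if set(connection_nodes) <= set(failed):
            blocking += 1
    return blocking, total

# Hypothetical cluster from the example: 8 data nodes, 2 connection nodes.
data = [f"data{i}" for i in range(1, 9)]
conn = ["conn1", "conn2"]

blocking, total = outage_pairs(data, conn)
print(f"{blocking} of {total} two-node failures cut off all clients")
```

Only one of the 45 possible two-node failures is the bad one, but in that case every key is unreachable even though all 8 data nodes are still up.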

WRT the partitioner I was thinking of a situation such as 

node 1 : 33.3% of the ring
node 2 : 33.3% of the ring
node 3 : 33.3% of the ring
node 4 : 0.1% of the ring 

My point was that giving the node a small token range would not be enough to reduce its data load. If node 4 was a functioning node in the ring then at RF 3 it will be asked to be a replica for the data from nodes 2 and 3, unless the replica strategy excluded the node from the list of natural endpoints for all but the token range it was responsible for.
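
A rough Python sketch of that ring (not from the thread) makes the effect visible. It assumes a toy token space of 1000 tokens and a simple placement rule that walks the ring clockwise and takes the next RF nodes. The names `RING`, `natural_endpoints` and `replica_share` are made up for illustration and are not Cassandra APIs.

```python
from bisect import bisect_left

RING_SIZE = 1000  # hypothetical toy token space, not Cassandra's real one

# (token, node), sorted by token. Each node owns (previous token, own token].
# node4 sits one token after node3, so it owns 0.1% of the ring.
RING = [(0, "node4"), (333, "node1"), (666, "node2"), (999, "node3")]

def natural_endpoints(ring, key_token, rf):
    """Return the rf nodes found walking clockwise from key_token."""
    tokens = [t for t, _ in ring]
    start = bisect_left(tokens, key_token) % len(ring)  # wrap past the last token
    return [ring[(start + i) % len(ring)][1] for i in range(min(rf, len(ring)))]

def replica_share(ring, rf, ring_size=RING_SIZE):
    """Fraction of the token space each node holds a replica for."""
    held = {node: 0 for _, node in ring}
    prev = ring[-1][0] - ring_size  # token before the first one, wrapped around
    for token, _ in ring:
        width = token - prev  # number of tokens in this primary range
        for node in natural_endpoints(ring, token, rf):
            held[node] += width
        prev = token
    return {node: width / ring_size for node, width in held.items()}

owned = replica_share(RING, rf=1)   # primary ranges only
stored = replica_share(RING, rf=3)  # primary ranges plus replicas

for node in sorted(stored):
    print(f"{node}: owns {owned[node]:.1%}, stores {stored[node]:.1%}")
```

Under these assumptions node4 owns 0.1% of the ring but stores 66.7% of it at RF 3, because it is still the next node clockwise for the ranges of node2 and node3.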

Aaron


On 21 Mar 2011, at 10:28, Robert Coli wrote:

> On Sun, Mar 20, 2011 at 1:20 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> Even if the node is only
>> responsible for a small amount of the ring, it would normally still get data
>> handed to it and read from it as a replica. You would need to use a Replica
>> Placement Strategy that knew to ignore the "connection only" nodes.
>> IMHO it's a bad idea: Single point of failure, wasted compute resources,
>> imbalance between "connection" and "worker" nodes.
> 
> As I understand what is being proposed, the node could only be
> responsible for a single token, and presumably would perform very well
> indeed when reading or writing that token. I don't see why you would
> need to avoid placing a single token's worth of data on a node, or why
> it would become a single point of failure if you did.. is there
> something I'm missing.. ?
> 
> =Rob


Re: Optimizing a few nodes to handle all client connections?

Posted by Robert Coli <rc...@digg.com>.
On Sun, Mar 20, 2011 at 1:20 PM, aaron morton <aa...@thelastpickle.com> wrote:
> Even if the node is only
> responsible for a small amount of the ring, it would normally still get data
> handed to it and read from it as a replica. You would need to use a Replica
> Placement Strategy that knew to ignore the "connection only" nodes.
> IMHO it's a bad idea: Single point of failure, wasted compute resources,
> imbalance between "connection" and "worker" nodes.

As I understand what is being proposed, the node could only be
responsible for a single token, and presumably would perform very well
indeed when reading or writing that token. I don't see why you would
need to avoid placing a single token's worth of data on a node, or why
it would become a single point of failure if you did.. is there
something I'm missing.. ?

=Rob

Re: Optimizing a few nodes to handle all client connections?

Posted by aaron morton <aa...@thelastpickle.com>.
As Vijay says, look at the "fat client" contrib. Even if the node is only responsible for a small amount of the ring, it would normally still get data handed to it and read from it as a replica. You would need to use a Replica Placement Strategy that knew to ignore the "connection only" nodes.
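
As an illustration only (not from the thread, and not an existing Cassandra strategy), such a placement rule could look roughly like the Python sketch below. It walks a toy ring clockwise and skips any node flagged as connection-only. The ring layout, the node names and `endpoints_ignoring` are all hypothetical.

```python
# Hypothetical toy ring: (token, node). node4 is the "connection only" node.
RING = [(0, "node4"), (333, "node1"), (666, "node2"), (999, "node3")]

def endpoints_ignoring(ring, key_token, rf, connection_only):
    """Pick rf replicas clockwise from key_token, skipping connection-only nodes."""
    ordered = sorted(ring)
    # First node whose token is >= key_token; wrap to index 0 if there is none.
    start = next((i for i, (t, _) in enumerate(ordered) if t >= key_token), 0)
    picked = []
    for step in range(len(ordered)):
        node = ordered[(start + step) % len(ordered)][1]
        if node not in connection_only and node not in picked:
            picked.append(node)
        if len(picked) == rf:
            break
    return picked

print(endpoints_ignoring(RING, 500, 3, {"node4"}))
```

With this rule node4 never appears in a replica list, so it holds no data at all. The cluster still depends on it for client connections, so the availability and wasted-resource concerns are unchanged.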

IMHO it's a bad idea: Single point of failure, wasted compute resources, imbalance between "connection" and "worker" nodes. 

Aaron

On 19 Mar 2011, at 15:27, Vijay wrote:

> Are you saying you don't like the idea of the coordinator node being in the same ring? If so, have you looked at the Cassandra "fat client" in contrib?
> 
> Regards,
> </VJ>
> 
> 
> 
> On Fri, Mar 18, 2011 at 6:55 PM, Jason Harvey <al...@gmail.com> wrote:
> Hola everyone,
> 
> I have been considering making a few nodes only manage 1 token and
> entirely dedicating them to talking to clients. My reasoning behind
> this is I don't like the idea of a node having a dual-duty of handling
> data, and talking to all of the client stuff.
> 
> Is there any merit to this thought?
> 
> Cheers,
> Jason
> 


Re: Optimizing a few nodes to handle all client connections?

Posted by Vijay <vi...@gmail.com>.
Are you saying you don't like the idea of the coordinator node being in the
same ring? If so, have you looked at the Cassandra "fat client" in contrib?

Regards,
</VJ>



On Fri, Mar 18, 2011 at 6:55 PM, Jason Harvey <al...@gmail.com> wrote:

> Hola everyone,
>
> I have been considering making a few nodes only manage 1 token and
> entirely dedicating them to talking to clients. My reasoning behind
> this is I don't like the idea of a node having a dual-duty of handling
> data, and talking to all of the client stuff.
>
> Is there any merit to this thought?
>
> Cheers,
> Jason
>