Posted to user@couchdb.apache.org by howard chen <ho...@gmail.com> on 2009/03/14 11:20:58 UTC

Use couchdb as the data store for real-time chat room

Hi all,


We currently run a fairly large-scale online chat room, using MySQL and
Memcached as the data store. Chat room messages are refreshed via Ajax
every 2 to 10 seconds (adaptive, depending on the "chat rate").
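
A minimal sketch of how such an adaptive interval could be computed. The
thread does not say how the "chat rate" maps to a delay, so the linear
rule, the function name and the busy_rate threshold below are all
assumptions made for illustration only.

```python
def next_poll_interval(messages_per_minute: float,
                       min_interval: float = 2.0,
                       max_interval: float = 10.0,
                       busy_rate: float = 30.0) -> float:
    """Return the number of seconds to wait before the next Ajax poll.

    Hypothetical rule: a silent room polls every max_interval seconds, a
    room at or above busy_rate messages per minute polls every
    min_interval seconds, and rates in between are interpolated linearly.
    """
    if messages_per_minute < 0:
        raise ValueError("messages_per_minute must be non-negative")
    # Fraction of the "busy" threshold reached, capped at 1.0.
    load = min(messages_per_minute / busy_rate, 1.0)
    return max_interval - load * (max_interval - min_interval)
```

The server could return this value with each response so that the client
schedules its next request accordingly.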

Each user message is stored in MySQL as a row containing fields such as
{ User_Name, Message, Add_Time }. Users send a request to the server with
{ Last_Time }, the time at which they last fetched messages, and all rows
with an Add_Time after Last_Time are returned as JSON.


For performance reasons, I added Memcached in front to do some caching.
The system currently runs quite well, except that MySQL becomes the
bottleneck as the chat room gets more and more traffic.


Now I am considering trying CouchDB, but before that, maybe you guys
can comment on my usage to see whether it is feasible.


1. I know CouchDB is good for simple key-value queries, but is it fast
for range requests? E.g. I want to get all messages between Add_Time
and Last_Time. How does it compare with a MySQL range query on an
index?

2. Is Memcached useless if I am using couchdb?



Thanks.

Re: Use couchdb as the data store for real-time chat room

Posted by Troy Kruthoff <tk...@gmail.com>.
If performance and scalability are important you'll:

1) use xmpp as the backend
2) use evented anything over apache/fork/php (I recommend Twisted).   
You can still serve all the html, etc, with php, but hit the twisted  
server with the ajax long poll.

If you google around, you'll find the consensus of many before you.

-- troy



On Mar 14, 2009, at 5:36 AM, howard chen wrote:

> Hi,
>
> On Sat, Mar 14, 2009 at 8:20 PM, Sven Helmberger <sv...@gmx.de> wrote:
>
>> This sounds like it could be the main performance problem. I would expect
>> some form of COMET or even the long request pattern to improve your
>> performance.
>>
>
>
> Yes, but we are serving many concurrent users, so I don't want to hold
> a persistent connection from the server to each client (we are using
> Apache2 prefork with mod_php).
>
> Since our polling interval is adaptive, it is currently quite easy to
> scale by adding servers.
>
> Howard


Re: Use couchdb as the data store for real-time chat room

Posted by Patrick Aljord <pa...@gmail.com>.
On Sat, Mar 14, 2009 at 8:11 AM, Damien Katz <da...@apache.org> wrote:

> We are planning to provide a native COMET mechanism, so clients can be
> notified as soon as documents they are interested in are updated or created.
> The enhancements are for selective and near real-time replication features
> we are planning, but they will be useful for things like this too.
>
> -Damien
>

That sounds great. Is it planned for 1.0 or post-1.0?

Re: Use couchdb as the data store for real-time chat room

Posted by Damien Katz <da...@apache.org>.
Erlang's speciality is lots of concurrent users.

We are planning to provide a native COMET mechanism, so clients can be
notified as soon as documents they are interested in are updated or
created. The enhancements are for selective and near real-time
replication features we are planning, but they will be useful for
things like this too.

-Damien

On Mar 14, 2009, at 8:36 AM, howard chen wrote:

> Hi,
>
> On Sat, Mar 14, 2009 at 8:20 PM, Sven Helmberger <sv...@gmx.de> wrote:
>
>> This sounds like it could be the main performance problem. I would expect
>> some form of COMET or even the long request pattern to improve your
>> performance.
>>
>
>
> Yes, but we are serving many concurrent users, so I don't want to hold
> a persistent connection from the server to each client (we are using
> Apache2 prefork with mod_php).
>
> Since our polling interval is adaptive, it is currently quite easy to
> scale by adding servers.
>
> Howard


Re: Use couchdb as the data store for real-time chat room

Posted by howard chen <ho...@gmail.com>.
Hi,

On Sat, Mar 14, 2009 at 8:20 PM, Sven Helmberger <sv...@gmx.de> wrote:

> This sounds like it could be the main performance problem. I would expect
> some form of COMET or even the long request pattern to improve your
> performance.
>


Yes, but we are serving many concurrent users, so I don't want to hold
a persistent connection from the server to each client (we are using
Apache2 prefork with mod_php).

Since our polling interval is adaptive, it is currently quite easy to
scale by adding servers.

Howard

Re: Use couchdb as the data store for real-time chat room

Posted by Sven Helmberger <sv...@gmx.de>.
howard chen wrote:
> Hi all,
> 
> 
> We currently run a fairly large-scale online chat room, using MySQL and
> Memcached as the data store. Chat room messages are refreshed via Ajax
> every 2 to 10 seconds (adaptive, depending on the "chat rate").
> 

Does that mean you use AJAX polling with a user configurable interval of 
2 to 10 seconds to drive the chat?

This sounds like it could be the main performance problem. I would 
expect some form of COMET or even the long request pattern to improve 
your performance.

Regards,
Sven Helmberger

Re: Use couchdb as the data store for real-time chat room

Posted by Jan Lehnardt <ja...@apache.org>.
On 14 Mar 2009, at 12:32, howard chen wrote:

> Hi,
>
> On Sat, Mar 14, 2009 at 6:57 PM, Jason Davies  
> <ja...@jasondavies.com> wrote:
>> CouchDB gives you a lot more for
>> free (flexible schema, replication and in general designed for high
>> scalability) so I would recommend giving it a try.
>>
>
> Sure will have a try later on.
>
>
>>> 2. Is Memcached useless if I am using couchdb?
>>
>> I'm guessing memcached would still be useful as a caching layer to squeeze
>> that extra bit of speed if you have the RAM, but maybe someone else can
>> speak from experience.
>>
>
> The reason I ask is that CouchDB outputs JSON, so the ideal case for
> scalability is to let users contact CouchDB directly. Adding a Memcached
> cache layer that stores data from the HTTP responses seems like too much
> overhead, as Memcached is also a key-value store.
>
> I don't know whether CouchDB has some kind of query cache, or how to
> tune it. Oh, maybe I should use Squid?

CouchDB uses the filesystem buffer as a cache. CouchDB's b-tree storage
is a fairly thin layer on top of the filesystem and you should get
decent caching characteristics out of the box. Of course, no requests
are better than cached requests. View results come with HTTP ETags that
allow you to find out if a client would need to re-fetch an item.
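
A rough sketch of how a client could use those ETags, assuming the
standard HTTP revalidation flow (send If-None-Match, treat 304 Not
Modified as "reuse what you have"). The ETagCache class and the fetch
callable are hypothetical stand-ins for whatever HTTP client is in use,
and the header lookup assumes the client reports the header as "ETag".

```python
from typing import Callable, Dict, Tuple

# fetch(url, request_headers) -> (status, response_headers, body).
# Hypothetical signature; wrap any real HTTP client to match it.
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


class ETagCache:
    """Remember the last ETag and body per URL, revalidating on each get."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Ask the server to answer 304 if the resource is unchanged.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # unchanged on the server, reuse stored body
        etag = response_headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body
```

A 304 response carries no body, so a poll that finds no new messages
costs only the request and the response headers.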

Squid* and Varnish do file caching as well, where they might be faster
than CouchDB, but probably not by orders of magnitude. A memory-based
HTTP cache might help you (nginx & memcache), but I think you can go a
long way with pure CouchDB and HTTP caching alone.

(* There is an anecdote on the web where Squid tries to outsmart the
filesystem buffer cache and ends up doing two disk-to-memory operations
where one would do; I can't seem to find the reference right now.)

Cheers
Jan
--


Re: Use couchdb as the data store for real-time chat room

Posted by howard chen <ho...@gmail.com>.
Hi,

On Sat, Mar 14, 2009 at 6:57 PM, Jason Davies <ja...@jasondavies.com> wrote:
> CouchDB gives you a lot more for
> free (flexible schema, replication and in general designed for high
> scalability) so I would recommend giving it a try.
>

Sure will have a try later on.


>> 2. Is Memcached useless if I am using couchdb?
>
> I'm guessing memcached would still be useful as a caching layer to squeeze
> that extra bit of speed if you have the RAM, but maybe someone else can
> speak from experience.
>

The reason I ask is that CouchDB outputs JSON, so the ideal case for
scalability is to let users contact CouchDB directly. Adding a Memcached
cache layer that stores data from the HTTP responses seems like too much
overhead, as Memcached is also a key-value store.

I don't know whether CouchDB has some kind of query cache, or how to
tune it. Oh, maybe I should use Squid?

Howard

Re: Use couchdb as the data store for real-time chat room

Posted by Jason Davies <ja...@jasondavies.com>.
Hi Howard,

On 14 Mar 2009, at 10:20, howard chen wrote:

> 1. I know CouchDB is good for simple key-value queries, but is it fast
> for range requests? E.g. I want to get all messages between Add_Time
> and Last_Time. How does it compare with a MySQL range query on an
> index?

Yes, CouchDB is extremely fast for range requests. In fact, this is
CouchDB's sweet spot. Essentially, CouchDB's views allow you to
create a B-Tree index (mapping keys to values) and then retrieve
values from a range of keys (in a linear keyspace) very quickly.
So, this would be ideal for range requests on documents indexed by
time, like in your case. I don't know exactly how such queries compare
with the equivalent on a MySQL index, but they should certainly be
comparable, and CouchDB gives you a lot more for free (flexible
schema, replication and in general designed for high scalability) so I
would recommend giving it a try.
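
As a sketch of what that could look like for the chat case: a design
document with one view keyed by message time, plus a helper that builds
the range query URL from the startkey and endkey view parameters. The
database name, the design document and view names ("chat", "by_time"),
the document fields and the numeric timestamps are all assumptions for
illustration, and the URL layout (_design/<name>/_view/<view>) is the one
used by recent CouchDB releases.

```python
import json
from urllib.parse import urlencode

# Hypothetical design document. CouchDB map functions are written in
# JavaScript; each emit(key, value) call adds one row to the view index.
DESIGN_DOC = {
    "_id": "_design/chat",
    "views": {
        "by_time": {
            "map": "function(doc) { if (doc.add_time) { emit(doc.add_time, "
                   "{user: doc.user_name, message: doc.message}); } }"
        }
    },
}


def range_query_url(base_url, db, last_time, now):
    """Build the view URL for rows with last_time <= key <= now.

    View keys are JSON values, so both bounds are JSON-encoded before
    being placed in the query string.
    """
    params = urlencode({
        "startkey": json.dumps(last_time),
        "endkey": json.dumps(now),
    })
    return f"{base_url}/{db}/_design/chat/_view/by_time?{params}"
```

Both bounds are inclusive by default, so a client that sends its
Last_Time as startkey should expect to see the last message it already
has and skip it.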

> 2. Is Memcached useless if I am using couchdb?

I'm guessing memcached would still be useful as a caching layer to  
squeeze that extra bit of speed if you have the RAM, but maybe someone  
else can speak from experience.

--
Jason Davies

www.jasondavies.com