Posted to dev@httpd.apache.org by Rodent of Unusual Size <Ke...@Golux.Com> on 2000/02/08 13:39:17 UTC

'Apache on {Beowolf,clusters}'

I'm getting more and more requests about whether/how Apache
will run on clusters.  It's becoming a FAQ.  Does anyone have
any documentation on using Apache in such configurations, or
on performance?  Dean? :-)
-- 
#ken    P-)}

Ken Coar                    <http://Golux.Com/coar/>
Apache Software Foundation  <http://www.apache.org/>
"Apache Server for Dummies" <http://Apache-Server.Com/>

Come to the first official Apache Software Foundation
Conference!  <http://ApacheCon.Com/>

Re: 'Apache on {Beowolf,clusters}'

Posted by th...@cnation.com.
"Kevin A. Burton" wrote:
> 
> Bill Stoddard wrote:
> >
> > > Rodent of Unusual Size wrote:
> > > >
> > > > I'm getting more and more requests about whether/how Apache
> > > > will run on clusters.  It's becoming a FAQ.  Does anyone have
> > > > any documentation on using Apache in such configurations, or
> > > > on performance?  Dean? :-)
> > >
> > > Why would anyone want to do this?  Why not put a ton of Apache boxes
> > > behind a LocalDirector or any other of the billions of load balancers?
> > >
> > I had the same question.  I don't know anything about Beowulf clustering,
> > but if you could establish a fast comm link (at bus speeds for example)
> > between the CPUs in the cluster, seems you could build some really scalable
> > web server farms (with appropriate modifications to Apache to exploit the
> > cluster).  Has anyone ever considered putting a CPU and some RAM on a PCI
> > card, installing Linux and Apache? Seems you could scale this quite nicely.
> > The PCI card Apache (with a kernel level cache?) could serve the static
> > content and off-load dynamic content to a back-end server. Or vice versa.
> > The combinations are limitless. I'm surprised no one has turned this into a
> > business :-)
> >
> > Bill
> 
> My gut reaction is that this wouldn't help.  HTTP servers are usually
> not CPU intensive.  Taking a file and putting it through a socket isn't
> CPU bound.  When you do see an HTTP server at 100% CPU, it is because it
> has 5000 users.
> 
> There is no way I can see of scaling this across more than, say, 2 to 4
> CPUs.


I also suspect that the Beowulf management nodes would become
bottlenecks. Distributing traffic with DNS round-robin would be far
more efficient.
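
To make the round-robin suggestion concrete, here is a minimal Python
sketch of how rotating through a list of back-end addresses spreads
requests evenly. The addresses and the function names (round_robin,
distribute) are made up for illustration. A real round-robin DNS setup
rotates the A records at the name server, and resolver caching makes
the real distribution less even than this idealised model.

```python
from collections import Counter
from itertools import cycle

# Hypothetical back-end addresses (documentation range, RFC 5737).
BACKENDS = ["192.0.2.10", "192.0.2.11", "192.0.2.12"]


def round_robin(backends):
    """Return an endless iterator that hands out back-ends in rotation."""
    return cycle(backends)


def distribute(n_requests, backends):
    """Count how many of n_requests land on each back-end."""
    rotation = round_robin(backends)
    return Counter(next(rotation) for _ in range(n_requests))
```

Calling distribute(9000, BACKENDS) assigns the same number of requests
to each of the three hosts, with no management node in the request path.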

LVS (www.linuxvirtualserver.org) seems to be the clustering solution
most people who ask about Beowulf and Apache would really want. I don't
think most of them really understand that Beowulf is for computationally
intensive applications and is not really designed to address network and
file system availability issues.

LocalDirector and other hardware solutions are probably also a much
better idea.

thornton

Re: 'Apache on {Beowolf,clusters}'

Posted by "Kevin A. Burton" <bu...@relativity.yi.org>.
Bill Stoddard wrote:
> 
> > Rodent of Unusual Size wrote:
> > >
> > > I'm getting more and more requests about whether/how Apache
> > > will run on clusters.  It's becoming a FAQ.  Does anyone have
> > > any documentation on using Apache in such configurations, or
> > > on performance?  Dean? :-)
> >
> > Why would anyone want to do this?  Why not put a ton of Apache boxes
> > behind a LocalDirector or any other of the billions of load balancers?
> >
> I had the same question.  I don't know anything about Beowulf clustering,
> but if you could establish a fast comm link (at bus speeds for example)
> between the CPUs in the cluster, seems you could build some really scalable
> web server farms (with appropriate modifications to Apache to exploit the
> cluster).  Has anyone ever considered putting a CPU and some RAM on a PCI
> card, installing Linux and Apache? Seems you could scale this quite nicely.
> The PCI card Apache (with a kernel level cache?) could serve the static
> content and off-load dynamic content to a back-end server. Or vice versa.
> The combinations are limitless. I'm surprised no one has turned this into a
> business :-)
> 
> Bill

My gut reaction is that this wouldn't help.  HTTP servers are usually
not CPU intensive.  Taking a file and putting it through a socket isn't
CPU bound.  When you do see an HTTP server at 100% CPU, it is because it
has 5000 users.

There is no way I can see of scaling this across more than, say, 2 to 4
CPUs.

Kevin

-- 
Kevin A Burton
Senior Software Engineer
Kendara Inc
http://www.kendara.com
Mobile:  408-910-6145
Linux - The revolution will NOT be televised

Re: 'Apache on {Beowolf,clusters}'

Posted by Ben Laurie <be...@algroup.co.uk>.
Dean Gaudet wrote:
> in our architecture at CP we do the SSL and layer-3 load balancing with
> dedicated "proxy" boxes using code we wrote.

Is that code available?

Cheers,

Ben.

--
SECURE HOSTING AT THE BUNKER! http://www.thebunker.net/hosting.htm

http://www.apache-ssl.org/ben.html

Y19100 no-prize winner!
http://www.ntk.net/index.cgi?back=2000/now0121.txt

Re: 'Apache on {Beowolf,clusters}'

Posted by Dean Gaudet <dg...@arctic.org>.
a CPU and some RAM on a PCI card would be about the same as an extra
CPU... or a couple extra CPUs in an SMP configuration.  except that it'd
be NUMA (non-uniform memory access).  you'd need some fancy
arbitration to get packets to the CPU-on-PCI... and more arbitration to
share disk i/o.  it'd be as difficult or worse to implement than SMP
systems.

SMP is frequently a waste on webservers... depends on the application
though.  if you're doing ssl, the extra cpu can really help.  if you're
just doing static stuff, the extra overhead of smp is a waste -- might as
well use a layer-3 switch and load balance between a couple disjoint
machines to crank up the i/o bandwidth.

i'm not sure what typical web apps would gain from a beowulf-style
cluster.

search engines, sure.  see hotbot for a good example of a clustered search
engine.  in this case though there's a reason for the cluster architecture
-- the index of web pages is too large to hold in the memory of a single
node.  they use commodity hardware, which means 2 GB of RAM or so per node
these days.  the index is on the order of 100 GB?  maybe 200?  i'm not sure
any more.  the nodes use myrinet to communicate with each other to share
memory and disk access.
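
as a back-of-the-envelope check on those figures, the small Python
sketch below computes how many nodes it takes to hold an index in RAM.
the function name nodes_needed is made up for illustration, and the
model is deliberately naive: it assumes every byte of a node's RAM is
available for the index and ignores replication and OS overhead.

```python
import math


def nodes_needed(index_gb, ram_per_node_gb):
    """Smallest number of nodes whose combined RAM can hold the index."""
    if ram_per_node_gb <= 0:
        raise ValueError("ram_per_node_gb must be positive")
    return math.ceil(index_gb / ram_per_node_gb)
```

with roughly 2 GB per node, an index of 100 to 200 GB works out to
somewhere between 50 and 100 nodes, which is why a single machine is
not an option for this workload.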

as an exercise, consider what it'd take to write a PVM mpm for apache-2.0.  
(see pvm.org i think).  the challenges will be -- what host do the
requests come into?  how do you handle failures?  how do you farm the
requests off to other hosts? pretty soon you'll start thinking about
tcp-to-tcp tunnelling.. and i don't think you'll find PVM really buys you
much at all.  at some point your architecture will start to overlap with
the "http router" concepts we've started talking about here.
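
to show where that exercise tends to end up, here is a minimal Python
sketch of tcp-to-tcp tunnelling: one front host accepts connections and
relays each one to a back-end chosen in rotation. this is not PVM and
not an Apache MPM. the names pipe, handle and serve are hypothetical,
and failure handling is reduced to dropping the client when a back-end
cannot be reached.

```python
import socket
import threading


def pipe(src, dst):
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass  # either side went away; stop copying
    finally:
        try:
            dst.shutdown(socket.SHUT_WR)  # pass the end-of-stream along
        except OSError:
            pass  # dst was already closed


def handle(client, backend_addr):
    """Tunnel one client connection to one back-end."""
    try:
        backend = socket.create_connection(backend_addr)
    except OSError:
        client.close()  # no failover in this sketch
        return
    upstream = threading.Thread(target=pipe, args=(client, backend), daemon=True)
    upstream.start()
    pipe(backend, client)  # relay the response in this thread
    upstream.join()
    client.close()
    backend.close()


def serve(listener, backends):
    """Accept connections forever and farm them out in rotation."""
    i = 0
    while True:
        client, _ = listener.accept()
        addr = backends[i % len(backends)]
        i += 1
        threading.Thread(target=handle, args=(client, addr), daemon=True).start()
```

even this toy version has to answer the questions above: the listener
is the single host that requests come into, and a dead back-end simply
costs the client its connection.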

there's a special case of cpu-on-a-card -- SSL accelerators.  i've been
doing some research in this area, and i'm not really sure what they gain,
but i could be wrong.  it seems to me that moore's law generally applied
to commodity cpus makes multi-cpu boxes doing SSL in software a better
choice than dedicated hardware accelerators (e.g. a dual 600 MHz x86 box
costs less and outperforms many accelerators).  but i'm biased because i
can't stand dealing with closed vendor kernel drivers and libraries.

in our architecture at CP we do the SSL and layer-3 load balancing with
dedicated "proxy" boxes using code we wrote.  seems to work well enough.  
(we couldn't go with off-the-shelf layer-3 switches because we scale in
dimensions that other folks haven't considered yet.)

Dean

p.s. cp is hiring ;)


Re: 'Apache on {Beowolf,clusters}'

Posted by Bill Stoddard <st...@raleigh.ibm.com>.
> Rodent of Unusual Size wrote:
> >
> > I'm getting more and more requests about whether/how Apache
> > will run on clusters.  It's becoming a FAQ.  Does anyone have
> > any documentation on using Apache in such configurations, or
> > on performance?  Dean? :-)
>
> Why would anyone want to do this?  Why not put a ton of Apache boxes
> behind a LocalDirector or any other of the billions of load balancers?
>
I had the same question.  I don't know anything about Beowulf clustering,
but if you could establish a fast comm link (at bus speeds for example)
between the CPUs in the cluster, seems you could build some really scalable
web server farms (with appropriate modifications to Apache to exploit the
cluster).  Has anyone ever considered putting a CPU and some RAM on a PCI
card, installing Linux and Apache? Seems you could scale this quite nicely.
The PCI card Apache (with a kernel level cache?) could serve the static
content and off-load dynamic content to a back-end server. Or vice versa.
The combinations are limitless. I'm surprised no one has turned this into a
business :-)

Bill


Re: 'Apache on {Beowolf,clusters}'

Posted by "Kevin A. Burton" <bu...@relativity.yi.org>.
Rodent of Unusual Size wrote:
> 
> I'm getting more and more requests about whether/how Apache
> will run on clusters.  It's becoming a FAQ.  Does anyone have
> any documentation on using Apache in such configurations, or
> on performance?  Dean? :-)

Why would anyone want to do this?  Why not put a ton of Apache boxes
behind a LocalDirector or any other of the billions of load balancers?

The benefit of Beowulf is in the processing power.  It takes a huge
amount of configuration time, and to take advantage of Beowulf's power
you really have to write your application for it.

My suggestion would be to make a "virtual" cluster by just having a ton
of Apache boxes with the same content.  I don't see any reason why
Beowulf would give you an advantage.

-- 

Kevin A Burton
http://relativity.yi.org
Message to SUN:  "Open Source Java!"
"For evil to win is for good men to do nothing."

Re: 'Apache on {Beowolf,clusters}'

Posted by Sam Talebbeik <sa...@ix.netcom.com>.
Ken, this may not be exactly what you are looking for, but
I thought it would be useful information. There is a white paper
about using Linux clusters along with Apache on TurboLinux's
web page. The Cluster Server case study is the one that
mentions Apache. Here is the link:
http://www.turbolinux.com/product/whitepages.html

    Regards,
    Sam 

Rodent of Unusual Size wrote:
> 
> I'm getting more and more requests about whether/how Apache
> will run on clusters.  It's becoming a FAQ.  Does anyone have
> any documentation on using Apache in such configurations, or
> on performance?  Dean? :-)