Posted to dev@cocoon.apache.org by Pier Fumagalli <pi...@betaversion.org> on 2003/02/05 03:46:51 UTC

[TIPS] Basic configurations of Apache 2.0 for Cocoon 2

After a few discussions held face-to-face with some of you (Stefano on the
phone ranting about setting up Tomcat, Jeremy over lunch at my place a few
weeks ago, and several others), and a few odd questions popping up on the
list, I feel the need to tell you why my vision is so narrow when someone
touches the "Apache" argument.

As I have said several times, over the past 6 years I've learnt how to use
Apache (1.3 first, and 2.0 lately) to suit my needs, and I would never
envision an HTTP server running without it.

Given my "pragmatic" vision, it's hard to explain "why" I am so biased, and
probably the best way out is to share the few things I have learnt, which
make my everyday life as an administrator easy...

So, here are a few tips for those of you who wonder about my rants. (I should
really post these to the Wiki, but dammit, I don't know how to use it :-)


Why Apache as a front end?
--------------------------

Probably the first and most important question to answer is WHY it is so
important to have Apache HTTPd as a front-end for a website.

I believe that for anyone, there's nothing more annoying than hitting a web
page, waiting for a few seconds, and then seeing our favorite browser coming
up with "The connection was refused when attempting to contact
http://www.domain.tld/".

In my opinion (and my boss's) it is unacceptable to have "downtime" on a
website, and if that happens, whoever connects needs to know what's going
on, or, at least, we need to tell them something: "We are sorry, but
currently http://www.domain.tld/ is unavailable because of essential system
upgrades. We expect to resume all our services in less than 10 minutes.
Please, check back later" sounds so much better (maybe with our nice little
logo, and yada, yada, yada).

When I once asked Brian Behlendorf why Apache was doing some odd things in
its code, he responded "Call it defensive programming". This explains the
entire vision behind Apache: no matter what, Apache can _not_ "go down" and
stop responding to HTTP requests. Its design is centered around this idea,
so, in my opinion (and experience), it is the one option that allows us to
achieve our goal of "zero port 80 downtime".

On UNIX, Apache's design enforces a multi-process model: there is always a
minimal parent process holding port 80 (as safe and minimal as possible),
supervising a pool of child OS processes that do the actual work. Even in
the worst case scenario (a segmentation violation in the code that dumps an
entire child process), the parent survives, keeps the port bound and starts
a replacement child, so clients keep getting an answer.

A Java-based web server can not achieve this. Java is a single-process
environment and if something happens to it, it will just exit, unbinding
port 80 and leaving our clients with "connection refused".

There is another important issue, about security. Java does not support
switching user-ID after it has started, and under UNIX operating systems,
everyone knows that no one apart from "root" can bind to ports < 1024.

In our case this is a problem: I either decide to run my service as root
(and that is NOT a good idea), or I bind to some unprivileged port (usually
8080). But then complexity arises when forwarding requests for port 80
(our usual HTTP service) to a port above 1024 (8080). Whether you use
firewall packages or port remappers, any of those solutions involves some
degree of complexity.

Apache avoids all that. Being native, it can start as root, bind to ports
< 1024 and then handle requests as a non-privileged user, allowing us to run
our servlet container as a non-privileged user as well.
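
A minimal sketch of what that looks like in httpd.conf (the "www" account
name is only a placeholder, use whatever unprivileged user and group exist
on your system): Apache is started as root so the parent can bind to port
80, and the children that answer requests run under the unprivileged
account.

    # The parent process, started as root, binds the privileged port
    Listen 80

    # The child processes that answer requests run as this unprivileged
    # account ("www" is an assumed name, not a requirement)
    User  www
    Group www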

But those are not the only advantages. Apache helps us in many more ways,
and I hope at this point to be able to show you what and how...


What Apache? How Apache?
------------------------

A very personal choice is what version of Apache you want to run. In the
following examples I will assume you're going to use Apache 2.0, as it is
now _stable_ and performs much better than the "old" 1.3.

For several months now, most of the sites hosted by VNU (my employer) have
been running 2.0 (apart from our old legacy "rolaren" server), and in my
personal experience I have never had a single problem.

Apache 2.0, though, is somewhat more "difficult" to build and configure: the
most difficult choice is the selection of the MPM (Multi-Processing Module)
to use. Read the manual to choose what suits you best, but in my case the
"worker" MPM (multi-process, multi-threaded) is the one giving me the best
balance of performance and solidity.

The "www.apache.org" website, on the other hand, uses the "prefork" MPM
(multi-process, single threaded, exactly as Apache 1.3 did), but I feel that
under certain operating systems it is slightly slower than "worker". Your
choice.
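
As a rough idea of what choosing "worker" means in practice, the MPM is
then sized with a handful of directives like the ones below. The numbers
are only illustrative, of the kind found in a sample httpd.conf, and are
not tuning advice for any particular site.

    <IfModule worker.c>
        # Child processes started at boot
        StartServers          2
        # Upper limit on simultaneously served requests
        MaxClients          150
        # Idle threads kept around to absorb spikes
        MinSpareThreads      25
        MaxSpareThreads      75
        # Threads created by each child process
        ThreadsPerChild      25
        # 0 means a child is never recycled after a number of requests
        MaxRequestsPerChild   0
    </IfModule>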

As a reference, I configure Apache 2.0 in the following way:

./configure \
    --with-mpm=worker \
    --enable-modules=all \
    --enable-mods-shared=all \
    --enable-proxy \
    --enable-proxy-http \
    --disable-ipv6

Basically, I use the "worker" MPM, all modules are compiled as DSO modules
(dynamically loaded, so that I can disable the ones I don't use), including
the proxy/proxy-http modules, and I don't care about IPv6 support.
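
With everything built as DSO, enabling or disabling a module comes down to
a LoadModule line in httpd.conf. A small sketch, assuming the default
"modules/" directory under the server root; which modules you comment out
is entirely up to you (mod_userdir is just an example):

    # Needed for the proxy setup
    LoadModule proxy_module      modules/mod_proxy.so
    LoadModule proxy_http_module modules/mod_proxy_http.so

    # Needed for the mod_rewrite and Server-Side-Includes examples
    LoadModule rewrite_module    modules/mod_rewrite.so
    LoadModule include_module    modules/mod_include.so

    # Disabled: built, but not loaded
    #LoadModule userdir_module   modules/mod_userdir.so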


Connecting Cocoon
-----------------

Like Stefano, I had several headaches trying to connect Apache and [name
your Servlet container of choice]. Mod_JK (JK2) doesn't work for me.
Mod_webapp works for me, but just for me because I'm the author, and I was
sadly forced to abandon its development. The only solution I see (and the
one which works best for me currently) is mod_proxy.

Mod_proxy is a nice little module, especially in Apache 2.0 where its
caching part is completely decoupled into another module (mod_cache). It's
very small, lightweight, and does the job...

Plus, you have the advantage of being able to choose whatever servlet
container you want in the backend: Orion, WebSphere, Tomcat, Jetty, you
name it, it supports HTTP :-) (well, apart from ServletExec, but that's
another story, and if someone wants some hints, let me know).

Connecting Cocoon is _simple_: all you have to do is configure your servlet
container to run on a high port (8080 for example), make sure it runs as a
non-privileged user, make sure that it knows that it is a proxied HTTP
server (Cocoon, Jetty, Resin, Orion, ... They all have this concept, check
out the documentation), and configure Apache with these two lines:

    ProxyPass        / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/

The first one tells Apache that any request whatsoever (from / onwards) gets
"proxied" to localhost:8080, and the second one tells Apache to make sure
that any "Location" HTTP header coming back gets rewritten accordingly (just
in case your Servlet container doesn't let you set the "proxied"
configuration).

That's _IT_. It runs, and it runs smoothly.
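
A slightly more defensive variant of the same two lines might look like
this. It is only a sketch: ProxyRequests is already off by default, so
stating it just documents the intent, and ProxyPreserveHost is optional
and is not available in the earliest 2.0 releases, so check the mod_proxy
manual for your version.

    # We are a gateway (reverse proxy) only, never an open forward proxy
    ProxyRequests Off

    # Optional: hand the original "Host" header to the servlet container
    # instead of "localhost:8080"
    ProxyPreserveHost On

    ProxyPass        / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/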


Trivially serving static files
------------------------------

Now, Apache is _definitely_ faster than any Java based servlet container in
serving files straight to HTTP clients. This is largely because nowadays it
uses a kernel system call named "sendfile", which makes its performance far
greater than anything Java can do.

Using mod_proxy and the set of ProxyPass configuration directives doesn't
allow us to set a "pattern" to associate with resources to be served
straight off the filesystem; it only allows us to define exclusion lists
and processing lists.

In my example, then, I will rewrite my configuration to make Apache serve
everything beginning with "/static/" straight out of my web-application,
without even touching the servlet container:

    # Make sure that my document root points to the root of the web
    # application (where the WEB-INF is located, for instance).
    DocumentRoot /export/webapps/cocoon

    # We don't proxy any request beginning with the keyword "/static/".
    # So, for example, "/static/logo.gif" will be served directly by
    # Apache from the "/export/webapps/cocoon/static/logo.gif" file
    ProxyPass        /static/ !

    # Another one for "favicon.ico", so that explorer and mozilla are happy
    ProxyPass        /favicon.ico !
    
    # And now we send back to the servlet engine everything else that does
    # not begin with "/static/" or "/favicon.ico"
    ProxyPass        / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/

Simple, the "!" keyword in ProxyPass means "don't" :-)
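
The same trick extends to any number of prefixes. In the sketch below,
"/images/" and "/robots.txt" are made-up examples; the point is the order,
since mod_proxy checks the ProxyPass lines in the order they appear, and
the exclusions have to come before the catch-all.

    # Exclusions first: served by Apache from the document root
    ProxyPass        /static/     !
    ProxyPass        /images/     !
    ProxyPass        /robots.txt  !
    ProxyPass        /favicon.ico !

    # Catch-all last: everything else goes to the servlet container
    ProxyPass        / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/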


The holding page
----------------

If you used one of the configurations above, you'll see that if your servlet
container is not responding on port 8080 for any reason, you will get a nice
"Bad Gateway" error page (HTTP 502 Error).

As that page is quite ugly (I have to admit that the HTTPd freaks are not
good HTML artists), you might want to point your clients to a
better-designed page (or containing some lame excuse on why your servlet
container is down).

You can do that easily (again), by using the ErrorDocument directive. Note,
though, that the document it points to must be served by Apache itself (so
it needs to be non-proxied). Either you get down and nasty with your
mod_alias configuration, or you simply use the second configuration and
include the page in your webapp as a static file. Anyway, what you have to
specify in that case is simply:

    # If mod_proxy cannot connect to the servlet container, we want
    # to display a nice static page saying the reason
    ErrorDocument 502 /static/unavailable.html
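
If you would rather keep the holding page outside the web application,
the mod_alias route could look roughly like this. The "/errors/" URL and
the "/export/httpd/errors" directory are assumptions made up for the
example. The extra 503 line is there because, as far as I know, some
Apache versions answer "Service Unavailable" rather than "Bad Gateway"
when the backend cannot be reached; listing both does no harm.

    # Map /errors/ to a directory outside the web application
    Alias /errors/ /export/httpd/errors/
    <Directory "/export/httpd/errors">
        Order allow,deny
        Allow from all
    </Directory>

    # Never proxy the error pages themselves (this has to come before
    # the catch-all "ProxyPass /" line)
    ProxyPass /errors/ !

    ErrorDocument 502 /errors/unavailable.html
    ErrorDocument 503 /errors/unavailable.html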

If (for example) you wanted to use Server-Side-Includes to render your page
(it might be nice to display something like the host name, or the time when
the request was received), you can do so by using SHTML files. This is what
I use at home:

<html>
  <head>
    <title><!--#echo var="SERVER_NAME"-->: server off-line</title>
  </head>
  <body>
    <h3><!--#echo var="SERVER_NAME"-->: server off-line</h3>
    <p>
      We are sorry, but the server is temporarily unavailable due to
      maintenance. Our team is working to restore service as soon as
      possible.<br />
      In case of troubles, please feel free to contact our webmaster
      sending an email to
      <a href="mailto:<!--#echo var="SERVER_ADMIN"-->">
        &lt;<!--#echo var="SERVER_ADMIN"-->&gt;
      </a>.
    </p>
    <hr/>
    <p>
      <small>
        <!--#echo var="SERVER_SOFTWARE"--> running on
        <!--#echo var="SERVER_NAME"-->:<!--#echo var="SERVER_PORT"-->
        at <!--#echo var="DATE_LOCAL"-->.
      </small>
    </p>
  </body>
</html>

And to make it work properly, this is how your httpd.conf will have to
look:

    # Make sure that Server Side Includes are processed and sent
    # to the client with mime-type as text/html
    AddType text/html .shtml
    AddOutputFilter Includes .shtml

    # Make sure that our SHTMLs are processed in the static
    # directory
    <Directory "/export/webapps/cocoon">
        Options +IncludesNoExec
    </Directory>

    # If mod_proxy cannot connect to the servlet container, we want
    # to display a nice static page saying the reason. This is a
    # SHTML page (using the Server-Side-Includes filter)
    ErrorDocument 502 /static/unavailable.shtml
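
The SERVER_NAME and SERVER_ADMIN values echoed by that page normally come
from the ServerName and ServerAdmin directives, so it is worth setting
them explicitly. A sketch, with a made-up host name and address:

    # Shown in the holding page via <!--#echo var="SERVER_NAME"-->
    ServerName  www.domain.tld

    # Shown in the holding page via <!--#echo var="SERVER_ADMIN"-->
    ServerAdmin webmaster@domain.tld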


Putting it all together with mod_proxy
--------------------------------------

Ok, now that we have seen how each piece works, let's try to put them all
together, also adding that any request to "/WEB-INF/" should be forbidden
straight away (there's no point in proxying them when we know that the
servlet container will block them all):

    # Make sure that my document root points to the root of the web
    # application (where the WEB-INF is located, for instance).
    DocumentRoot /export/webapps/cocoon

    # Make sure that Server Side Includes are processed and sent
    # to the client with mime-type as text/html
    AddType text/html .shtml
    AddOutputFilter Includes .shtml

    # Make sure that our SHTMLs are processed in the static
    # directory
    <Directory "/export/webapps/cocoon">
        Options +IncludesNoExec
    </Directory>

    # Block the stupid "WEB-INF" pseudo-url (god I wish web-applications
    # were designed with some intelligence... Ok, my fault as well)
    <Location /WEB-INF>
        Order deny,allow
        Deny from all
    </Location>

    # If mod_proxy cannot connect to the servlet container, we want
    # to display a nice static page saying the reason. This is a
    # SHTML page (using the Server-Side-Includes filter)
    ErrorDocument 502 /static/unavailable.shtml

    # We don't proxy any request beginning with the keyword "/static/".
    # So, for example, "/static/logo.gif" will be served directly by
    # Apache from the "/export/webapps/cocoon/static/logo.gif" file
    ProxyPass        /static/ !

    # Another one for "favicon.ico", so that explorer and mozilla are happy
    ProxyPass        /favicon.ico !
    
    # And now we send back to the servlet engine everything else that does
    # not begin with "/static/" or "/favicon.ico"
    ProxyPass        / http://localhost:8080/
    ProxyPassReverse / http://localhost:8080/

Simple, easy, beautiful...
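
If there are other paths you want refused without bothering the servlet
container, a single LocationMatch block can cover several of them. This is
a sketch: "/META-INF" is an assumed extra path, not something the setup
above requires.

    # Refuse both pseudo-URLs with one regular expression
    <LocationMatch "^/(WEB-INF|META-INF)">
        Order deny,allow
        Deny from all
    </LocationMatch>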


A more complex example: mod_rewrite
-----------------------------------

This is all nice and clean, but if we want to be really nasty, and start
serving (for example) all our GIF and JPG files straight via Apache, we
would need to use mod_rewrite.

I know, mod_rewrite is ugly, it uses Perl regular expressions (so, well,
it's even slightly slower), but mod_proxy is way too crummy: it's either
"in" or "out", and it takes over the whole world (you can't really do much
else after you said you're going to forward a URL).

So, mod_rewrite, even if it's ugly, even if it's slower, _is_ our solution.
With a couple of rules, we can take the configuration written above to the
extreme, and basically do WHATEVER we want with a URL _before_ it even knows
about a possible servlet container in the backend.

I suggest you read the mod_rewrite documentation _carefully_, but, as a
start, I'm going to rewrite what's written above using mod_rewrite and its
flags. From here on, you're on your own :-) :-)

    # Make sure that my document root points to the root of the web
    # application (where the WEB-INF is located, for instance).
    DocumentRoot /export/webapps/cocoon

    # Make sure that Server Side Includes are processed and sent
    # to the client with mime-type as text/html
    AddType text/html .shtml
    AddOutputFilter Includes .shtml

    # Make sure that our SHTMLs are processed in the static
    # directory
    <Directory "/export/webapps/cocoon">
        Options +IncludesNoExec
    </Directory>

    # If mod_proxy cannot connect to the servlet container, we want
    # to display a nice static page saying the reason. This is a
    # SHTML page (using the Server-Side-Includes filter)
    ErrorDocument 502 /static/unavailable.shtml

    # The nastiness begins, let's fire up the "rewrite engine"
    RewriteEngine On

    # Everything that starts with "/static" or "/static/" is served straight
    # through: no redirection, no proxying, no nothing, and the [L] flag
    # implies that if this rule is matched, no other matching must be
    # performed
    RewriteRule "^/static/?(.*)" "$0" [L]

    # Everything that starts with a NON-CASE-SENSITIVE match (the NC flag)
    # of "/WEB-INF" or "/WEB-INF/" is forbidden (the F flag). And again,
    # this is the last rule (the L flag), nothing will be processed by the
    # rewrite engine if this rule is matched
    RewriteRule "^/WEB-INF/?(.*)" "$0" [L,F,NC]

    # Everything ending in ".gif", ".jpg" or ".jpeg" will be served again
    # directly by Apache, no need to bother the servlet container. As above
    # this is the last rule as specified by the [L] flag at the end
    RewriteRule "^/(.*)\.gif$" "$0" [L]
    RewriteRule "^/(.*)\.(jpg|jpeg)$" "$0" [L]

    # Everything else not matched above needs to go to the servlet container
    # via HTTP listening on port 8080. The [P] flag (which is required)
    # implies that our requests will be handled by mod_proxy.
    RewriteRule "^/(.*)" "http://localhost:8080/$1" [P]

    # Make sure that if the servlet container specifies a "Location" HTTP
    # header during redirection starting with "http://localhost:8080/", we
    # can handle it and return to our client the effective (not real)
    # location we want to redirect them to. This is _essential_.
    ProxyPassReverse / http://localhost:8080/

As I mentioned before, ugly, but _really_ effective. In a few lines we
connect the HTTP-based servlet container running Cocoon to Apache, we make
sure that if the servlet container falls over, we direct people to an
appropriate holding page, we serve all that is under /static, all GIF and
all JPEG files straight off without touching Cocoon and all the rest through
our sitemap, and as a free bonus, everything that ends in ".shtml" (from
disk or from the sitemap) will be passed through the Apache
"Server-Side-Includes" filter (mod_include, which is ugly, but sometimes
_really_ effective)...
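
The two image rules can also be folded into one pattern covering more
extensions. In this sketch the extra extensions (png, css, js) are
assumptions, and "-" as the substitution means "leave the URL untouched".
Only do this for files that really are on disk: anything Cocoon generates
on the fly under one of these extensions would never reach the sitemap.

    # Must come before the catch-all [P] rule. The NC flag makes the
    # match case-insensitive, so ".GIF" is covered as well.
    RewriteRule "\.(gif|jpe?g|png|css|js)$" "-" [L,NC]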


Conclusions
-----------

I hope to have cleared up some of the doubts about Apache, and why I love it
so much... It is a hub, a hub embracing your website and making it work
better, faster, more reliably and fine-tuned precisely as you (or your boss)
like it.

And you can trust Apache. I believe that our spirit, the spirit of the
entire Cocoon community, is built on top of the original HTTPd vision:
let's make things work so nicely that the world won't have to look for
another solution...

HTTPd does it in its little piece of being an HTTP hub, Jetty does it in its
little piece of being a servlet container, Cocoon does it in its little
piece of being the best "web-application" framework available on the planet
right now. Together, those three little pieces _will_ conquer the world.

Have fun...

    Pier

(BTW, where the hell is Tomcat in this picture? :-)


---------------------------------------------------------------------
To unsubscribe, e-mail: cocoon-dev-unsubscribe@xml.apache.org
For additional commands, email: cocoon-dev-help@xml.apache.org


Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Steven Noels <st...@outerthought.org>.
Pier Fumagalli wrote:

> After few discussions had face-to-face with some of you (Stefano on the
> phone ranting about setting up Tomcat, Jeremy over lunch at my place few
> weeks ago, and several others), and few odd questions popping out on the
> list, I feel the need to tell you why my vision is so narrow when someone
> touches the "Apache" argument.

I added a link from http://wiki.cocoondev.org/Wiki.jsp?page=HowTos to 
your very nice howto. Thanks!

</Steven>
-- 
Steven Noels                            http://outerthought.org/
Outerthought - Open Source, Java & XML Competence Support Center
Read my weblog at            http://blogs.cocoondev.org/stevenn/
stevenn at outerthought.org                stevenn at apache.org




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Ivan Mikushin <iv...@openmechanics.net>.
Hi!

It's like a light at the end of a _dark_ tunnel! You saved people _tons_
of time. Great thanks!!!

Cheers,

Ivan.




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Andrew Savory <an...@luminas.co.uk>.
On Wed, 5 Feb 2003, Pier Fumagalli wrote:

> So, those are few tips for those of you who wonder about my rants.

Pier, that's fantastic!

The world will echo to the sound of thousands of httpd.conf files being
edited this morning, I'm sure.

> HTTPd does it in its little piece of being an HTTP hub, Jetty does it in its
> little piece of being a servlet container

Looking forward to your hints and tips on using Jetty ;-)


Andrew.

-- 
Andrew Savory                                Email: andrew@luminas.co.uk
Managing Director                              Tel:  +44 (0)870 741 6658
Luminas Internet Applications                  Fax:  +44 (0)700 598 1135
This is not an official statement or order.    Web:    www.luminas.co.uk



Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Jeremy Quinn <je...@media.demon.co.uk>.
On Wednesday, Feb 5, 2003, at 02:46 Europe/London, Pier Fumagalli wrote:

<big-snip/>

> HTTPd does it in its little piece of being an HTTP hub, Jetty does it 
> in its
> little piece of being a servlet container, Cocoon does it in its little
> piece of being the best "web-application" framework available on the 
> planet
> right now. Together, those three little pieces _will_ conquer the 
> world.
>
> Have fun...
>

you are my hero!!!

Many thanks for this Pier.


regards Jeremy




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Stefano Mazzocchi <st...@apache.org>.
Miles Elam wrote:
> Stefano Mazzocchi wrote:
> 
>> So, the URI space should be designed with *NO* technological 
>> constraints imposed by the architecture.
>>
>> Just stating this loud and clear. 
> 
> 
> 
> Heh heh...nice.  Made my last post a moot point.  I'm seriously getting 
> tired of constantly being a few steps behind you at every item.  ;-)
> 
> Someone remind me to read *all* of the emails before responding to the 
> first email next time eh?

LOL :)

No worries. Resonation is good karma.

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------





Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Miles Elam <mi...@pcextremist.com>.
Stefano Mazzocchi wrote:

> So, the URI space should be designed with *NO* technological 
> constraints imposed by the architecture.
>
> Just stating this loud and clear. 


Heh heh...nice.  Made my last post a moot point.  I'm seriously getting 
tired of constantly being a few steps behind you at every item.  ;-)

Someone remind me to read *all* of the emails before responding to the 
first email next time eh?

- Miles





Re: [OT] Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 6/2/03 12:15 pm, "Pier Fumagalli" <pi...@betaversion.org> wrote:

> On 6/2/03 1:39 am, "Miles Elam" <mi...@pcextremist.com> wrote:
> 
>> Here's a question for all you HTTPd heads out there.  ;-)
>> 
>> Is it possible now or reasonably straightforward to have HTTPd look for
>> a static file and, upon failure, look up a fallback resource?  For
>> example, if a user requests "/images/foo.png", HTTPd would look up the
>> file on the filesystem.  If the file wasn't there, it would pass it to
>> the servlet engine (or whatever dynamic process available).  Hell, even
>> better: "/images/foo" that invokes HTTPd's content negotiation and then
>> checks the dynamic pool(s).
>> 
>> It's got filters now to pass the output of one module to the input of
>> another, but what about a process similar to nested try/catch blocks?
>> The first, most efficient method fails, fall back to the next, and the
>> next...
> 
> Nope... (It would be just plain wrong IMO).

I stand corrected by Stefan... You _never_ stop learning.
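
For the record, one common way to get this kind of fallback with
mod_rewrite is sketched below. The correction referred to above is not
part of this thread, so this is an assumption about the approach and not
necessarily what was proposed. It is limited to the "/images/" prefix from
the question, so that other files lying around in the document root are
not exposed by accident.

    RewriteEngine On

    # If the request is under /images/ and maps to an existing regular
    # file below the document root, serve it straight from disk
    RewriteCond "%{DOCUMENT_ROOT}%{REQUEST_URI}" -f
    RewriteRule "^/images/" "-" [L]

    # Anything else, including images that are not on disk, falls back
    # to the servlet container
    RewriteRule "^/(.*)$" "http://localhost:8080/$1" [P]
    ProxyPassReverse / http://localhost:8080/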




Re: [OT] Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 6/2/03 1:39 am, "Miles Elam" <mi...@pcextremist.com> wrote:

> Here's a question for all you HTTPd heads out there.  ;-)
> 
> Is it possible now or reasonably straightforward to have HTTPd look for
> a static file and, upon failure, look up a fallback resource?  For
> example, if a user requests "/images/foo.png", HTTPd would look up the
> file on the filesystem.  If the file wasn't there, it would pass it to
> the servlet engine (or whatever dynamic process available).  Hell, even
> better: "/images/foo" that invokes HTTPd's content negotiation and then
> checks the dynamic pool(s).
> 
> It's got filters now to pass the output of one module to the input of
> another, but what about a process similar to nested try/catch blocks?
> The first, most efficient method fails, fall back to the next, and the
> next...

Nope... (It would be just plain wrong IMO).

    Pier




[OT] Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Miles Elam <mi...@pcextremist.com>.
Here's a question for all you HTTPd heads out there.  ;-)

Is it possible now or reasonably straightforward to have HTTPd look for 
a static file and, upon failure, look up a fallback resource?  For 
example, if a user requests "/images/foo.png", HTTPd would look up the 
file on the filesystem.  If the file wasn't there, it would pass it to 
the servlet engine (or whatever dynamic process available).  Hell, even 
better: "/images/foo" that invokes HTTPd's content negotiation and then 
checks the dynamic pool(s).

It's got filters now to pass the output of one module to the input of 
another, but what about a process similar to nested try/catch blocks?  
The first, most efficient method fails, fall back to the next, and the 
next...

- Miles





[OT] Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 5/2/03 21:34, "Stefano Mazzocchi" <st...@apache.org> wrote:

>> (Note to self: Even remotely, Stefano is hinting my subconscious mind to get
>> down to my knees and write C code, remember to flame him for that one day)
> 
> ihihihih :)

Junk-head! :-)

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Stefano Mazzocchi <st...@apache.org>.
Pier Fumagalli wrote:

>>Please, don't get me wrong: Pier suggestions are great and I'm so glad I
>>convinced him to come back to work here on cocoonland, but we must not
>>sit on those sysadmin tricks (even if clever and powerful) we must
>>design a system that stands the pressure both from a technological *and*
>>a human point of view.
>>
>>So, the URI space should be designed with *NO* technological constraints
>>imposed by the architecture.
>>
>>Just stating this loud and clear.
> 
> 
> Absolutely... I know that the solution is far from perfect, but if your
> website works today, Stefano, it's because of the hack! :-)

I know. I was just making sure that others knew what we already 
discussed privately.

> That doesn't imply that we should sit on our asses and use that forever...
> It works now because there's nothing better at the moment, nothing allowing
> us to do things in a less hacky way, or faster...
> 
> Note the words I used describing rewrite:
> 
>     [...] if we want to be really nasty, and starting to serve (for example)
>     all our GIF and JPG files straight via Apache, we would need to use
>     mod_rewrite. I know, mod_rewrite is ugly [...]
> 
> There will be a friggin' working solution one day, it's just not there
> yet...

Exactly. So, people, the moral of the story is: HTTPd can give you a lot 
of power and you can *abuse it* a lot. But we should not rest on those 
hacks but we should look for architectural solutions that last.

>     Pier
> 
> (Note to self: Even remotely, Stefano is hinting my subconscious mind to get
> down to my knees and write C code, remember to flame him for that one day)

ihihihih :)

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------





Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Stefano Mazzocchi <st...@apache.org>.
Niclas Hedhman wrote:
> On Thursday 06 February 2003 02:26, Pier Fumagalli wrote:
> 
>>Oh, absolutely, I don't like that either... But in my specific case (VNU,
>>with somewhat 8000/9000 hits per minute (we peak up sometimes at 400 hits
>>per second), we need to use tricks (yes, tricks/hacks) to make that happen,
>>because passing through (even on localhost) binary data at that rate is
>>sometimes overkilling...
> 
> 
> Shouldn't you in this case have front-end load balancing routers (various 
> topologies available) to lower the load on each server?
> 
> Sounds like you are spending thousands of engineering dollars in optimization 
> solutions, when the same dollars can buy you a quick and reliable solution 
> off-the-shelf.... And you can spend that engineering time on something more 
> useful.

ahahahahah

tell you what: until now, Pier thought those tricks were *FUN*.

But now that he rediscovered that there are smarter (and more 
long-lasting and more emotionally rewarding etc etc) ways of having fun 
(working with flow, for example), I'm sure he'll start spending his time
differently :)

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------





Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 7/2/03 15:21, "Gianugo Rabellino" <gi...@apache.org> wrote:

> Stefano Mazzocchi wrote:
> 
>>>> Are we looking into a mod_cocoon somewhere down the line?? (That
>>>> would be fun)
>>> 
>>> 
>>> I don't believe that this community wants to see C code in the CVS
>>> repository :-) :-)
>> 
>> 
>> I wouldn't mind at all. Somebody else would?
>> 
> 
> Definitely not, I'd be even glad to help, even though my C skills should
> be rusty by now to say the least... time to reread K&R? ;-)

We'll see then... :-)

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Gianugo Rabellino <gi...@apache.org>.
Stefano Mazzocchi wrote:

>>> Are we looking into a mod_cocoon somewhere down the line?? (That 
>>> would be fun)
>>
>>
>> I don't believe that this community wants to see C code in the CVS
>> repository :-) :-)
> 
> 
> I wouldn't mind at all. Somebody else would?
> 

Definitely not, I'd be even glad to help, even though my C skills should 
be rusty by now to say the least... time to reread K&R? ;-)

Ciao,

-- 
Gianugo Rabellino
CTO
Pro-netics s.r.l.
http://www.pro-netics.com




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Stefano Mazzocchi <st...@apache.org>.
Pier Fumagalli wrote:

>>Are we looking into a mod_cocoon somewhere down the line?? (That would be fun)
> 
> I don't believe that this community wants to see C code in the CVS
> repository :-) :-)

I wouldn't mind at all. Would anybody else?

> The ideas are there, and a partial implementation as well... Right now it's
> just too easy to use mod_proxy and let the baby live on its own...

Agreed.

> One thing IMVHO we should focus on beforehands would be to have a "clean"
> (minimal) Cocoon distribution... Removing the hard-coded "WEB-INF" paths and
> write some more doccos...

Definitely. Those speed optimizations on the native side have a very low 
priority compared to the rest of the Cocoon work left to do for 2.1.

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------





Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 7/2/03 2:01, "Niclas Hedhman" <ni...@hedhman.org> wrote:
> On Thursday 06 February 2003 20:27, Pier Fumagalli wrote:
>> On 6/2/03 3:47 am, "Niclas Hedhman" <ni...@hedhman.org> wrote:
>>
>>> Shouldn't you in this case have front-end load balancing routers (various
>>> topologies available) to lower the load on each server?
>> 
>> Wish I could... I can't have (because of a fucked up design and
>> implementation before I came on board) 2 instances running (they clash on
>> the database side... Who is the idiot storing the status of a cache in the
>> servlet container in the database where the data to be cached is? Jules!)
> 
> Hmmm. Maybe we should re-introduce the death-penalty for programmers....

Well, one way or another the guy is no longer with us, soooo...

>> It's not thousands of dollars, boy I'm _not_ that expensive... :-)
> 
> No?? $500 a day?

Gosh! I _wish_ :-) That would put me down around the 80k pounds/year...
Not even close! :-)

<note>
  The first one who offers me a job thinking that I'm interested in money
  will be personally flamed by the undersigned author of this email...
</note>

>> Anyhow, the "who cares? Go and buy something off-the-shelf" attitude is
>> _so_ wrong... Ok, I'll go shopping in Tottenham Ct. Road and forget about
>> thinking that there might be a better and more intelligent way out???
>> 
>> Gee, with that attitude noone would have even thought about writing Apache
>> 2, you have such a perfect solution going and buying M$IIS off the shelf
>> :-)
> 
> Well, the idea of buying a solution to "load balancing" is to free up time to
> invent in areas where no solution at all exists.
> No matter how much you optimize a piece of software, you will always have a
> peak limit. Question would then be, how much time to spend for how much
> improvement, and until we reach which limit?

Well, the limit is "cleanliness"... Once a piece of software is "clean",
"linear", it is, at the same time, quite well optimized...

Probably my rants are not about absolute performance (which I can get out of
mere mod_rewrite hacks!) but about how "clean" and "linear" a thing is... I
have this feeling that the easier it is, the faster it is...

> Let me also say, that there is an enormous difference between I spending 2
> weeks to save $3000 of purchases (which I probably won't do, "why bother"),
> compare to me spending 2 weeks to save a whole community $100 each in
> purchases (which is a lot easier to do, "grateful people").

And if you spend 4 weekends (they come for free) to save a whole community
$3000 of purchases? :-)

> Anyway Pier, I am happy that you (and Stefano even more so) is expressing
> "concerns" (mildly) about Tomcat and its direction. I never really liked it
> (mostly stemmed from its early problems with configurations) and stuck with
> Apache+JServ (mostly out of laziness "It ain't broke").

That's why I resorted to Jetty, at the end... It's clean enough, it does its
dirty job of serving friggin' pages, and it's 1/10th of the size of Tomcat.

I also like ServletExec (same approach), but too bad that it's not open, and
sometimes I spend some time reverse-engineering it to fix some bugs! :-)

> Are we looking into a mod_cocoon somewhere down the line?? (That would be fun)

I don't believe that this community wants to see C code in the CVS
repository :-) :-)

The ideas are there, and a partial implementation as well... Right now it's
just too easy to use mod_proxy and let the baby live on its own...

One thing IMVHO we should focus on beforehand would be to have a "clean"
(minimal) Cocoon distribution... Removing the hard-coded "WEB-INF" paths and
writing some more docs...

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Thursday 06 February 2003 20:27, Pier Fumagalli wrote:
> On 6/2/03 3:47 am, "Niclas Hedhman" <ni...@hedhman.org> wrote:
> > Shouldn't you in this case have front-end load balancing routers (various
> > topologies available) to lower the load on each server?
>
> Wish I could... I can't have (because of a fucked up design and
> implementation before I came on board) 2 instances running (they clash on
> the database side... Who is the idiot storing the status of a cache in the
> servlet container in the database where the data to be cached is? Jules!)

Hmmm. Maybe we should re-introduce the death-penalty for programmers....

> It's not thousands of dollars, boy I'm _not_ that expensive... :-)

No?? $500 a day?

> Anyhow, the "who cares? Go and buy something off-the-shelf" attitude is
> _so_ wrong... Ok, I'll go shopping in Tottenham Ct. Road and forget about
> thinking that there might be a better and more intelligent way out???
>
> Gee, with that attitude noone would have even thought about writing Apache
> 2, you have such a perfect solution going and buying M$IIS off the shelf
> :-)

Well, the idea of buying a solution to "load balancing" is to free up time to 
invent in areas where no solution at all exists.
No matter how much you optimize a piece of software, you will always have a 
peak limit. Question would then be, how much time to spend for how much 
improvement, and until we reach which limit?

Let me also say that there is an enormous difference between me spending 2 
weeks to save $3000 of purchases (which I probably won't do, "why bother") 
and me spending 2 weeks to save a whole community $100 each in 
purchases (which is a lot easier to do, "grateful people").

Anyway Pier, I am happy that you (and Stefano even more so) are expressing 
"concerns" (mildly) about Tomcat and its direction. I never really liked it 
(mostly because of its early problems with configuration) and stuck with 
Apache+JServ (mostly out of laziness, "It ain't broke").

Are we looking into a mod_cocoon somewhere down the line?? (That would be fun)


Niclas



Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
On 6/2/03 3:47 am, "Niclas Hedhman" <ni...@hedhman.org> wrote:

> On Thursday 06 February 2003 02:26, Pier Fumagalli wrote:
>> Oh, absolutely, I don't like that either... But in my specific case (VNU,
>> with somewhat 8000/9000 hits per minute (we peak up sometimes at 400 hits
>> per second), we need to use tricks (yes, tricks/hacks) to make that happen,
>> because passing through (even on localhost) binary data at that rate is
>> sometimes overkilling...
> 
> Shouldn't you in this case have front-end load balancing routers (various
> topologies available) to lower the load on each server?

Wish I could... I can't have (because of a fucked up design and
implementation before I came on board) 2 instances running (they clash on
the database side... Who is the idiot storing the status of a cache in the
servlet container in the database where the data to be cached is? Jules!)

Anyhow. Yes, when I rewrite the entire site to run with Cocoon, or with
anything else that allows me to have several instances running at the same
time, well, maybe I will...

> Sounds like you are spending thousands of engineering dollars in optimization
> solutions, when the same dollars can buy you a quick and reliable solution
> off-the-shelf.... And you can spend that engineering time on something more
> useful.

It's not thousands of dollars, boy I'm _not_ that expensive... :-)

Anyhow, the "who cares? Go and buy something off-the-shelf" attitude is _so_
wrong... Ok, I'll go shopping in Tottenham Ct. Road and forget about
thinking that there might be a better and more intelligent way out???

Gee, with that attitude no one would have even thought about writing Apache
2, you have such a perfect solution going and buying M$IIS off the shelf :-)
:-) :-)

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Niclas Hedhman <ni...@hedhman.org>.
On Thursday 06 February 2003 02:26, Pier Fumagalli wrote:
> Oh, absolutely, I don't like that either... But in my specific case (VNU,
> with somewhat 8000/9000 hits per minute (we peak up sometimes at 400 hits
> per second), we need to use tricks (yes, tricks/hacks) to make that happen,
> because passing through (even on localhost) binary data at that rate is
> sometimes overkilling...

Shouldn't you in this case have front-end load balancing routers (various 
topologies available) to lower the load on each server?

Sounds like you are spending thousands of engineering dollars in optimization 
solutions, when the same dollars can buy you a quick and reliable solution 
off-the-shelf.... And you can spend that engineering time on something more 
useful.

Niclas



Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
"Stefano Mazzocchi" <st...@apache.org> wrote:

> Don't want to rain on the party here, but I don't like the /static and
> mod_rewrite approach at all (sorry Pier),

Oh, absolutely, I don't like that either... But in my specific case (VNU,
with some 8000-9000 hits per minute, and peaks of up to 400 hits per
second), we need to use tricks (yes, tricks/hacks) to make that happen,
because passing binary data through (even on localhost) at that rate is
sometimes overkill...

It's a hack, but when you deal with _big_ numbers, oh boy, hacks can save
you so many sleepless nights... :-)

> [...]
> The best solution should be a proxy-ing rock-solid HTTP stack up front
> with a light-speed binary-oriented cache and cocoon doing *all* the
> resource generation (and avoiding to cache binary results itself).

Some sort of mod_cocoon_cache... And tell you what? Apache 2.0 was born
with THAT in mind... Look here:

<http://httpd.apache.org/docs-2.0/mod/mod_cache.html>

As you can see mod_cache already relies on a pluggable storage manager. Just
need to write one tied up with the cocoon cache and TADAAAA! it all works
nicely!
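
For what it's worth, a minimal sketch of putting the stock mod_cache in front
of the proxied backend might look like the lines below. It assumes mod_cache
and mod_disk_cache are available as loadable modules (both are marked
experimental in 2.0); the cache directory and the expiry values are made-up
placeholders, not taken from any running setup. It uses the ordinary disk
storage manager, NOT a Cocoon-aware one (that one is still to be written):

    # Assumption: modules live in the default "modules" directory.
    LoadModule cache_module modules/mod_cache.so
    LoadModule disk_cache_module modules/mod_disk_cache.so

    # Cache everything below "/" using the disk storage manager.
    CacheEnable disk /

    # Hypothetical location for the cached files; it must be writable
    # by the user Apache runs as.
    CacheRoot /var/cache/apache2/proxy

    # Fallback lifetime (in seconds) for responses carrying no expiry
    # information, and an upper bound for those that do.
    CacheDefaultExpire 3600
    CacheMaxExpire 86400

How long an entry really stays fresh should still depend mostly on the
Expires and Last-Modified headers Cocoon sends back, so the tuning on the
Cocoon side matters just as much.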

> *THIS* is the nirvana of the web serving architectures, because it
> follows the REST architectural practices, it's friendly with unlimited
> proxy distribution (think of backbone proxying in between your request
> and the hosting server)

I believe that we talked about it extensively enough to share the same
vision, right? :-) :-)

> designing the URI space so that it routes around technological problems
> is a potentially *very* dangerous approach and I don't like it.

I know, but no one has written that piece of code (yet)... _RIGHT_NOW_ the only
way allowing you to do stuff like that is... well... the hack! :-)

> the transparent proxy/cache solution *is* the key and reduces
> administration overheads (and worldwide load distribution problems) by
> orders of magnitude.

Oh, did you notice that the mod_proxy approach is also pluggable in Apache
2.0? You don't really need to write a new mod_jk to make that work; all one
needs to do is write a new backend system if we want to get more
performance out of the baby...

> Please, don't get me wrong: Pier suggestions are great and I'm so glad I
> convinced him to come back to work here on cocoonland, but we must not
> sit on those sysadmin tricks (even if clever and powerful) we must
> design a system that stands the pressure both from a technological *and*
> a human point of view.
> 
> So, the URI space should be designed with *NO* technological constraints
> imposed by the architecture.
> 
> Just stating this loud and clear.

Absolutely... I know that the solution is far from perfect, but if your
website works today, Stefano, it's because of the hack! :-)

That doesn't imply that we should sit on our asses and use that forever...
It works now because there's nothing better at the moment, nothing allowing
us to do things in a less hacky way, or faster...

Note the words I used describing rewrite:

    [...] if we want to be really nasty, and starting to serve (for example)
    all our GIF and JPG files straight via Apache, we would need to use
    mod_rewrite. I know, mod_rewrite is ugly [...]

There will be a friggin' working solution one day, it's just not there
yet...

    Pier

(Note to self: even remotely, Stefano is nudging my subconscious mind to get
down on my knees and write C code; remember to flame him for that one day)




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Stefano Mazzocchi <st...@apache.org>.
Gianugo Rabellino wrote:
> Pier Fumagalli wrote:
> 
>> "Pier Fumagalli" <pi...@betaversion.org> wrote:
>>
>>> Ouch... I _knew_ I forgot something yesterday night, well, time to do a
>>> small addition (craps, I have to _really_ learn how to use the Wiki 
>>> now):
>>
>>
>>
>> I did: <http://wiki.cocoondev.org/Wiki.jsp?page=ApacheModProxy>
>>
> 
> Cool!
> 
> How about adding a few lines about enabling caching on the proxy 
> (mod_cache) and coupling it with the explanation of the cache tuning 
> (expires headers) on the cocoon side? I can volunteer for the Cocoon 
> part, but I've never used the new mod_cache (I'm still in 1.3 land).

Don't want to rain on the party here, but I don't like the /static and 
mod_rewrite approach at all (sorry Pier); I'd much rather see a 
transparent proxy setup like the one that Gianugo hints at above.

Why? Simple: if you use mod_rewrite up front to have all *.jpg served 
from the local disk you have several problems:

1) dynamic image generation (think of image gallery thumbnails) doesn't 
work.

2) the URI space of the static resources must be mapped *directly* into 
the disk.

3) it's *not* future compatible (think of /images/logo being rendered 
differently depending on the user agent)

4) it forces people to work on two different environments, one for 
static files, one for dynamic ones, making it more difficult to migrate 
one into another.

The best solution would be a rock-solid proxying HTTP stack up front 
with a light-speed binary-oriented cache, and Cocoon doing *all* the 
resource generation (without caching binary results itself).
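
In Apache 2.0 terms, the proxy half of that could be as small as the sketch
below. It assumes mod_proxy and mod_proxy_http are loaded and that Cocoon
listens on localhost:8080 as in Pier's examples; the caching layer is left
out on purpose:

    # Hand every request to the servlet container, whatever the URI
    # looks like: no /static, no extension matching.
    ProxyPass        / http://localhost:8080/

    # Fix up "Location" headers in redirects coming back from the backend.
    ProxyPassReverse / http://localhost:8080/

    # Keep the Host header the client originally asked for.
    ProxyPreserveHost On

Nothing in there knows about file extensions or directories, which is the
whole point: the URI space stays entirely in Cocoon's hands.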

*THIS* is the nirvana of the web serving architectures, because it 
follows the REST architectural practices, it's friendly with unlimited 
proxy distribution (think of backbone proxying in between your request 
and the hosting server)

Designing the URI space so that it routes around technological problems 
is a potentially *very* dangerous approach and I don't like it.

The transparent proxy/cache solution *is* the key and reduces 
administration overheads (and worldwide load distribution problems) by 
orders of magnitude.

Please, don't get me wrong: Pier's suggestions are great and I'm so glad I 
convinced him to come back to work here in cocoonland, but we must not 
sit on those sysadmin tricks (even if clever and powerful); we must 
design a system that stands the pressure both from a technological *and* 
a human point of view.

So, the URI space should be designed with *NO* technological constraints 
imposed by the architecture.

Just stating this loud and clear.

-- 
Stefano Mazzocchi                               <st...@apache.org>
    Pluralitas non est ponenda sine necessitate [William of Ockham]
--------------------------------------------------------------------





Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
"Gianugo Rabellino" <gi...@apache.org> wrote:

> Pier Fumagalli wrote:
>> "Pier Fumagalli" <pi...@betaversion.org> wrote:
>> 
>>> Ouch... I _knew_ I forgot something yesterday night, well, time to do a
>>> small addition (craps, I have to _really_ learn how to use the Wiki now):
>> 
>> 
>> I did: <http://wiki.cocoondev.org/Wiki.jsp?page=ApacheModProxy>
>> 
> 
> Cool!
> 
> How about adding a few lines about enabling caching on the proxy
> (mod_cache) and coupling it with the explanation of the cache tuning
> (expires headers) on the cocoon side? I can volunteer for the Cocoon
> part, but I've never used the new mod_cache (I'm still in 1.3 land).

You start, I'm swamped ATM! :-)

Anyhow, just as an FYI, I'm going to talk about this whole stuff at NordU
(Usenix) next week in Sweden...

<http://www.nordu.org/NordU2003/technical-session-1.1625.html>

So, yeah, I'll have to throw in mod_cache as well somehow... :-)

    Pier

BTW, if any of you freaks are in Sweden next weekend (Vasteras/Stockholm),
I'll be around Thurs and Fri before getting back to London on Sat... And I
have no clue about what's going on over there... :-)




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Gianugo Rabellino <gi...@apache.org>.
Pier Fumagalli wrote:
> "Pier Fumagalli" <pi...@betaversion.org> wrote:
> 
>>Ouch... I _knew_ I forgot something yesterday night, well, time to do a
>>small addition (craps, I have to _really_ learn how to use the Wiki now):
> 
> 
> I did: <http://wiki.cocoondev.org/Wiki.jsp?page=ApacheModProxy>
> 

Cool!

How about adding a few lines about enabling caching on the proxy 
(mod_cache) and coupling it with the explanation of the cache tuning 
(expires headers) on the cocoon side? I can volunteer for the Cocoon 
part, but I've never used the new mod_cache (I'm still in 1.3 land).

Ciao,

-- 
Gianugo Rabellino
CTO
Pro-netics s.r.l.
http://www.pro-netics.com




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
"Pier Fumagalli" <pi...@betaversion.org> wrote:
> 
> Ouch... I _knew_ I forgot something yesterday night, well, time to do a
> small addition (craps, I have to _really_ learn how to use the Wiki now):

I did: <http://wiki.cocoondev.org/Wiki.jsp?page=ApacheModProxy>

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Jeremy Quinn <je...@media.demon.co.uk>.
On Wednesday, Feb 5, 2003, at 15:13 Europe/London, Pier Fumagalli wrote:

>> Brilliant!!!! It is working, hurrah!
>
> I knew you would have liked it! ;-) (Re: per our discussion over 
> salads!)

you DID finish that salad right ? ;)
(Stefano cooks great pasta!!)

>
> And that's all... You can roughly copy and paste this example in a
> <VirtualHost> section of your httpd.conf (obviously after having 
> applied the
> appropriate modification), and go...
>


You are making it look too easy!!!

Many thanks, I'll try this out.

In my setup, I added :

	Include /Library/Apache2/conf/extra

to the end of my httpd.conf, so I can keep my Cocoon config in a 
separate file in the 'extra' folder and don't need to touch the 
original.

The only change I had to make to the original httpd.conf was to change 
the default character set from ISO-8859-1 to UTF-8 as this is what my 
Cocoon outputs and Apache was re-encoding.

so

	AddDefaultCharset ISO-8859-1

becomes

	AddDefaultCharset UTF-8


Bingo!!!


Many thanks

regards Jeremy




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Pier Fumagalli <pi...@betaversion.org>.
"Jeremy Quinn" <je...@media.demon.co.uk> wrote:
> On Wednesday, Feb 5, 2003, at 02:46 Europe/London, Pier Fumagalli wrote:
> 
>> After few discussions had face-to-face with some of you (Stefano on the
>> phone ranting about setting up Tomcat, Jeremy over lunch at my place
>> few
>> weeks ago, and several others), and few odd questions popping out on
>> the
>> list, I feel the need to tell you why my vision is so narrow when
>> someone
>> touches the "Apache" argument.
> 
> Brilliant!!!! It is working, hurrah!

I knew you would have liked it! ;-) (Re: per our discussion over salads!)

> One thing I would like to try, is to make Apache handle the display of
> any 404s, 500s etc. caught by TomCat/Cocoon. (Not switched to Jetty
> yet).

Oh, you won't have to change anything on the Apache front if you switch
container... You won't have to use a different module! :-)

> ie. I would like our production server to have Apache outputting a
> common style of error pages, while keeping Cocoon outputting error
> pages (with stacktraces etc.) only on the staging server (so developers
> can see the cause).
> 
> Obviously I can get Cocoon to output the same style of error page as
> Apache's ones, but I'd prefer to get Cocoon to offload all error
> reporting to Apache if possible.
> 
> Anyone know how to do this?

Ouch... I _knew_ I forgot something last night, well, time to do a
small addition (crap, I _really_ have to learn how to use the Wiki now):

Letting Apache handle error pages
---------------------------------

Whenever we want Apache to handle error messages in a consistent way
(basically overriding what Cocoon writes as the body of error pages), we can
do that by simply adding a few lines to the configuration we used before:

    # Make sure that Apache processes the headers coming back from the proxy
    # requests. This will enable also the evaluation of HTTP status codes.
    ProxyPassReverse / http://localhost:8080/

    # Tell mod_proxy that it should not send back the body-content of
    # error pages, but be fascist and use its local error pages if the
    # remote HTTP stack is sending an HTTP 4xx or 5xx status code.
    ProxyErrorOverride On

    # For each individual error we want to handle, let's specify what file
    # we want to use. Note that all files must be available through a
    # locally accessible directory (as our /static/), and they can even be
    # SSI files (SHTML files).
    ErrorDocument 404 /static/notfound.shtml
    ErrorDocument 500 /static/error.shtml
    ErrorDocument 502 /static/unavailable.shtml

This is how it can be done, so that (for example, as suggested by Jeremy)
one can configure Cocoon to dump full stack traces on the staging server
(or on an interface available only to the internal network), while
displaying nicely formatted error messages to our clients.
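
As a rough sketch of that split, two name-based virtual hosts could differ
only in ProxyErrorOverride. The host names below are made up for
illustration, and the plain ProxyPass lines are an assumption standing in
for whatever proxying rules are already in place:

    NameVirtualHost *:80

    # Production: visitors only ever see Apache's own error pages.
    <VirtualHost *:80>
        ServerName www.example.com
        DocumentRoot /export/webapps/cocoon
        # Keep /static out of the proxy so the error pages are served
        # locally. The exclusion must come before the catch-all line.
        ProxyPass /static !
        ProxyPass / http://localhost:8080/
        ProxyPassReverse / http://localhost:8080/
        ProxyErrorOverride On
        ErrorDocument 404 /static/notfound.shtml
        ErrorDocument 500 /static/error.shtml
        ErrorDocument 502 /static/unavailable.shtml
    </VirtualHost>

    # Staging: Cocoon's own error pages (stack traces included) pass
    # through untouched.
    <VirtualHost *:80>
        ServerName staging.example.com
        ProxyPass / http://localhost:8080/
        ProxyPassReverse / http://localhost:8080/
        ProxyErrorOverride Off
    </VirtualHost>

If staging lives on a separate box, the second block simply becomes that
box's whole proxy configuration.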


Preserving the Host header through a proxy
------------------------------------------

In some cases, it is quite important to preserve the "Host" header
throughout the proxied request.

For example, to be able to deal with multiple virtual hosts on the backend
servlet container, the proxied request MUST include the original Host name
requested by our client. Apache allows us to pass this value through using
the ProxyPreserveHost directive:

    # Make sure that the virtual host name is passed through to the
    # backend servlet container for virtual host support.
    ProxyPreserveHost On
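
To make the effect a bit more concrete, here is a small sketch with two
front-end virtual hosts sharing one backend. The host names are invented,
and it assumes the servlet container has matching virtual hosts of its own:

    NameVirtualHost *:80

    <VirtualHost *:80>
        ServerName www.example.org
        # The backend sees "Host: www.example.org", not "localhost:8080".
        ProxyPreserveHost On
        ProxyPass / http://localhost:8080/
    </VirtualHost>

    <VirtualHost *:80>
        ServerName docs.example.org
        # Same backend, but it now sees "Host: docs.example.org".
        ProxyPreserveHost On
        ProxyPass / http://localhost:8080/
    </VirtualHost>

Without the directive, both hosts would reach the container as
"localhost:8080" and it would have no way to tell them apart.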


Putting it all together
-----------------------

Linking together all the different pieces we've analyzed so far, we can now
attempt to write up a do-it-all fragment of our httpd.conf file:


    #######################################################################
    # GLOBAL CONFIGURATIONS                                               #
    #######################################################################

    # Make sure that my document root points to the root of the web
    # application (where the WEB-INF is located, for instance).
    DocumentRoot /export/webapps/cocoon

    # Make sure that Server Side Includes are processed and sent
    # to the client with mime-type as text/html
    AddType text/html .shtml
    AddOutputFilter Includes .shtml

    # Make sure that our SHTMLs are processed in the static
    # directory
    <Directory "/export/webapps/cocoon">
        Options +IncludesNoExec
    </Directory>

    #######################################################################
    # ERROR PAGES CONFIGURATION                                           #
    #######################################################################

    # If mod_proxy cannot connect to the servlet container, we want
    # to display a nice static page explaining why. This is an
    # SHTML page (using the Server-Side-Includes filter)
    ErrorDocument 502 /static/unavailable.shtml

    # For each individual error we want to handle, let's specify what file
    # we want to use. Note that all files must be available through a
    # locally accessible directory (as our /static/), and they can even be
    # SSI files (SHTML files).
    ErrorDocument 404 /static/notfound.shtml
    ErrorDocument 500 /static/error.shtml

    #######################################################################
    # MOD_PROXY CONFIGURATIONS                                            #
    #######################################################################

    # Make sure that if the servlet container specifies a "Location" HTTP
    # header during redirection starting with "http://localhost:8080/", we
    # can handle it and return to our client the effective (not real)
    # location we want to redirect them to. This is _essential_ to handle
    # also the error returned by the backend servlet container.
    ProxyPassReverse / http://localhost:8080/

    # Make sure that the virtual host name is passed through to the
    # backend servlet container for virtual host support.
    ProxyPreserveHost On

    # Tell mod_proxy that it should not send back the body-content of
    # error pages, but be fascist and use its local error pages if the
    # remote HTTP stack is sending an HTTP 4xx or 5xx status code.
    ProxyErrorOverride On

    #######################################################################
    # MOD_REWRITE CONFIGURATIONS                                          #
    #######################################################################

    # The nastiness begins, let's fire up the "rewrite engine"
    RewriteEngine On

    # Everything that starts with "/static" or "/static/" is served straight
    # through: no redirection, no proxying, no nothing, and the [L] flag
    # implies that if this rule is matched, no other matching must be
    # performed
    RewriteRule "^/static/?(.*)" "$0" [L]

    # Everything that starts with a NON-CASE-SENSITIVE match (the NC flag)
    # of "/WEB-INF" or "/WEB-INF/" is forbidden (the F flag). And again,
    # this is the last rule (the L flag), nothing will be processed by the
    # rewrite engine if this rule is matched
    RewriteRule "^/WEB-INF/?(.*)" "$0" [L,F,NC]

    # Everything ending in ".gif", ".jpg" or ".jpeg" will be served again
    # directly by Apache, no need to bother the servlet container. As above
    # this is the last rule as specified by the [L] flag at the end
    RewriteRule "^/(.*)\.gif$" "$0" [L]
    RewriteRule "^/(.*)\.(jpg|jpeg)$" "$0" [L]

    # Everything else not matched above needs to go to the servlet container
    # via HTTP listening on port 8080. The [P] flag (which is required)
    # implies that our requests will be handled by mod_proxy.
    RewriteRule "^/(.*)" "http://localhost:8080/$1" [P]

And that's all... You can roughly copy and paste this example into a
<VirtualHost> section of your httpd.conf (obviously after having applied the
appropriate modifications), and go...
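
Purely as an illustration of where the fragment goes, the wrapper could look
like the sketch below. The address and host name are placeholders, not
something taken from a real setup:

    # Needed once in Apache 2.0 for name-based virtual hosts.
    NameVirtualHost *:80

    <VirtualHost *:80>
        # Hypothetical host name; use your own.
        ServerName www.domain.tld

        # Paste the GLOBAL, ERROR PAGES, MOD_PROXY and MOD_REWRITE
        # sections from above right here.
    </VirtualHost>

One reason to keep the rewrite rules inside the <VirtualHost> block itself:
by default mod_rewrite settings made in the main server are not inherited by
virtual hosts, so a "RewriteEngine On" sitting outside would not apply.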

    Pier




Re: [TIPS] Basic configurations of Apache 2.0 for Cocoon 2

Posted by Jeremy Quinn <je...@media.demon.co.uk>.
On Wednesday, Feb 5, 2003, at 02:46 Europe/London, Pier Fumagalli wrote:

> After few discussions had face-to-face with some of you (Stefano on the
> phone ranting about setting up Tomcat, Jeremy over lunch at my place 
> few
> weeks ago, and several others), and few odd questions popping out on 
> the
> list, I feel the need to tell you why my vision is so narrow when 
> someone
> touches the "Apache" argument.
>


Brilliant!!!! It is working, hurrah!

One thing I would like to try, is to make Apache handle the display of 
any 404s, 500s etc. caught by TomCat/Cocoon. (Not switched to Jetty 
yet).

ie. I would like our production server to have Apache outputting a 
common style of error pages, while keeping Cocoon outputting error 
pages (with stacktraces etc.) only on the staging server (so developers 
can see the cause).

Obviously I can get Cocoon to output the same style of error page as 
Apache's ones, but I'd prefer to get Cocoon to offload all error 
reporting to Apache if possible.

Anyone know how to do this?


Thanks for your help

regards Jeremy

