You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@httpd.apache.org by Rob Hartill <ro...@imdb.com> on 1996/04/26 16:04:27 UTC

ScriptAlias-like extensions for perl etc

All my CGI is written in perl. As an experiment I changed
mod_cgi.c to call the perl binary directly instead of leaving
it to sh (I think that's what gets called to determine that perl
needs to be called because of the "#!" first line.)

I guess there must be an existing directive that could be reused to
mark script extensions or entire directories to be passed to a binary
instead of to sh. The change to mod_cgi.c is trivial..

  execle(r->filename, argv0, NULL, env);
becomes
  execle("/usr/local/bin/perl", argv0, r->filename, NULL, env);


Speedups are not at all dramatic (running perl is a bigger overhead), but
I can detect speedup by hammering on a CGI script and then looking at
mod_status to compare the numbers.

What'd be nice would be things like,

CGIBinary /usr/local/bin/perl pl
DefaultBinary /usr/local/bin/perl

or something along those lines.


thoughts?

rob

Re: ScriptAlias-like extensions for perl etc

Posted by "James H. Cloos Jr." <cl...@jhcloos.com>.
Tom> I don't have any thoughts on the specific change you suggest.
Tom> I'm just curious to know if you tried using mod_perl.  I wonder
Tom> what the speed improvement there would be like.

Speaking of which, has anyone been able to get it to compile & link
against against a 1.1dev?

I'll include below a patch that is as far as I got, before I was
distracted by other, non Apache, problems....

Tom> If someone can tell me where to get actually get mod_perl, I'd
Tom> appreciate it.  Else I'll download the archives and search
Tom> through them for the announcement.

>From the list, <URL:http://www.sn.no/~aas/mod_perl.patch.gz>

In addition to this diff, you need to add the Module line to Configuration.
(I removed as much local baggage from the patch as I could w/o changing any
@@ sections.  The changes I made were mostly limited to paths.
------------------------------------------------------------
diff -urpP src-CVS/Configuration src/Configuration
--- src-CVS/Configuration	Tue Apr  9 20:00:16 1996
+++ src/Configuration	Fri Apr 12 11:48:55 1996
@@ -54,15 +54,15 @@ CC= gcc
 #  defaults in.  Note that this config file does not include DBM auth by
 #  default --- configure it in below if you need it].
 
-CFLAGS= -O2
+CFLAGS= -O6 -DXBITHACK -DSTATUS -I/usr/lib/perl5/i586-linux/5.002/CORE
 
 # Place here any flags you may need upon linking, such as a flag to
 # prevent dynamic linking (if desired)
-LFLAGS= 
+LFLAGS=-L/usr/lib/perl5/i586-linux/5.002/CORE 
 
 # Place here any extra libraries you may need to link to. 
 # -lndbm is commonly required for DBM auth, if that is configured in.
-EXTRA_LIBS=
+EXTRA_LIBS=-lperl -ldb -lgdbm -lm
 
 # AUX_CFLAGS are system-specific control flags.
 # NOTE: IF YOU DO NOT CHOOSE ONE OF THESE, EDIT httpd.h AND CHOOSE
diff -urpP src-CVS/Makefile.tmpl src/Makefile.tmpl
--- src-CVS/Makefile.tmpl	Sat Mar 30 14:00:13 1996
+++ src/Makefile.tmpl	Fri Apr 12 10:46:35 1996
@@ -7,6 +7,9 @@ OBJS= alloc.o http_main.o http_core.o ht
   http_log.o http_protocol.o rfc1413.o util.o util_script.o modules.o buff.o\
   md5c.o util_md5.o explain.o $(MODULES)
 
+PERL=/usr/bin/perl
+PERLLIB=/usr/lib/perl5
+
 .c.o:
 	$(CC) -c $(CFLAGS) $(AUX_CFLAGS) $<
 
@@ -26,6 +29,15 @@ httpd: $(OBJS)
 
 clean:
 	rm -f httpd $(OBJS) *pure*
+
+mod_perl.o: mod_perl.c
+	$(CC) -I$(PERLLIB)/i586-linux/5.002/CORE -c $(CFLAGS) $(AUX_CFLAGS) $<
+
+perl_glue.o: perl_glue.c
+	$(CC) -I$(PERLLIB)/i586-linux/5.002/CORE -c $(CFLAGS) $(AUX_CFLAGS) $<
+
+perl_glue.c: perl_glue.xs
+	$(PERL) $(PERLLIB)/ExtUtils/xsubpp -typemap $(PERLLIB)/ExtUtils/typemap perl_glue.xs >perl_glue.c
 
 dist.tar: 
 	# Assure a semi-sensible configuration going out...
diff -urpP src-CVS/conf.h src/conf.h
--- src-CVS/conf.h	Fri Apr  5 08:00:09 1996
+++ src/conf.h	Fri Apr 12 10:22:47 1996
@@ -51,6 +51,9 @@
  *
  */
 
+#define die   apache_die
+#define usage apache_usage
+
 
 /*
  * conf.h: system-dependant #defines and includes...
diff -urpP src-CVS/mod_perl.c src/mod_perl.c
--- src-CVS/mod_perl.c	Wed Dec 31 18:00:00 1969
+++ src/mod_perl.c	Fri Apr 12 10:10:32 1996
@@ -0,0 +1,212 @@
+/* ====================================================================
+ * Copyright (c) 1995 The Apache Group.  All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice, this list of conditions and the following disclaimer. 
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice, this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * 3. All advertising materials mentioning features or use of this
+ *    software must display the following acknowledgment:
+ *    "This product includes software developed by the Apache Group
+ *    for use in the Apache HTTP server project (http://www.apache.org/)."
+ *
+ * 4. The names "Apache Server" and "Apache Group" must not be used to
+ *    endorse or promote products derived from this software without
+ *    prior written permission.
+ *
+ * 5. Redistributions of any form whatsoever must retain the following
+ *    acknowledgment:
+ *    "This product includes software developed by the Apache Group
+ *    for use in the Apache HTTP server project (http://www.apache.org/)."
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE APACHE GROUP ``AS IS'' AND ANY
+ * EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE APACHE GROUP OR
+ * ITS CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
+ * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
+ * LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
+ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
+ * OF THE POSSIBILITY OF SUCH DAMAGE.
+ * ====================================================================
+ *
+ * This software consists of voluntary contributions made by many
+ * individuals on behalf of the Apache Group and was originally based
+ * on public domain software written at the National Center for
+ * Supercomputing Applications, University of Illinois, Urbana-Champaign.
+ * For more information on the Apache Group and the Apache HTTP server
+ * project, please see <http://www.apache.org/>.
+ *
+ */
+
+/* This module embeds a perl interpreter within the Apache httpd. Files
+ * classfied as "httpd/perl" are interpreted as a perl script by the
+ * server.  The apache C API is directly available to the perl script
+ * as through the perl_glue.xs routines.
+ *
+ * This should be much faster than what you can achieved with CGI
+ * scripts and you also has more direct contol over the connection
+ * back to the client.
+ */
+
+#include <EXTERN.h>
+#include <perl.h>
+
+#include "httpd.h"
+#include "http_config.h"
+#include "http_protocol.h"
+#include "http_log.h"
+#include "http_main.h"
+
+void boot_Apache _((CV* cv));
+
+static void xs_init()
+{
+  newXS("Apache::bootstrap", boot_Apache, __FILE__);
+}
+
+
+int perl_handler(request_rec *r)
+{
+  char *argv[] = { "", r->filename, NULL };
+  int status;
+
+  PerlInterpreter *perl = perl_alloc();
+
+
+  /* If the method is POST and the Content-Type is 
+   * application/x-www-form-urlencoded, then we read the data here
+   * so that the perl script does not have to do it.
+   */
+
+  if (r->method_number == M_POST) {
+    char *ct = table_get(r->headers_in, "Content-Type");
+    if (ct && strEQ(ct, "application/x-www-form-urlencoded")) {
+      char *lenp = table_get(r->headers_in, "Content-Length");
+      long len = lenp ? atoi(lenp) : 0;
+      if (len) {
+	/* We read the data */
+	char *content = (char*)palloc(r->pool, len+1);
+	long n = read_client_block(r, content, len);
+	if (n != len) {
+	  log_reason("Can't read request form content", r->filename, r);
+	  return BAD_REQUEST;
+	}
+	content[len] = '\0';
+	r->args = content;
+	/* Make this hint to the script so it does not try to read also */
+	table_set(r->headers_in, "Content-Length", "0");
+      }
+    }
+  }
+
+  perl_construct(perl);
+  perl_parse(perl, xs_init, 2, argv, NULL);
+
+  /* Make a pointer to the request structure available as $req */
+  sv_setref_pv(perl_get_sv("req", TRUE), "request_recPtr", (void*)r);
+
+  /* Make basic information available as perl variables */
+  sv_setpv(perl_get_sv("method",      TRUE), r->method);
+  sv_setpv(perl_get_sv("protocol",    TRUE), r->protocol);
+  sv_setpv(perl_get_sv("uri",         TRUE), r->uri);
+  sv_setpv(perl_get_sv("path_into",   TRUE), r->path_info);
+  sv_setpv(perl_get_sv("args",        TRUE), r->args);
+
+  /* Parse the r->args a form if the string contains an unencoded '=' */
+  if (r->args && strchr(r->args, '=')) {
+    /* This parsing destroys the value of the r->args string, but since
+     * we don't need it any more we don't bother with a pstrdup().
+     */
+    AV *av = perl_get_av("args", TRUE);
+    char *a = r->args;
+    char *end = a;
+    char *k, *v;
+    while (*end) {
+      a = end;
+      /* find next '&' character */
+      while (*end && *end != '&')
+	end++;
+
+      if (*end)
+	*end++ = '\0';
+
+      /* split on '=' */
+      k = a;
+      v = a;
+      while (*v && *v != '=')
+	v++;
+      if (*v)
+	*v++ = '\0';
+
+      /* Then we unescape the 'keyword' and the 'value'. */
+      unescape_url(k);
+      unescape_url(v);
+
+      /* XXX: An unescaped %00 might have terminated the string before
+       * we wanted, but there is not easy way to obtain the real unescaped
+       * string length so we ignore this problem for now.
+       */
+      av_push(av, newSVpv(k, 0));
+      av_push(av, newSVpv(v, 0));
+    }
+  }
+
+  /* Make the r->headers_in available as %headers_in */
+  {
+    array_header *hdrs_arr = table_elts (r->headers_in);
+    table_entry *hdrs = (table_entry *)hdrs_arr->elts;
+    int i;
+    HV *in = perl_get_hv("headers_in", TRUE);
+    for (i = 0; i < hdrs_arr->nelts; ++i) {
+      char *key = hdrs[i].key;
+      if (!key) continue;
+      hv_store(in, key, strlen(key), newSVpv(hdrs[i].val, 0), 0);
+    }
+  }
+
+  perl_run(perl);
+
+  status = statusvalue;
+  if (status == 65535)  /* this is what we get by exit(-1) in perl */
+    status = DECLINED;
+
+  perl_destruct(perl);
+  perl_free(perl);
+
+  return status;
+}
+
+handler_rec perl_handlers[] = {
+{ "httpd/perl", perl_handler },
+{ NULL }
+};
+
+module perl_module = {
+   STANDARD_MODULE_STUFF,
+   NULL,			/* initializer */
+   NULL,			/* create per-directory config structure */
+   NULL,			/* merge per-directory config structures */
+   NULL,			/* create per-server config structure */
+   NULL,			/* merge per-server config structures */
+   NULL,			/* command table */
+   perl_handlers,		/* handlers */
+   NULL,			/* translate_handler */
+   NULL,			/* check_user_id */
+   NULL,			/* check auth */
+   NULL,			/* check access */
+   NULL,			/* type_checker */
+   NULL,			/* pre-run fixups */
+   NULL				/* logger */
+};
diff -urpP src-CVS/perl_glue.xs src/perl_glue.xs
--- src-CVS/perl_glue.xs	Wed Dec 31 18:00:00 1969
+++ src/perl_glue.xs	Fri Apr 12 10:10:32 1996
@@ -0,0 +1,129 @@
+#ifdef __cplusplus
+extern "C" {
+#endif
+#include "EXTERN.h"
+#include "perl.h"
+#include "XSUB.h"
+#ifdef __cplusplus
+}
+#endif
+
+#include "httpd.h"
+#include "http_config.h"
+#include "http_protocol.h"
+#include "http_log.h"
+#include "http_main.h"
+
+MODULE = Apache  PACKAGE = Apache
+
+PROTOTYPES: ENABLE
+
+void
+send_http_header(req)
+	request_rec *req
+
+
+# Beware that we have changes the order of the arguments for this
+# function.
+
+int
+send_fd(req, f)
+	request_rec *req
+	FILE *f
+	CODE:
+	RETVAL = send_fd(f, req);
+
+void
+set_content_type(r, type)
+	request_rec *r
+	char *type
+	CODE:
+	r->content_type = pstrdup(r->pool, type);
+
+void
+set_content_encoding(r, encoding)
+	request_rec *r
+	char *encoding
+	CODE:
+	r->content_encoding = pstrdup(r->pool, encoding);
+
+void
+set_content_language(r, lang)
+	request_rec *r
+	char *lang
+	CODE:
+	r->content_language = pstrdup(r->pool, lang);
+
+void
+set_status(r, status)
+	request_rec *r
+	int status
+	CODE:
+	r->status = status;
+
+void
+set_status_line(r, line)
+	request_rec *r
+	char *line
+	CODE:
+	r->status_line = pstrdup(r->pool, line);
+
+void
+set_header_out(r, key, val)
+	request_rec *r
+	char *key
+	char *val
+	CODE:
+	table_set(r->headers_out, key, val);
+
+void
+set_err_headers_out(r, key, val)
+	request_rec *r
+	char *key
+	char *val
+	CODE:
+	table_set(r->err_headers_out, key, val);
+
+void
+set_no_cache(r, val)
+	request_rec *r
+	int val
+	CODE:
+	r->no_cache = val;
+	
+long
+read_client_block(r, buffer, bufsiz)
+	request_rec *r
+	char	    *buffer
+	int 	     bufsiz
+	OUTPUT:
+	buffer
+
+long
+write_client_block(r, buffer)
+	request_rec* r
+	char *       buffer = NO_INIT
+	PREINIT:
+	int size;
+	CODE:
+	buffer = (char *)SvPV(ST(1), size);
+	RETVAL = fwrite(buffer, sizeof(char), size, r->connection->client);
+
+
+# Beware, we have changed the order of the arguments for the log_reason()
+# funtion.
+
+void
+log_reason(r, reason, filename)
+	request_rec*	r
+	char *	reason
+	char *	filename
+	CODE:
+	log_reason(reason, filename, r);
+
+void
+log_error(r, mess)
+	request_rec*	r
+	char *		mess
+	CODE:
+	log_error(mess, r->server);
diff -urpP src-CVS/typemap src/typemap
--- src-CVS/typemap	Wed Dec 31 18:00:00 1969
+++ src/typemap	Fri Apr 12 10:10:32 1996
@@ -0,0 +1,2 @@
+TYPEMAP
+request_rec *	T_PTROBJ
------------------------------------------------------------

-JimC
-- 
James H. Cloos, Jr.	<URL:http://www.jhcloos.com/~cloos/>
cloos@jhcloos.com	Work: cloos@io.com
LPF,Usenix,SAGE,ISOC,ACLU

Re: ScriptAlias-like extensions for perl etc

Posted by Tom Tromey <tr...@creche.cygnus.com>.
Rob> All my CGI is in perl :-) and the scripts are pretty big
Rob> 1000-2000 lines each. One thing I already do is pre-process the
Rob> "requires" lines of my scripts so that the stuff in the script
Rob> dirs already has the require'd modules embedded in the main
Rob> file... saves me ~10 file open/reads per CGI call.

Another thing to research is the Perl compiler, which should be going
into alpha pretty soon.  Right now it generates C, but I think there
are plans to have it generate bytecodes as well.  The latter, in
conjunction with a perl-cgi module, would probably be a big win.

The former could make it possible to write modules in Perl.  That
might also be a win.

On a related note, I'd like to see mod_include extended so that the
list of recognized keywords is dynamically extensible.  That way I
could write a mod_include_tcl (or mod_include_perl) that would allow
me to embed Tcl (or Perl, or whatever) scripts into my html documents.
In conjunction with a dynamic loading capability (something like
mod_dld), this could be very useful.

Tom
-- 
tromey@cygnus.com                 Member, League for Programming Freedom

Re: ScriptAlias-like extensions for perl etc

Posted by Tom Tromey <tr...@creche.cygnus.com>.
Rob> CGIBinary /usr/local/bin/perl pl
Rob> DefaultBinary /usr/local/bin/perl

Rob> or something along those lines.

Rob> thoughts?

I don't have any thoughts on the specific change you suggest.  I'm
just curious to know if you tried using mod_perl.  I wonder what the
speed improvement there would be like.

I realize this isn't a generic solution.  But if you do write a lot of
CGIs in Perl, it might be a pretty good solution.

If someone can tell me where to get actually get mod_perl, I'd
appreciate it.  Else I'll download the archives and search through
them for the announcement.

Tom
-- 
tromey@cygnus.com                 Member, League for Programming Freedom

Re: ScriptAlias-like extensions for perl etc

Posted by Tony Sanders <sa...@bsdi.com>.
Tom Tromey writes:
> Tony> That's a negative.  exec() [in the kernel that is] looks at the
> Tony> file for the magic number ``#!'' and if present parses the line
> Tony> for the binary to actually exec.  This is true on all Unix
> Tony> systems.
> 
> Some older systems don't recognize #!.
Yes, that's true.  #! wasn't part of original Unix but neither
were pipes or virtual memory or networking.  It should be
a pretty safe assumption to make these days.

Test it on your system by inserting a trogan /bin/sh.

There are certainly no BSD-derived systems that hacked /bin/sh to
support this.  I'm a little less certain about SYSV systems.
I believe there was some in olden-times (circa 83) but they
are almost surely all gone by now.

Re: ScriptAlias-like extensions for perl etc

Posted by Tom Tromey <tr...@creche.cygnus.com>.
Tony> That's a negative.  exec() [in the kernel that is] looks at the
Tony> file for the magic number ``#!'' and if present parses the line
Tony> for the binary to actually exec.  This is true on all Unix
Tony> systems.

Some older systems don't recognize #!.

I don't honestly know if any such systems are actually running today.
The paranoid still go out of their way to deal with them, however.

Cf the Perl Configure script for some perspective on this issue...

In any case, I agree... changing mod_cgi to directly exec Perl
shouldn't really result in any appreciable speedup.  At least, that
result is suprising to me.

There are much bigger wins to be had when using Perl anyway: mod_perl
(as already discussed), the Perl compiler (currently pre-alpha and
probably useless today), or the dump/unexec trick (avoids parsing
costs, possibly significant).

Tom
-- 
tromey@cygnus.com                 Member, League for Programming Freedom

Re: ScriptAlias-like extensions for perl etc

Posted by Tony Sanders <sa...@bsdi.com>.
Rob Hartill writes:
> All my CGI is written in perl. As an experiment I changed
> mod_cgi.c to call the perl binary directly instead of leaving
> it to sh (I think that's what gets called to determine that perl
> needs to be called because of the "#!" first line.)
That's a negative.  exec() [in the kernel that is] looks at the
file for the magic number ``#!'' and if present parses the line
for the binary to actually exec.  This is true on all Unix systems.

> Speedups are not at all dramatic (running perl is a bigger overhead), but
> I can detect speedup by hammering on a CGI script and then looking at
> mod_status to compare the numbers.
You would need to do some profiling to find out what was going on
but your change would not affect real performance in any measureable way.

> What'd be nice would be things like,
> 
> CGIBinary /usr/local/bin/perl pl
> DefaultBinary /usr/local/bin/perl

Please don't :-)