Posted to dev@httpd.apache.org by Florent Guillaume <gu...@clipper.ens.fr> on 1995/07/17 02:33:39 UTC
Patches to handle content-language
What follows is a patch for Apache 0.8.0 (and Shambhala) that enables
consistent handling of content-language with MultiViews.
(The new behaviour is much closer to what you can have using CERN's httpd.)
Previously, if you wanted to handle files in several languages, you
were obliged to have a .var file for each, because mod_mime.c didn't
know what Content-Language was and so didn't type on language.
I added a per-directory directive AddLanguage which is very similar to
AddEncoding : it takes a language and a suffix. For example my srm.conf has
AddLanguage fr .fr
AddLanguage en .en
AddLanguage de .de
Mod_mime.c now recognizes filenames of the form
basename.type.lang.encoding, for example chapter1.html.fr.gz is
correctly typed as text/html with language=fr and encoding=x-gzip. The
Content-Language is stored in a new field in request_rec, exactly like
the Content-Encoding.
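As an illustration only, here is a minimal standalone sketch of the right-to-left suffix handling described above: first try the last extension as an encoding, then the next one as a language, then the next one as a type. The lookup tables, struct names and the classify() helper are assumptions made up for this example; the real module uses per-directory tables filled in by AddType, AddLanguage and AddEncoding.

```c
#include <string.h>

/* Hypothetical stand-ins for the tables built from the config file. */
struct suffix_map { const char *ext; const char *value; };

static const struct suffix_map type_map[] = {
    { "html", "text/html" }, { "txt", "text/plain" }, { NULL, NULL }
};
static const struct suffix_map lang_map[] = {
    { "fr", "fr" }, { "en", "en" }, { "de", "de" }, { NULL, NULL }
};
static const struct suffix_map enc_map[] = {
    { "gz", "x-gzip" }, { "Z", "x-compress" }, { NULL, NULL }
};

struct file_info {
    const char *type;
    const char *language;
    const char *encoding;
};

/* Return the value mapped to ext, or NULL if ext is not in the map. */
static const char *lookup(const struct suffix_map *map, const char *ext)
{
    for (; map->ext; ++map)
        if (strcmp(map->ext, ext) == 0)
            return map->value;
    return NULL;
}

/* If the last ".ext" of buf is in map, cut it off and return its value;
 * otherwise leave buf untouched and return NULL. */
static const char *take_suffix(char *buf, const struct suffix_map *map)
{
    char *dot = strrchr(buf, '.');
    const char *value;

    if (!dot)
        return NULL;
    value = lookup(map, dot + 1);
    if (value)
        *dot = '\0';
    return value;
}

/* Classify a name of the form basename.type.lang.encoding; any of the
 * three suffixes may be absent. Names longer than the buffer are truncated. */
struct file_info classify(const char *filename)
{
    struct file_info info = { NULL, NULL, NULL };
    char buf[256];

    strncpy(buf, filename, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    info.encoding = take_suffix(buf, enc_map);
    info.language = take_suffix(buf, lang_map);
    info.type     = take_suffix(buf, type_map);
    return info;
}
```

With these example tables, classify("chapter1.html.fr.gz") yields type text/html, language fr and encoding x-gzip, while classify("index.html") yields only a type.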
Also : when in MultiViews, if you request somefile.html and both
somefile.html.fr and somefile.html.en are available with the same
quality setting, the previous behaviour was to serve whichever was
smallest in size. This made it impossible to have the server give by
default pages in French if the client didn't send an Accept-Language:
header. I changed this behaviour to serve the pages with the priority
given in the config file (the first AddLanguage has highest priority).
I don't think it has any impact on existing applications.
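A rough sketch of that default ordering, under stated assumptions: the config_langs array stands in for the AddLanguage lines in the order they appear, and pick_default_variant() is a hypothetical helper that assumes the lowest non-negative index wins. The patch itself only sets lang_index; how the rest of mod_negotiation.c ranks variants is not shown here.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical: languages in the order of the AddLanguage lines. */
static const char *const config_langs[] = { "fr", "en", "de" };
#define N_CONFIG_LANGS (sizeof config_langs / sizeof config_langs[0])

/* Index of lang in config order, or -1 if lang is NULL or unknown. */
int config_lang_index(const char *lang)
{
    size_t i;

    if (!lang)
        return -1;
    for (i = 0; i < N_CONFIG_LANGS; ++i)
        if (strcasecmp(config_langs[i], lang) == 0)
            return (int)i;
    return -1;
}

/* Given the language of each available variant, return the position of
 * the variant whose language comes first in the config file, or -1 if
 * no variant has a configured language. */
int pick_default_variant(const char *const *variant_langs, size_t n)
{
    int best = -1;
    int best_index = -1;
    size_t i;

    for (i = 0; i < n; ++i) {
        int idx = config_lang_index(variant_langs[i]);
        if (idx >= 0 && (best < 0 || idx < best_index)) {
            best = (int)i;
            best_index = idx;
        }
    }
    return best;
}
```

With the example order fr, en, de, a request that finds somefile.html.en and somefile.html.fr and carries no Accept-Language: header would get the French file.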
I also fixed a bug in find_lang_index in which a NULL string could be
strncmp'ed.
Regards,
Florent Guillaume
*** ../shambhala.orig/httpd.h Wed Jul 12 19:44:53 1995
--- httpd.h Sun Jul 16 21:12:14 1995
***************
*** 274,279 ****
--- 274,280 ----
char *content_type; /* Break these out --- we dispatch on 'em */
char *content_encoding;
+ char *content_language;
int no_cache;
*** ../shambhala.orig/http_config.h Mon Jun 26 00:42:14 1995
--- http_config.h Sun Jul 16 21:12:57 1995
***************
*** 176,182 ****
* (as a SERVER_ERROR, since the module which was
* supposed to handle this was configured wrong).
* type_checker --- Determine MIME type of the requested entity;
! * sets content_type and _encoding fields.
* logger --- log a transaction. Not supported yet out of sheer
* laziness on my part.
*/
--- 176,182 ----
* (as a SERVER_ERROR, since the module which was
* supposed to handle this was configured wrong).
* type_checker --- Determine MIME type of the requested entity;
! * sets content_type, _encoding and _language fields.
* logger --- log a transaction. Not supported yet out of sheer
* laziness on my part.
*/
*** ../shambhala.orig/http_protocol.c Thu Jul 13 02:28:05 1995
--- http_protocol.c Sun Jul 16 22:44:53 1995
***************
*** 499,504 ****
--- 499,507 ----
if (r->content_encoding)
fprintf (fd, "Content-encoding: %s\015\012", r->content_encoding);
+ if (r->content_language)
+ fprintf (fd, "Content-language: %s\015\012", r->content_language);
+
for (i = 0; i < hdrs_arr->nelts; ++i) {
if (!hdrs[i].key) continue;
fprintf (fd, "%s: %s\015\012", hdrs[i].key, hdrs[i].val);
*** ../shambhala.orig/mod_negotiation.c Sat Jul 1 19:46:05 1995
--- mod_negotiation.c Mon Jul 17 01:32:47 1995
***************
*** 132,138 ****
char *type_name;
char *file_name;
char *content_encoding;
! char *lang;
float level; /* Auxiliary to content-type... */
float qs;
float bytes;
--- 132,138 ----
char *type_name;
char *file_name;
char *content_encoding;
! char *content_language;
float level; /* Auxiliary to content-type... */
float qs;
float bytes;
***************
*** 172,178 ****
mime_info->type_name = "";
mime_info->file_name = "";
mime_info->content_encoding = "";
! mime_info->lang = "";
mime_info->is_pseudo_html = 0.0;
mime_info->level = 0.0;
--- 172,178 ----
mime_info->type_name = "";
mime_info->file_name = "";
mime_info->content_encoding = "";
! mime_info->content_language = "";
mime_info->is_pseudo_html = 0.0;
mime_info->level = 0.0;
***************
*** 560,567 ****
mime_info.bytes = atoi(body);
}
else if (!strncmp (buffer, "content-language:", 17)) {
! mime_info.lang = get_token (neg->pool, &body, 0);
! str_tolower (mime_info.lang);
}
else if (!strncmp (buffer, "content-encoding:", 17)) {
mime_info.content_encoding = get_token (neg->pool, &body, 0);
--- 560,567 ----
mime_info.bytes = atoi(body);
}
else if (!strncmp (buffer, "content-language:", 17)) {
! mime_info.content_language = get_token (neg->pool, &body, 0);
! str_tolower (mime_info.content_language);
}
else if (!strncmp (buffer, "content-encoding:", 17)) {
mime_info.content_encoding = get_token (neg->pool, &body, 0);
***************
*** 589,597 ****
int read_types_multi (negotiation_state *neg)
{
request_rec *r = neg->r;
- char *file_name = pstrdup (r->pool, r->filename);
! char *filp = &file_name[strlen(file_name) - 1];
int prefix_len;
DIR *dirp;
struct DIR_TYPE *dir_entry;
--- 589,596 ----
int read_types_multi (negotiation_state *neg)
{
request_rec *r = neg->r;
! char *filp;
int prefix_len;
DIR *dirp;
struct DIR_TYPE *dir_entry;
***************
*** 648,653 ****
--- 647,653 ----
mime_info.sub_req = sub_req;
mime_info.file_name = dir_entry->d_name;
mime_info.content_encoding = sub_req->content_encoding;
+ mime_info.content_language = sub_req->content_language;
get_entry (neg->pool, &accept_info, sub_req->content_type);
set_mime_fields (&mime_info, &accept_info);
***************
*** 759,767 ****
int find_lang_index (array_header *accept_langs, char *lang)
{
! accept_rec *accs = (accept_rec *)accept_langs->elts;
int i;
for (i = 0; i < accept_langs->nelts; ++i)
if (!strncmp (lang, accs[i].type_name, strlen(accs[i].type_name)))
return i;
--- 759,772 ----
int find_lang_index (array_header *accept_langs, char *lang)
{
! accept_rec *accs;
int i;
+ if (!lang)
+ return -1;
+
+ accs = (accept_rec *)accept_langs->elts;
+
for (i = 0; i < accept_langs->nelts; ++i)
if (!strncmp (lang, accs[i].type_name, strlen(accs[i].type_name)))
return i;
***************
*** 777,793 ****
if (neg->accept_langs->nelts == 0) {
! /* Client doesn't care */
for (i = 0; i < neg->avail_vars->nelts; ++i)
! var_recs[i].lang_index = -1;
return;
}
for (i = 0; i < neg->avail_vars->nelts; ++i)
if (var_recs[i].quality > 0) {
! int index = find_lang_index (neg->accept_langs, var_recs[i].lang);
var_recs[i].lang_index = index;
if (index >= 0) found_any = 1;
--- 782,802 ----
if (neg->accept_langs->nelts == 0) {
! /* Client doesn't care : use order of config file */
!
! extern int mime_get_lang_index (request_rec *r, char *lang);
for (i = 0; i < neg->avail_vars->nelts; ++i)
! var_recs[i].lang_index =
! mime_get_lang_index (neg->r, var_recs[i].content_language);
return;
}
for (i = 0; i < neg->avail_vars->nelts; ++i)
if (var_recs[i].quality > 0) {
! int index = find_lang_index (neg->accept_langs,
! var_recs[i].content_language);
var_recs[i].lang_index = index;
if (index >= 0) found_any = 1;
***************
*** 1031,1036 ****
--- 1040,1046 ----
r->filename = sub_req->filename;
r->content_type = sub_req->content_type;
r->content_encoding = sub_req->content_encoding;
+ r->content_language = sub_req->content_language;
r->finfo = sub_req->finfo;
return OK;
*** ../shambhala.orig/mod_mime.c Fri Jun 30 13:54:26 1995
--- mod_mime.c Mon Jul 17 01:32:25 1995
***************
*** 69,74 ****
--- 69,75 ----
typedef struct {
table *forced_types; /* Additional AddTyped stuff */
table *encoding_types; /* Added with AddEncoding... */
+ table *language_types; /* Added with AddLanguage... */
} mime_dir_config;
module mime_module;
***************
*** 80,85 ****
--- 81,87 ----
new->forced_types = make_table (p, 4);
new->encoding_types = make_table (p, 4);
+ new->language_types = make_table (p, 4);
return new;
}
***************
*** 95,100 ****
--- 97,104 ----
base->forced_types);
new->encoding_types = overlay_tables (p, add->encoding_types,
base->encoding_types);
+ new->language_types = overlay_tables (p, add->language_types,
+ base->language_types);
return new;
}
***************
*** 113,118 ****
--- 117,157 ----
return NULL;
}
+ char *add_language(cmd_parms *cmd, mime_dir_config *m, char *lang, char *ext)
+ {
+ if (*ext == '.') ++ext;
+ table_set (m->language_types, ext, lang);
+ return NULL;
+ }
+
+
+ /* This function is called by the negotiation module to know the index
+ * of a given language in the config files.
+ */
+
+ int mime_get_lang_index (request_rec *r, char *lang)
+ {
+ mime_dir_config *conf;
+ int nelts;
+ table_entry *elts;
+ int i;
+
+ if (!lang)
+ return -1;
+
+ conf = (mime_dir_config *)get_module_config(r->per_dir_config, &mime_module);
+ nelts = conf->language_types->nelts;
+ elts = (table_entry *) conf->language_types->elts;
+
+ for (i = 0; i < nelts; ++i)
+ if (!strcasecmp (elts[i].val, lang))
+ return i;
+
+ return -1;
+ }
+
+
+
/* The sole bit of server configuration that the MIME module has is
* the name of its config file, so...
*/
***************
*** 129,134 ****
--- 168,175 ----
"a mime type followed by a file extension" },
{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
"an encoding (e.g., gzip), followed by a file extension" },
+ { "AddLanguage", add_language, NULL, OR_FILEINFO, TAKE2,
+ "a language (e.g., fr), followed by a file extension" },
{ "TypesConfig", set_types_config, NULL, RSRC_CONF, TAKE1,
"the MIME types config file" },
{ NULL }
***************
*** 198,203 ****
--- 239,255 ----
if ((type = table_get (conf->encoding_types, &fn[i])))
{
r->content_encoding = type;
+
+ /* go back to previous extension to try to use it as a language */
+
+ fn[i-1] = '\0';
+ if((i=rind(fn,'.')) < 0) return OK;
+ ++i;
+ }
+
+ if ((type = table_get (conf->language_types, &fn[i])))
+ {
+ r->content_language = type;
/* go back to previous extension to try to use it as a type */
--
Florent.Guillaume@ens.fr
Re: Patches to handle content-language
Posted by Brian Behlendorf <br...@organic.com>.
On Mon, 17 Jul 1995, Florent Guillaume wrote:
> What follows is a patch for Apache 0.8.0 (and Shambhala) that enables
> consistent handling of content-language with MultiViews.
>
> (The new behaviour is much closer to what you can have using CERN's httpd.)
>
> Previously, if you wanted to handle files in several languages, you
> were obliged to have a .var file for each, because mod_mime.c didn't
> know what Content-Language was and so didn't type on language.
>
> I added a per-directory directive AddLanguage which is very similar to
> AddEncoding : it takes a language and a suffix. For example my srm.conf has
>
> AddLanguage fr .fr
> AddLanguage en .en
> AddLanguage de .de
>
> Mod_mime.c now recognizes filenames of the form
> basename.type.lang.encoding, for example chapter1.html.fr.gz is
> correctly typed as text/html with language=fr and encoding=x-gzip. The
> Content-Language is stored in a new field in request_rec, exactly like
> the Content-Encoding.
I really like this, but what resolves name collisions and missing
info between type, lang, and encoding? For example, if I decide to name
all my FrameMaker documents .fr, what happens to document.fr?
document.fr.en? document.fr.fr? If type, lang, and encoding shared the
same namespace, *no* problem. In this case, we're using filename
extensions to indicate meta-information other than content-type, which
I'm certainly comfortable with, but the collision issue should be
resolved somehow.
Also, it would be tremendous if I could have the flexibility to negotiate
on file type and language and encoding by specifying only the meta-info I
want in the filename - in other words, let's say I have documents in all
the possible variations of
basename.[html,txt,pdf].[en,fr,jp].[gz,Z,uu]
Right now with content-negotiation, if I have an index.html and an
index.html3, then I can simply point a resource locator to "index" and
negotiation happens, but I can also defeat negotiation by explicitly
linking to "index.html3" if I wanted to make sure someone got the 3.0
version.
Let's say for the above 27 variants of the document I wanted to
be able to specify which variables are mandatory. If I didn't care at
all which document was fetched, I'd create a link to "basename". If I
wanted specifically the gzip'd French PDF, I'd make a link to
"basename.pdf.fr.gz". Now, let's say I want to make a link to all
French variants explicit, yet let the client/server negotiate on their
own as to encoding and content-type preferences. I'd like to then link
to "basename.fr". Or, I specifically want the uuencoded PDFs, but I
don't care what language: "basename.pdf.uu".
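To make the idea concrete, here is a small sketch of how such partial links could be matched against the files on disk. It is purely illustrative: variant_matches() and has_ext() are hypothetical names, the matching is done on the extension strings alone (so it does nothing about the collision problem mentioned earlier), and it assumes the basename itself contains no dots.

```c
#include <string.h>

/* Return 1 if the extension ext (len bytes, no leading dot) appears as
 * one of the dot-separated components of exts, e.g. ".pdf.fr.gz". */
static int has_ext(const char *exts, const char *ext, size_t len)
{
    while (exts && *exts == '.') {
        const char *start = exts + 1;
        const char *end = strchr(start, '.');
        size_t n = end ? (size_t)(end - start) : strlen(start);

        if (n == len && strncmp(start, ext, len) == 0)
            return 1;
        exts = end;
    }
    return 0;
}

/* Return 1 if variant is a candidate for link: same basename, and every
 * extension named in the link is also present on the variant. Whatever
 * the link leaves out stays open for negotiation. */
int variant_matches(const char *link, const char *variant)
{
    const char *ldot = strchr(link, '.');
    const char *vdot = strchr(variant, '.');
    size_t lbase = ldot ? (size_t)(ldot - link) : strlen(link);
    size_t vbase = vdot ? (size_t)(vdot - variant) : strlen(variant);

    if (lbase != vbase || strncmp(link, variant, lbase) != 0)
        return 0;

    while (ldot) {
        const char *start = ldot + 1;
        const char *end = strchr(start, '.');
        size_t n = end ? (size_t)(end - start) : strlen(start);

        if (!has_ext(vdot, start, n))
            return 0;
        ldot = end;
    }
    return 1;
}
```

Under these assumptions a link to "basename.fr" keeps basename.pdf.fr.gz and basename.html.fr.Z as candidates and drops basename.html.en, and the server would then negotiate type and encoding among whatever is left.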
Thoughts? If we ensure there are no namespace collisions between MIME
type extensions, language extensions, and encoding extensions, then this
is easy. If not....
Brian
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
brian@organic.com brian@hyperreal.com http://www.[hyperreal,organic].com/