Posted to dev@httpd.apache.org by Chris Knight <Ch...@nasa.gov> on 2003/05/30 02:15:13 UTC
automatic type identification (by extension) in virtual DAV repositories
So, Catacomb and other mod_dav backends that handle GET requests would
like to have the Content-Type of resources be automatically identified
based on the path information of that resource in the same manner as
file resources.
How would I best approach this problem? It appears that the mod_dav.c
hooks are defined as MIDDLE, as are the mod_mime.c hooks. Should I force
a specific ordering of execution? The only other problem is that find_ct
in mod_mime.c wants to see the finfo structure, which doesn't apply to
virtual resources.
I can't call find_ct directly, nor can I get at the mime_dir_config
information to map the extension back to a Content-Type. Ideas?
It seems to me that the mime_dir_config information should be made
public, or at least that there should be some methods for converting an
extension to a Content-Type (and vice versa?).
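A minimal sketch, in C, of the kind of extension-to-Content-Type helper
being asked for. This is not mod_mime's code: the table, the struct, and
the function names (ext_table, ext_equal, content_type_for_path) are
hypothetical, and a real module would take its mappings from the server
configuration rather than from a hard-coded array.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mapping table, for illustration only. */
struct ext_map {
    const char *ext;
    const char *content_type;
};

static const struct ext_map ext_table[] = {
    { "html", "text/html"  },
    { "txt",  "text/plain" },
    { "png",  "image/png"  },
    { "xml",  "text/xml"   },
};

/* Case-insensitive comparison of two extensions; returns 1 if equal. */
static int ext_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == '\0' && *b == '\0';
}

/* Hypothetical helper: return the Content-Type for the last extension of
 * the final path segment, or NULL when there is no extension or no
 * mapping for it. Only the path string is examined, so this works for
 * resources that have no file on disk. */
const char *content_type_for_path(const char *path)
{
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;
    const char *dot = strrchr(base, '.');
    size_t i;

    if (dot == NULL || dot == base || dot[1] == '\0') {
        return NULL; /* no extension, a dotfile, or a trailing dot */
    }
    for (i = 0; i < sizeof(ext_table) / sizeof(ext_table[0]); i++) {
        if (ext_equal(dot + 1, ext_table[i].ext)) {
            return ext_table[i].content_type;
        }
    }
    return NULL;
}
```

With this table, content_type_for_path("/repos/docs/index.html") yields
"text/html", and a path without an extension yields NULL, leaving the
caller to decide whether to fall back to a default or send no type.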
Re: automatic type identification (by extension) in virtual DAV repositories
Posted by Chris Knight <Ch...@nasa.gov>.
André Malo wrote:
>* Chris Knight wrote:
>
>>So, Catacomb and other mod_dav backends that handle GET requests would
>>like to have the Content-Type of resources be automatically identified
>>based on the path information of that resource in the same manner as
>>file resources.
>>
>>How would I best approach this problem? It appears that the mod_dav.c
>>hooks are defined as MIDDLE, as are the mod_mime.c hooks. Should I force
>>a specific ordering of execution? The only other problem is that find_ct
>>in mod_mime.c wants to see the finfo structure, which doesn't apply to
>>virtual resources.
>
>ModMimeUsePathInfo was created for this purpose IIRC.
>
>HTH, nd
>
>
But it does not function in this circumstance. I believe
ModMimeUsePathInfo turns on calls to find_ct, but find_ct relies on
filename, which only goes down to the <Location> (the root of the
Catacomb server). I could hack the filename if I can get to it before
find_ct is called (as detailed in my second paragraph), but that's
certainly a poor solution.
Re: automatic type identification (by extension) in virtual DAV repositories
Posted by Chris Knight <Ch...@nasa.gov>.
Justin Erenkrantz wrote:
> --On Friday, May 30, 2003 2:34 AM +0200 André Malo <nd...@perlig.de> wrote:
>
>> ModMimeUsePathInfo was created for this purpose IIRC.
>
>
> Indeed. The only caveat is that ModMimeUsePathInfo shouldn't
> necessarily be enabled on resources that you edit. The problem comes
> into play when filters are applied and the transmitted representation
> differs from the on-disk (in-storage) resource. (This is actually a
> major collision with REST that WebDAV never quite solved cleanly - it
> just assumes that the resource == representation.)
Note that Content-Type identification would only come into play when the
Content-Type was not explicitly identified by the original PUT (as
happens with Win32's Web Folders client, the Cadaver client, and
surely many others). Would it be better, instead, to not send a
Content-Type header at all? (It appears that find_ct defaults to
text/plain if it's unable to determine the type by extension; should it
instead do nothing?)
> The best example is with mod_include'd files. With ModMimeUsePathInfo
> On, there is no way to disable the output filters when communicating
> with a DAV client. Therefore, the solution is to mount the repository
> twice - once as read-only with ModMimeUsePathInfo On and another one
> with it Off so that the DAV clients can modify without having the
> filters executed and get the 'raw' representation.
>
> Hope that makes sense. If not, I can try to clarify. -- justin
Already addressed by the WebDAV spec, see:
http://asg.web.cmu.edu/rfc/rfc2518.html#sec-5.4
And, of course, this issue is peripheral to Content-Type identification.
Re: fixed -- mod_mime.c#find_ct() redux
Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Thursday, June 5, 2003 10:21 AM -0700 Chris Knight
<Ch...@nasa.gov> wrote:
> I think because content_type was already defined, it skipped calling
> ap_run_type_checker?
I'm fairly sure it is run all the time, but I'd have to look to be sure.
> This is a major issue with non-fs backends...One option would be to re-do
> the filter to use a temporary file to feed to PHP (a gross hack, I know.)
> Thanks for the details/history...
And, that's an option that was suggested and discarded. =)
> Good news is I now have a much firmer grasp of the internal workings of
> Apache...
>
> Is there an active project to write/improve the documentation of these
> internal workings?
Just submit patches to dev@httpd. There's some stuff in
docs/manual/developer/ that might be a good starting point. -- justin
fixed -- mod_mime.c#find_ct() redux
Posted by Chris Knight <Ch...@nasa.gov>.
Justin Erenkrantz wrote:
> It looks like Catacomb (0.8.0 is what I just downloaded) is doing the
> same thing in its dav_repos_set_headers as mod_dav_svn. That hook
> runs after the fixups hooks, so it just trounces on the content-type
> that mod_mime tried to set with ModMimeUsePathInfo.
Ah, good call. I've fixed Catacomb to not do this anymore unless the
RDBMS has Content-Type defined.
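The change described above amounts to a guard like the following. This
is a sketch only: struct fake_request and set_stored_content_type are
hypothetical stand-ins, not Catacomb's or httpd's actual types or
functions.

```c
#include <stddef.h>

/* Hypothetical stand-in for the single request field this sketch needs;
 * httpd's real request_rec has many more members. */
struct fake_request {
    const char *content_type; /* as left by earlier hooks; may be NULL */
};

/* Hypothetical helper: override the Content-Type only when the
 * repository actually stored a non-empty one for the resource. */
void set_stored_content_type(struct fake_request *r, const char *stored_type)
{
    if (stored_type != NULL && stored_type[0] != '\0') {
        r->content_type = stored_type;
    }
    /* Otherwise keep whatever an earlier hook (for example mod_mime)
     * already decided. */
}
```

The point of the guard is ordering: a hook that runs late and assigns
unconditionally will always win, so it has to stay out of the way when
it has nothing better to offer.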
>> In doing some debugging, with mod_dav_fs (with DEBUG_GET_HANDLER
>> defined in
>> repos.c) I've found that find_ct is not called to identify the type. I'm
>
> Well, here, find_ct is called in the fixups stage. So, I'm not sure
> what's going on for you. Some more specifics could be helpful.
I think because content_type was already defined, it skipped calling
ap_run_type_checker?
>> assuming the inability to run PHP scripts from a DAV server is the same
>> problem. Bug or feature?
>
> Note that since PHP 4.2.3, which (I believe) introduced the PHP
> handler for httpd-2.0 and removed the PHP filter, it's not possible to
> do this chaining. Implementing PHP as a filter got to be a nightmare,
> and we eventually gave up and did it as a handler instead. One of the
> drawbacks is that PHP can't work off of virtual repositories or
> anything that has its own handler now.
>
> (PHP requires file-backed input into its parser which kills it for
> Subversion and httpd-2.0. I believe there may be some work to allow
> PHP to take in push-based streams like Apache httpd output filters can
> deliver - when that is ready, we can reexamine PHP as filter again and
> determine what needs to happen on httpd's side to make it all happy -
> there is definitely some work that has to be done to httpd as well.)
This is a major issue with non-fs backends...One option would be to
re-do the filter to use a temporary file to feed to PHP (a gross hack, I
know.) Thanks for the details/history...
>> Also, I would highly recommend that testing of Apache include testing
>> the
>> GET handler in mod_dav_fs. There are other backends (Catacomb, ?)
>> that use
>> mod_dav as their front-end and who handle GETs themselves and we're
>> running
>> into these issues.
>
>
> httpd-test's perl-framework already has some tests for WebDAV that
> do exactly this. Feel free to help us expand our tests over on
> test-dev@httpd.apache.org. -- justin
Strangely, I am not seeing the problem with mod_dav_fs anymore...Perhaps
I was hallucinating...I guess it "works for me". :^o
Good news is I now have a much firmer grasp of the internal workings of
Apache...
Is there an active project to write/improve the documentation of these
internal workings?
Re: mod_mime.c#find_ct() redux (Re: automatic type identification (by
extension) in virtual DAV repositories)
Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Wednesday, June 4, 2003 11:33 AM -0700 Chris Knight
<Ch...@nasa.gov> wrote:
> First, apologies for being dense with regard to ModMimeUsePathInfo; the
> issue regarding Content-Type auto-identification has nothing to do with this
> directive.
Um, that is exactly what ModMimeUsePathInfo does. But, let me clarify a few
things: this applies only to the Content-Type sent in the response headers -
not to the request (i.e. a PUT without a Content-Type) - and this isn't
mod_mime_magic, which looks at the content to determine the content-type.
mod_dav_fs is a special case because it goes down to the file level rather
than being at the virtual location layer (r->filename is correct on its own).
So, it doesn't need ModMimeUsePathInfo.
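To illustrate the distinction, here is a sketch of how the name used for
the extension lookup could be built. It assumes, without checking the
mod_mime source here, that ModMimeUsePathInfo On causes the request's
path_info to be appended to its filename before extensions are examined.
The function mime_lookup_name and its parameters are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build the name whose extensions would be examined.
 * With use_path_info off, only `filename` is used; with it on, the
 * trailing `path_info` is appended. Returns a malloc'd string that the
 * caller must free, or NULL if allocation fails. */
char *mime_lookup_name(const char *filename, const char *path_info,
                       int use_path_info)
{
    size_t flen = strlen(filename);
    size_t plen = (use_path_info && path_info != NULL) ? strlen(path_info) : 0;
    char *name = malloc(flen + plen + 1);

    if (name == NULL) {
        return NULL;
    }
    memcpy(name, filename, flen);
    if (plen > 0) {
        memcpy(name + flen, path_info, plen);
    }
    name[flen + plen] = '\0';
    return name;
}
```

For a virtual repository whose filename stops at "/catacomb" and whose
path_info is "/docs/report.html", the flag decides whether the lookup
sees "/catacomb" (no extension) or "/catacomb/docs/report.html". A
file-backed provider whose filename is already the full path needs no
such help.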
ModMimeUsePathInfo doesn't work with mod_dav_svn right now, but that's because
mod_dav_svn unconditionally sets the MIME content-type internally (and uses
the wrong API to boot). I should probably fix that, but that's
neither here nor there for this list.
It looks like Catacomb (0.8.0 is what I just downloaded) is doing the same
thing in its dav_repos_set_headers as mod_dav_svn. That hook runs after the
fixups hooks, so it just trounces on the content-type that mod_mime tried to
set with ModMimeUsePathInfo.
> In doing some debugging, with mod_dav_fs (with DEBUG_GET_HANDLER defined in
> repos.c) I've found that find_ct is not called to identify the type. I'm
Well, here, find_ct is called in the fixups stage. So, I'm not sure what's
going on for you. Some more specifics could be helpful.
> assuming the inability to run PHP scripts from a DAV server is the same
> problem. Bug or feature?
Note that since PHP 4.2.3, which (I believe) introduced the PHP handler for
httpd-2.0 and removed the PHP filter, it's not possible to do this chaining.
Implementing PHP as a filter got to be a nightmare, and we eventually gave up
and did it as a handler instead. One of the drawbacks is that PHP can't work
off of virtual repositories or anything that has its own handler now.
(PHP requires file-backed input into its parser which kills it for Subversion
and httpd-2.0. I believe there may be some work to allow PHP to take in
push-based streams like Apache httpd output filters can deliver - when that is
ready, we can reexamine PHP as filter again and determine what needs to happen
on httpd's side to make it all happy - there is definitely some work that has
to be done to httpd as well.)
> Also, I would highly recommend that testing of Apache include testing the
> GET handler in mod_dav_fs. There are other backends (Catacomb, ?) that use
> mod_dav as their front-end and who handle GETs themselves and we're running
> into these issues.
httpd-test's perl-framework already has some tests for WebDAV that do
exactly this. Feel free to help us expand our tests over on
test-dev@httpd.apache.org. -- justin
mod_mime.c#find_ct() redux (Re: automatic type identification (by
extension) in virtual DAV repositories)
Posted by Chris Knight <Ch...@nasa.gov>.
First, apologies for being dense with regard to ModMimeUsePathInfo; the
issue regarding Content-Type auto-identification has nothing to do with
this directive.
In doing some debugging, with mod_dav_fs (with DEBUG_GET_HANDLER defined
in repos.c) I've found that find_ct is not called to identify the type.
I'm assuming the inability to run PHP scripts from a DAV server is the
same problem. Bug or feature?
Is there any way to at least access the extension_mappings hash?
Also, I would highly recommend that testing of Apache include testing
the GET handler in mod_dav_fs. There are other backends (Catacomb, ?)
that use mod_dav as their front-end and who handle GETs themselves and
we're running into these issues.
Re: automatic type identification (by extension) in virtual DAV
repositories
Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Friday, May 30, 2003 2:34 AM +0200 André Malo <nd...@perlig.de> wrote:
> * Chris Knight wrote:
>
>> So, Catacomb and other mod_dav backends that handle GET requests would
>> like to have the Content-Type of resources be automatically identified
>> based on the path information of that resource in the same manner as
>> file resources.
...
>
> ModMimeUsePathInfo was created for this purpose IIRC.
Indeed. The only caveat is that ModMimeUsePathInfo shouldn't necessarily be
enabled on resources that you edit. The problem comes into play when filters
are applied and the transmitted representation differs from the on-disk
(in-storage) resource. (This is actually a major collision with REST that
WebDAV never quite solved cleanly - it just assumes that the resource ==
representation.)
The best example is with mod_include'd files. With ModMimeUsePathInfo On,
there is no way to disable the output filters when communicating with a DAV
client. Therefore, the solution is to mount the repository twice - once as
read-only with ModMimeUsePathInfo On and another one with it Off so that the
DAV clients can modify without having the filters executed and get the 'raw'
representation.
Hope that makes sense. If not, I can try to clarify. -- justin
Re: automatic type identification (by extension) in virtual DAV repositories
Posted by André Malo <nd...@perlig.de>.
* Chris Knight wrote:
> So, Catacomb and other mod_dav backends that handle GET requests would
> like to have the Content-Type of resources be automatically identified
> based on the path information of that resource in the same manner as
> file resources.
>
> How would I best approach this problem? It appears that the mod_dav.c
> hooks are defined as MIDDLE, as are the mod_mime.c hooks. Should I force
> a specific ordering of execution? The only other problem is that find_ct
> in mod_mime.c wants to see the finfo structure, which doesn't apply to
> virtual resources.
ModMimeUsePathInfo was created for this purpose IIRC.
HTH, nd
--
If only engineers with a degree did the programming, we would
probably have less bad software.
However, we would also have less good software.
-- Felix von Leitner in dasr