Posted to users@subversion.apache.org by Sergey Proskurnya <al...@gmail.com> on 2005/09/15 07:42:51 UTC

Last-Modified HTTP header from mod_dav_svn

Hi there,

I want to discuss one possible feature for mod_dav_svn.
The idea is very basic: let mod_dav_svn set the
"Last-Modified" HTTP header to the last modification time
of the requested resource in the repository (only for files; I'm not sure
what to do with directories).
This would allow Internet caches (such as Squid) to be used
very effectively, which would result in lower traffic and
better performance for remote users who access the SVN
repository with an ordinary HTTP browser.

What does the community think about this?

Thanks,
Serge.

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@subversion.tigris.org
For additional commands, e-mail: users-help@subversion.tigris.org

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Michael Sinz <Mi...@sinz.org>.
Kalle Olavi Niemitalo wrote:
> Erik Huelsmann <eh...@gmail.com> writes:
> 
> 
>>On 9/15/05, Sergey Proskurnya <al...@gmail.com> wrote:
>>
>>>This will allow to use Internet Caching (Squid)
>>>very effectively, which will result in lower traffic and
>>>performance improvement for remote users, who are using
>>>casual HTTP browser to access to SVN repository.
> 
>>It's a great idea, but won't work. Currently Subversion requests
>>REPORTs from the server. These are unique for every single session
>>between a client and a server. Thus, the REPORT response we're using
>>isn't cacheable.
> 
> However, if a "casual HTTP browser" contacts a Subversion repository,
> it will be using GET, which could be cacheable.
> 
> On the other hand, mod_dav_svn is already generating ETag headers,
> and an HTTP/1.1 cache can put the entity tag in an If-None-Match
> request header, and presumably get back a 304 Not Modified status,
> in the same way it would with Last-Modified and If-Modified-Since.

Note that according to the RFC, a strict interpretation would require
both the ETag and the Last-Modified header in order to allow the client
side to use a conditional GET.  I found this out the hard way (I did not
read the RFC closely enough) and had to update the Insurrection RSS and Atom
feeds to fill in both the ETag and the Last-Modified header.  (I was only
sending ETag since that is what I thought was the minimum set needed for
a conditional GET...)
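
To make the conditional GET logic concrete, here is a minimal sketch in C.
It is not code from mod_dav_svn or Insurrection: the struct and function
names are hypothetical, the request is assumed to carry at most one entity
tag, and the date is assumed to be parsed already.  It follows the
conservative reading of RFC 2616 section 13.3.4, where a 304 is sent only
if every validator the client supplied agrees that nothing changed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <time.h>

/* Hypothetical, simplified view of a conditional request.
   NULL / 0 mean the corresponding request header was not sent. */
struct cond_request {
    const char *if_none_match;      /* one entity tag, quotes included */
    time_t      if_modified_since;  /* seconds since the epoch */
};

/* Return 1 if "304 Not Modified" may be sent, 0 if the full entity is due.
   Every validator the client sent must agree that nothing changed. */
int not_modified(const struct cond_request *req,
                 const char *current_etag, time_t last_modified)
{
    int have_condition = 0;

    assert(req != NULL && current_etag != NULL);

    if (req->if_none_match != NULL) {
        have_condition = 1;
        if (strcmp(req->if_none_match, current_etag) != 0)
            return 0;               /* entity tag differs: resource changed */
    }
    if (req->if_modified_since != 0) {
        have_condition = 1;
        if (last_modified > req->if_modified_since)
            return 0;               /* modified after the client's copy */
    }
    return have_condition;          /* unconditional requests get the entity */
}
```

A real server also has to handle lists of entity tags, "*" and weak
validators, which this sketch leaves out.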

> I suppose there could be two advantages with reporting Last-Modified
> in addition to ETag:
> 
> - Perhaps some older caches support Last-Modified but not ETag.
> 
> - A cache can guess an expiry date based on how long the resource
>   has already been unmodified.
> 
> - RFC 2616 section 13.3.4 says servers SHOULD do so.

-- 
Michael Sinz                     Technology and Engineering Director/Consultant
"Starting Startups"                                mailto:michael.sinz@sinz.org
My place on the web                            http://www.sinz.org/Michael.Sinz

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Daniel Rall <dl...@collab.net>.
On Tue, 29 Nov 2005, Greg Stein wrote:

> On Mon, Nov 28, 2005 at 09:32:19PM -0800, Daniel Rall wrote:
...
> > http://svn.collab.net/viewcvs/svn/trunk/subversion/mod_dav_svn/repos.c?r1=17547&r2=17549
...
> > > > For (at least) version resources, we should also be setting the
> > > > Cache-Control header. The max-age should be set to some ridiculously
> > > > high number since a version resource can't change.
> > > 
> > > Greg, are you referring to this specific type (from mod_dav.h)?:
> > > 
> > >     DAV_RESOURCE_TYPE_VERSION,          /* version or baseline URL */
> 
> Yes. Essentially that type refers to a specific <rev, path> pair.
> 
> > Or did you mean versioned resources like files and directories?
> 
> "version resource" is a DAV term with a specific meaning. We're
> probably talking about the same thing :-)

Yes, we are, thanks for the clarification (and thanks to sussman as
well!).  I didn't understand that DAV_RESOURCE_TYPE_VERSION was an
indicator for a <rev, path> pair.

...
> Note that the LACKS_ETAG macro also provides an etag for REGULAR
> resources, which you can't do. Those change over time, so they
> shouldn't use Cache-Control (let the proxy use the etag and L-M header
> to see if the resource has changed).
> 
> I'm not sure in which cases dav_svn_set_headers() is called, but
> hopefully just for GET/HEAD requests. Should be double-checked.

I've confirmed that it's not called for PROPFIND requests, and is
called for HEAD and GET requests.  I haven't checked other HTTP
methods, but given that it is used in the definition of mod_dav_svn's
dav_hooks_repository, I'm fairly certain dav_svn_set_headers()
functions as desired.  Here's the API it conforms to (as defined by
mod_dav.h):

    /*
    ** If a GET is processed using a stream (open_stream, read_stream)
    ** rather than via a sub-request (on get_pathname), then this function
    ** is used to provide the repository with a way to set the headers
    ** in the response.
    **
    ** This function may be called without a following deliver(), to
    ** handle a HEAD request.
    **
    ** This may be NULL if handle_get is FALSE.
    */
    dav_error * (*set_headers)(request_rec *r,
                               const dav_resource *resource);

> Oh, and note that setting MAX_SECONDS to even just a day would be a
> big win. Make it a year if you want, but if you grow hinky with that
> duration, then something less will still be a Good Thing. (I'd go with
> a week, I think)

I can't seem to get the conditional right.  'svn cat -r2 http://...' 
is apparently neither a DAV_RESOURCE_TYPE_VERSION, nor a
resource->baselined.

--- subversion/mod_dav_svn/repos.c      (revision 17559)
+++ subversion/mod_dav_svn/repos.c      (working copy)
@@ -2176,6 +2176,11 @@
   apr_table_setn(r->headers_out, "ETag",
                  dav_svn_getetag(resource, resource->pool));

+  /* As version resources don't change, encourage caching. */
+  if (resource->type == DAV_RESOURCE_TYPE_VERSION)
+    /* Cache resource for one week (specified in seconds). */
+    apr_table_setn(r->headers_out, "Cache-Control", "max-age=604800");
+
   /* we accept byte-ranges */
   apr_table_setn(r->headers_out, "Accept-Ranges", "bytes");


Any suggestions as to how the conditional ought to be implemented?
-- 

Daniel Rall

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Greg Stein <gs...@lyra.org>.
On Mon, Nov 28, 2005 at 09:32:19PM -0800, Daniel Rall wrote:
>...
> >   /* make sure the proper mtime is in the request record */
> > #if 0
> >   ap_update_mtime(r, resource->info->finfo.mtime);
> > #endif
> > 
> >   /* ### note that these use r->filename rather than <resource> */
> > #if 0
> >   ap_set_last_modified(r);
> > #endif
> 
> #define's aside, this was a straight copy out of
> httpd/modules/dav/fs/repos.c, which I believe was used as the template
> for mod_dav_svn.

Right. I #if'd them out because they wouldn't work. As you noticed
:-), more coding was necessary to get that stuff to work.

> > I'm working on activating these, pulling the value of the "svn:date"
> > revprop using svn_fs_node_created_rev() and svn_fs_revision_prop().
> 
> Done in r17549.
> http://svn.collab.net/viewcvs/svn/trunk/subversion/mod_dav_svn/repos.c?r1=17547&r2=17549

Ooooh!  I'll take a look.

> > > For (at least) version resources, we should also be setting the
> > > Cache-Control header. The max-age should be set to some ridiculously
> > > high number since a version resource can't change.
> > 
> > Greg, are you referring to this specific type (from mod_dav.h)?:
> > 
> >     DAV_RESOURCE_TYPE_VERSION,          /* version or baseline URL */

Yes. Essentially that type refers to a specific <rev, path> pair.

> Or did you mean versioned resources like files and directories?

"version resource" is a DAV term with a specific meaning. We're
probably talking about the same thing :-)

> Assuming so, the implementation would resemble something like the
> get_last_modified(resource) function introduced in r17549, and
> repos.c:dav_svn_set_headers() would include a header of
> "Cache-Control: max-age=MAX_SECONDS" in the response (where
> MAX_SECONDS is some big apr_time_t I haven't figured out yet).

Yup. Note that the LACKS_ETAG macro also provides an etag for REGULAR
resources, which you can't do. Those change over time, so they
shouldn't use Cache-Control (let the proxy use the etag and L-M header
to see if the resource has changed).

I'm not sure in which cases dav_svn_set_headers() is called, but
hopefully just for GET/HEAD requests. Should be double-checked.

Oh, and note that setting MAX_SECONDS to even just a day would be a
big win. Make it a year if you want, but if you grow hinky with that
duration, then something less will still be a Good Thing. (I'd go with
a week, I think)
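
For reference, the durations mentioned above work out as follows.  This is
only a sketch of the arithmetic; max_age_seconds() and format_max_age() are
hypothetical helper names, not part of mod_dav_svn or httpd.

```c
#include <assert.h>
#include <stdio.h>

#define SECONDS_PER_DAY 86400L      /* 24 * 60 * 60 */

/* Number of seconds in the given number of days. */
long max_age_seconds(long days)
{
    assert(days >= 0);
    return days * SECONDS_PER_DAY;
}

/* Write "max-age=<seconds>" for the given number of days into buf.
   Returns the number of characters written, or -1 if buf is too small. */
int format_max_age(long days, char *buf, size_t size)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, size, "max-age=%ld", max_age_seconds(days));
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

A day is 86400 seconds, a week 604800 and a (non-leap) year 31536000, so
format_max_age(7, ...) produces "max-age=604800".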

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Daniel Rall <dl...@collab.net>.
On Mon, 28 Nov 2005, Daniel Rall wrote:

> On Sat, 26 Nov 2005, Greg Stein wrote:
...
>   /* make sure the proper mtime is in the request record */
> #if 0
>   ap_update_mtime(r, resource->info->finfo.mtime);
> #endif
> 
>   /* ### note that these use r->filename rather than <resource> */
> #if 0
>   ap_set_last_modified(r);
> #endif

The #if 0 guards aside, this was a straight copy out of
httpd/modules/dav/fs/repos.c, which I believe was used as the template
for mod_dav_svn.

> I'm working on activating these, pulling the value of the "svn:date"
> revprop using svn_fs_node_created_rev() and svn_fs_revision_prop().

Done in r17549.
http://svn.collab.net/viewcvs/svn/trunk/subversion/mod_dav_svn/repos.c?r1=17547&r2=17549

> > For (at least) version resources, we should also be setting the
> > Cache-Control header. The max-age should be set to some ridiculously
> > high number since a version resource can't change.
> 
> Greg, are you referring to this specific type (from mod_dav.h)?:
> 
>     DAV_RESOURCE_TYPE_VERSION,          /* version or baseline URL */

Or did you mean versioned resources like files and directories?
Assuming so, the implementation would resemble something like the
get_last_modified(resource) function introduced in r17549, and
repos.c:dav_svn_set_headers() would include a header of
"Cache-Control: max-age=MAX_SECONDS" in the response (where
MAX_SECONDS is some big apr_time_t I haven't figured out yet).
-- 

Daniel Rall

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Daniel Rall <dl...@collab.net>.
On Sat, 26 Nov 2005, Greg Stein wrote:

> On Fri, Nov 25, 2005 at 10:57:31PM +0200, Kalle Olavi Niemitalo wrote:
> >...
> > However, if a "casual HTTP browser" contacts a Subversion repository,
> > it will be using GET, which could be cacheable.
> 
> Yup.
> 
> >...
> > I suppose there could be two advantages with reporting Last-Modified
> > in addition to ETag:
> > 
> > - Perhaps some older caches support Last-Modified but not ETag.
> > 
> > - A cache can guess an expiry date based on how long the resource
> >   has already been unmodified.
> > 
> > - RFC 2616 section 13.3.4 says servers SHOULD do so.
> 
> This last part is good enough. We have the date, so we may as well put
> it into the response (depending upon the type of resource, of course).

Looks like there are already some reminders in repos.c:

  /* make sure the proper mtime is in the request record */
#if 0
  ap_update_mtime(r, resource->info->finfo.mtime);
#endif

  /* ### note that these use r->filename rather than <resource> */
#if 0
  ap_set_last_modified(r);
#endif

I'm working on activating these, pulling the value of the "svn:date"
revprop using svn_fs_node_created_rev() and svn_fs_revision_prop().
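
To show what the header value has to look like, here is a standalone
sketch of the conversion from an "svn:date" style timestamp to an
HTTP-date.  It assumes the revprop is a UTC timestamp of the form
"YYYY-MM-DDThh:mm:ss.uuuuuuZ"; to_http_date() is a hypothetical name, and
the module itself would presumably hand an apr_time_t to
ap_update_mtime() and let ap_set_last_modified() do the formatting, as
the #if 0 blocks above suggest.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Day of week for a Gregorian date, 0 = Sunday (Sakamoto's method). */
static int day_of_week(int y, int m, int d)
{
    static const int t[] = {0, 3, 2, 5, 0, 3, 5, 1, 4, 6, 2, 4};

    if (m < 3)
        y -= 1;
    return (y + y / 4 - y / 100 + y / 400 + t[m - 1] + d) % 7;
}

/* Convert "YYYY-MM-DDThh:mm:ss[.uuuuuu]Z" (UTC) into an HTTP-date such as
   "Thu, 15 Sep 2005 07:42:51 GMT".  Fractional seconds are dropped.
   Returns 0 on success, -1 on malformed input or a too-small buffer. */
int to_http_date(const char *svn_date, char *buf, size_t size)
{
    static const char *const days[] =
        {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};
    static const char *const months[] =
        {"Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
    int y, mo, d, h, mi, s, n;

    assert(svn_date != NULL && buf != NULL);

    /* "YYYY-MM-DDThh:mm:ss" alone is 19 characters. */
    if (strlen(svn_date) < 19
        || sscanf(svn_date, "%4d-%2d-%2dT%2d:%2d:%2d",
                  &y, &mo, &d, &h, &mi, &s) != 6)
        return -1;
    if (y < 1 || mo < 1 || mo > 12 || d < 1 || d > 31
        || h < 0 || h > 23 || mi < 0 || mi > 59 || s < 0 || s > 60)
        return -1;

    n = snprintf(buf, size, "%s, %02d %s %04d %02d:%02d:%02d GMT",
                 days[day_of_week(y, mo, d)], d, months[mo - 1], y, h, mi, s);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

For example, "2005-09-15T07:42:51.000000Z" becomes
"Thu, 15 Sep 2005 07:42:51 GMT".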

> For (at least) version resources, we should also be setting the
> Cache-Control header. The max-age should be set to some ridiculously
> high number since a version resource can't change.

Greg, are you referring to this specific type (from mod_dav.h)?:

    DAV_RESOURCE_TYPE_VERSION,          /* version or baseline URL */

-- 

Daniel Rall

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Greg Stein <gs...@lyra.org>.
On Fri, Nov 25, 2005 at 10:57:31PM +0200, Kalle Olavi Niemitalo wrote:
>...
> However, if a "casual HTTP browser" contacts a Subversion repository,
> it will be using GET, which could be cacheable.

Yup.

>...
> I suppose there could be two advantages with reporting Last-Modified
> in addition to ETag:
> 
> - Perhaps some older caches support Last-Modified but not ETag.
> 
> - A cache can guess an expiry date based on how long the resource
>   has already been unmodified.
> 
> - RFC 2616 section 13.3.4 says servers SHOULD do so.

This last part is good enough. We have the date, so we may as well put
it into the response (depending upon the type of resource, of course).

For (at least) version resources, we should also be setting the
Cache-Control header. The max-age should be set to some ridiculously
high number since a version resource can't change.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Kalle Olavi Niemitalo <ko...@iki.fi>.
Erik Huelsmann <eh...@gmail.com> writes:

> On 9/15/05, Sergey Proskurnya <al...@gmail.com> wrote:
>> This will allow to use Internet Caching (Squid)
>> very effectively, which will result in lower traffic and
>> performance improvement for remote users, who are using
>> casual HTTP browser to access to SVN repository.

> It's a great idea, but won't work. Currently Subversion requests
> REPORTs from the server. These are unique for every single session
> between a client and a server. Thus, the REPORT response we're using
> isn't cacheable.

However, if a "casual HTTP browser" contacts a Subversion repository,
it will be using GET, which could be cacheable.

On the other hand, mod_dav_svn is already generating ETag headers,
and an HTTP/1.1 cache can put the entity tag in an If-None-Match
request header, and presumably get back a 304 Not Modified status,
in the same way it would with Last-Modified and If-Modified-Since.

I suppose there could be three advantages to reporting Last-Modified
in addition to ETag:

- Perhaps some older caches support Last-Modified but not ETag.

- A cache can guess an expiry date based on how long the resource
  has already been unmodified.

- RFC 2616 section 13.3.4 says servers SHOULD do so.
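
The second point can be sketched in a few lines of C.  RFC 2616 section
13.2.4 mentions 10% of the time since Last-Modified as a typical
heuristic; the function name and the cap parameter below are assumptions
of this sketch, and actual caches choose their own fraction and limits.

```c
#include <assert.h>
#include <time.h>

/* Heuristic freshness lifetime in seconds: one tenth of the time the
   resource had already been unmodified when the response was generated,
   limited to `cap` seconds.  Returns 0 if there is no usable history. */
long heuristic_lifetime(time_t response_date, time_t last_modified, long cap)
{
    double unchanged_for;
    long lifetime;

    assert(cap >= 0);

    unchanged_for = difftime(response_date, last_modified);
    if (unchanged_for <= 0.0)
        return 0;                   /* Last-Modified is not in the past */

    lifetime = (long)(unchanged_for / 10.0);
    return lifetime > cap ? cap : lifetime;
}
```

A file that has been unchanged for 30 days would be treated as fresh for
3 days under this rule, unless the cap is lower.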

Re: Last-Modified HTTP header from mod_dav_svn

Posted by Sergey Proskurnya <al...@gmail.com>.
Erik Huelsmann wrote:
> On 9/15/05, Sergey Proskurnya <al...@gmail.com> wrote:
> 
> 
>>I want to discuss one possible feature for mod_dav_svn.
>>The idea is very basic: let mod_dav_svn set the
>>"Last-Modified" HTTP header to last modification time
>>of requested resource in repository (only files, I'm not sure
>>what to do with directories).
>>This will allow to use Internet Caching (Squid)
>>very effectively, which will result in lower traffic and
>>performance improvement for remote users, who are using
>>casual HTTP browser to access to SVN repository.
>>
>>What respected community think about this?
> 
> 
> It's a great idea, but won't work. Currently Subversion requests
> REPORTs from the server. These are unique for every single session
> between a client and a server. Thus, the REPORT response we're using
> isn't cacheable.

Thanks for the response!
I was surprised :-)

I had suspected that the protocol between the SVN client and mod_dav_svn
is most probably not cacheable.

But an SVN repository can also be accessed by casual HTTP browsing.
As I understand it, in this case the browser issues requests with the GET method.
And while the repository is being browsed this way via Firefox or IE,
mod_dav_svn can generate the Last-Modified header, since it really
knows that information.

> 
> However, right as we speak, a thread is going on in the dev@ mailing
> list (I forwarded this mail there) about the usefulness of being able
> to cache HTTP trafic between a client and the repository server.
> Maybe you can step in and add something to that thread?

I'm not subscribed to dev@ at the moment.
Generally, I don't feel skilled enough to talk on dev@
with the gurus ;-)

Bye,
Serge.


Re: Last-Modified HTTP header from mod_dav_svn

Posted by Erik Huelsmann <eh...@gmail.com>.
On 9/15/05, Sergey Proskurnya <al...@gmail.com> wrote:

> I want to discuss one possible feature for mod_dav_svn.
> The idea is very basic: let mod_dav_svn set the
> "Last-Modified" HTTP header to last modification time
> of requested resource in repository (only files, I'm not sure
> what to do with directories).
> This will allow to use Internet Caching (Squid)
> very effectively, which will result in lower traffic and
> performance improvement for remote users, who are using
> casual HTTP browser to access to SVN repository.
>
> What respected community think about this?

It's a great idea, but won't work. Currently Subversion requests
REPORTs from the server. These are unique for every single session
between a client and a server. Thus, the REPORT response we're using
isn't cacheable.

However, right as we speak, a thread is going on in the dev@ mailing
list (I forwarded this mail there) about the usefulness of being able
to cache HTTP traffic between a client and the repository server.

Maybe you can step in and add something to that thread?

bye,

Erik.
