Posted to users@trafficserver.apache.org by Jack Bates <du...@nottheoilrig.com> on 2012/03/11 05:43:13 UTC

Download mirrors and cache hits

Websites often mirror downloads on several different servers, e.g. 
http://getfirefox.com/

When I download Firefox, I sometimes get:

   * http://mozilla.mirror.ac.za/firefox/releases/10.0.2/linux-i686/en-US/firefox-10.0.2.tar.bz2
   * http://mozilla.mirror.aarnet.edu.au/pub/mozilla/firefox/releases/10.0.2/linux-i686/en-US/firefox-10.0.2.tar.bz2
   * etc.

What can sites like this do to help intermediate proxies like Apache 
Traffic Server make cache hits from requests for the same content from 
different mirrors?

Are there any best practices?

Re: Download mirrors and cache hits

Posted by Igor Galić <i....@brainsware.org>.
[snip]
> > I am new to Traffic Server code, but there are other plugins to
> > study and the scope of a Metalink plugin is modest
> 
> Leif has volunteered to be a GSoC mentor and he's a very good person
> to help. If you file a JIRA ticket at
> <https://issues.apache.org/jira/browse/TS> describing your proposal
> and label it with the "gsoc2012" tag, that will cause the Apache
> organization to process it.
> 
> If anyone else out there has a GSoC project in mind, I'd encourage
> you to do the same!

I wonder if I should label my Lua JIRA tickets as GSoC...

https://issues.apache.org/jira/browse/TS-1136
https://issues.apache.org/jira/browse/TS-1139
https://issues.apache.org/jira/browse/TS-1137

> J


i

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE

Re: Download mirrors and cache hits

Posted by James Peach <ja...@me.com>.
On Mar 29, 2012, at 3:30 AM, Jack Bates wrote:

> On Sat, Mar 24, 2012 at 06:07:45PM -0700, James Peach wrote:
>> On Mar 24, 2012, at 12:06 AM, Jack Bates <6n...@nottheoilrig.com> wrote:

[snip]

>> 
>>> Could this behavior be achieved with any of the existing "remap" plugins?
>> 
>> I don't think so, but I don't have much experience with them. It might be worth poking around the trafficserver-plugins repository to see whether there's anything similar ...
> 
> Since the Metalink project is accepted to the Google Summer of Code this year, maybe I can work on a Metalink Traffic Server plugin, and participate in GSoC? No one in the Metalink project is familiar with Traffic Server: Are any Traffic Server project members interested in mentoring work on a Metalink plugin?
> 
> I am from Vancouver in Canada but am volunteering here in Rwanda for a year. My open source experience includes maintaining a couple of packages in Debian. Once upon a time I contributed to the Gallery web photo album project: I successfully contributed a WebDAV module. I also contributed some small patches to Asterisk, the Python OpenID library, and the Android Open Source Project.
> 
> I am new to Traffic Server code, but there are other plugins to study and the scope of a Metalink plugin is modest

Leif has volunteered to be a GSoC mentor and he's a very good person to help. If you file a JIRA ticket at <https://issues.apache.org/jira/browse/TS> describing your proposal and label it with the "gsoc2012" tag, that will cause the Apache organization to process it.

If anyone else out there has a GSoC project in mind, I'd encourage you to do the same!

J


Re: Download mirrors and cache hits

Posted by Jack Bates <6n...@nottheoilrig.com>.
On Sat, Mar 24, 2012 at 06:07:45PM -0700, James Peach wrote:
> On Mar 24, 2012, at 12:06 AM, Jack Bates <6n...@nottheoilrig.com> wrote:
> > On 22/03/12 10:08 PM, James Peach wrote:
> >> On 22/03/2012, at 4:19 AM, Jack Bates wrote:
> >>> On 10/03/12 08:43 PM, Jack Bates wrote:
> >>>> What can sites like this do to help intermediate proxies like Apache
> >>>> Traffic Server make cache hits from requests for the same content from
> >>>> different mirrors?
> >>>> 
> >>>> Are there any best practices?
> >>> 
> >>> Has there ever been any discussion or work on supporting RFC 6249 [1] or "Metalink" [2] in Apache Traffic Server?
> >> 
> >> No I don't think so.
> >> 
> >>> I found this [3] mention of Metalink and intermediate caching proxies ("Metalink integration with Proxy/Cache" heading)
> >> 
> >> That sounds like a pretty useful project. There's probably a number of different ways a plugin could implement this. Folks on the dev list or #traffic-server would be happy to help.
> >> 
> >>> [1] http://tools.ietf.org/html/rfc6249
> >>> [2] http://metalinker.org/
> >>> [3] http://sourceforge.net/apps/trac/metalinks/wiki/GsocIdeas
> > 
> > Thank you very much James, I am replying to the dev list
> > 
> > Perhaps a minimum viable product could:
> > 
> >  1. If the response status code is 3XX
> >  2. Scan RFC 6249 "Link: <...>; rel=duplicate" headers for URLs that already exist in cache
> >  3. If found, update "Location: ..." header with this URL and pass on response
> 
> That sounds like a promising approach.
> 
> > It could also check for an RFC 3230 "Digest: ..." header and, if found, check for content with same digest already in cache?
> 
> I think there was some conversation on IRC indicating that doing cache lookups by hash would require a fair amount of change to the cache.
> 
> > Could this behavior be achieved with any of the existing "remap" plugins?
> 
> I don't think so, but I don't have much experience with them. It might be worth poking around the trafficserver-plugins repository to see whether there's anything similar ...

Since the Metalink project is accepted to the Google Summer of Code this year, maybe I can work on a Metalink Traffic Server plugin, and participate in GSoC? No one in the Metalink project is familiar with Traffic Server: Are any Traffic Server project members interested in mentoring work on a Metalink plugin?

I am from Vancouver in Canada but am volunteering here in Rwanda for a year. My open source experience includes maintaining a couple of packages in Debian. Once upon a time I contributed to the Gallery web photo album project: I successfully contributed a WebDAV module. I also contributed some small patches to Asterisk, the Python OpenID library, and the Android Open Source Project.

I am new to Traffic Server code, but there are other plugins to study and the scope of a Metalink plugin is modest

Re: Download mirrors and cache hits

Posted by James Peach <ja...@me.com>.
On Mar 24, 2012, at 12:06 AM, Jack Bates <6n...@nottheoilrig.com> wrote:

> On 22/03/12 10:08 PM, James Peach wrote:
>> On 22/03/2012, at 4:19 AM, Jack Bates wrote:
>>> On 10/03/12 08:43 PM, Jack Bates wrote:
>>>> What can sites like this do to help intermediate proxies like Apache
>>>> Traffic Server make cache hits from requests for the same content from
>>>> different mirrors?
>>>> 
>>>> Are there any best practices?
>>> 
>>> Has there ever been any discussion or work on supporting RFC 6249 [1] or "Metalink" [2] in Apache Traffic Server?
>> 
>> No I don't think so.
>> 
>>> I found this [3] mention of Metalink and intermediate caching proxies ("Metalink integration with Proxy/Cache" heading)
>> 
>> That sounds like a pretty useful project. There's probably a number of different ways a plugin could implement this. Folks on the dev list or #traffic-server would be happy to help.
>> 
>>> [1] http://tools.ietf.org/html/rfc6249
>>> [2] http://metalinker.org/
>>> [3] http://sourceforge.net/apps/trac/metalinks/wiki/GsocIdeas
> 
> Thank you very much James, I am replying to the dev list
> 
> Perhaps a minimum viable product could:
> 
>  1. If the response status code is 3XX
>  2. Scan RFC 6249 "Link: <...>; rel=duplicate" headers for URLs that already exist in cache
>  3. If found, update "Location: ..." header with this URL and pass on response

That sounds like a promising approach.


> 
> It could also check for an RFC 3230 "Digest: ..." header and, if found, check for content with same digest already in cache?

I think there was some conversation on IRC indicating that doing cache lookups by hash would require a fair amount of change to the cache.

> Could this behavior be achieved with any of the existing "remap" plugins?

I don't think so, but I don't have much experience with them. It might be worth poking around the trafficserver-plugins repository to see whether there's anything similar ...

J

Re: Download mirrors and cache hits

Posted by Jack Bates <6n...@nottheoilrig.com>.
On 22/03/12 10:08 PM, James Peach wrote:
> On 22/03/2012, at 4:19 AM, Jack Bates wrote:
>> On 10/03/12 08:43 PM, Jack Bates wrote:
>>> What can sites like this do to help intermediate proxies like Apache
>>> Traffic Server make cache hits from requests for the same content from
>>> different mirrors?
>>>
>>> Are there any best practices?
>>
>> Has there ever been any discussion or work on supporting RFC 6249 [1] or "Metalink" [2] in Apache Traffic Server?
>
> No I don't think so.
>
>> I found this [3] mention of Metalink and intermediate caching proxies ("Metalink integration with Proxy/Cache" heading)
>
> That sounds like a pretty useful project. There's probably a number of different ways a plugin could implement this. Folks on the dev list or #traffic-server would be happy to help.
>
>> [1] http://tools.ietf.org/html/rfc6249
>> [2] http://metalinker.org/
>> [3] http://sourceforge.net/apps/trac/metalinks/wiki/GsocIdeas

Thank you very much James, I am replying to the dev list

Perhaps a minimum viable product could:

   1. If the response status code is 3XX
   2. Scan RFC 6249 "Link: <...>; rel=duplicate" headers for URLs that 
already exist in cache
   3. If found, update "Location: ..." header with this URL and pass on 
response
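
The three steps above can be sketched roughly as follows. This is only an
illustration in Python, not Traffic Server plugin code: the function names,
the (name, value) header list and the is_cached callback are all assumptions
made up for the example, and the Link parsing is deliberately simplified (it
does not handle "<" inside quoted parameter values).

```python
import re

# One link-value: "<URL>" followed by its parameters, up to the next "<".
LINK_VALUE = re.compile(r'<([^>]+)>([^<]*)')
# The "rel" parameter, quoted or unquoted.
REL_PARAM = re.compile(r';\s*rel\s*=\s*"?([^";,]+)"?', re.IGNORECASE)


def duplicate_urls(link_header_values):
    """Yield every URL advertised with rel=duplicate (RFC 6249)."""
    for value in link_header_values:
        for url, params in LINK_VALUE.findall(value):
            rel = REL_PARAM.search(params)
            if rel and "duplicate" in rel.group(1).lower().split():
                yield url


def rewrite_redirect(status, headers, is_cached):
    """Return headers, with Location pointed at a cached duplicate if any.

    headers is a list of (name, value) tuples; is_cached is a hypothetical
    callback standing in for a real cache lookup.
    """
    # Step 1: only act on 3XX responses.
    if not 300 <= status < 400:
        return headers
    # Step 2: scan "Link: <...>; rel=duplicate" headers for a cached URL.
    links = [v for n, v in headers if n.lower() == "link"]
    for url in duplicate_urls(links):
        if is_cached(url):
            # Step 3: update Location and pass the response on.
            return [(n, url) if n.lower() == "location" else (n, v)
                    for n, v in headers]
    return headers
```

If no duplicate is cached, the response is passed on unchanged, so the
client simply follows the redirect the origin chose.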

It could also check for an RFC 3230 "Digest: ..." header and, if found, 
check for content with same digest already in cache?
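
To make the Digest idea concrete, here is a small sketch of the lookup,
again in Python and again hypothetical: digest_index is an invented
in-memory mapping from (algorithm, encoded digest) to a cached URL. Nothing
here says that Traffic Server's cache offers such an index.

```python
import base64
import hashlib


def parse_digest(value):
    """Split a Digest header value into {algorithm_lowercase: encoded}."""
    digests = {}
    for item in value.split(","):
        # Split on the first "=" only; base64 padding also uses "=".
        algo, sep, encoded = item.strip().partition("=")
        if sep:
            digests[algo.strip().lower()] = encoded.strip()
    return digests


def sha256_digest(body):
    """Base64-encoded SHA-256 of body, as it would appear after 'SHA-256='."""
    return base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")


def find_by_digest(digest_header, digest_index):
    """Return a cached URL whose content matches the Digest header, or None.

    digest_index is a hypothetical dict keyed by (algorithm, encoded digest).
    """
    for algo, encoded in parse_digest(digest_header).items():
        url = digest_index.get((algo, encoded))
        if url is not None:
            return url
    return None
```

The index would have to be filled in as responses are cached, for example
by hashing each body with sha256_digest.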

Could this behavior be achieved with any of the existing "remap" plugins?

Re: Download mirrors and cache hits

Posted by James Peach <jp...@apache.org>.
On 22/03/2012, at 4:19 AM, Jack Bates wrote:

> On 10/03/12 08:43 PM, Jack Bates wrote:
>> What can sites like this do to help intermediate proxies like Apache
>> Traffic Server make cache hits from requests for the same content from
>> different mirrors?
>> 
>> Are there any best practices?
> 
> Has there ever been any discussion or work on supporting RFC 6249 [1] or "Metalink" [2] in Apache Traffic Server?

No I don't think so.

> 
> I found this [3] mention of Metalink and intermediate caching proxies ("Metalink integration with Proxy/Cache" heading)

That sounds like a pretty useful project. There's probably a number of different ways a plugin could implement this. Folks on the dev list or #traffic-server would be happy to help.

> 
> [1] http://tools.ietf.org/html/rfc6249
> [2] http://metalinker.org/
> [3] http://sourceforge.net/apps/trac/metalinks/wiki/GsocIdeas


Re: Download mirrors and cache hits

Posted by Jack Bates <du...@nottheoilrig.com>.
On 10/03/12 08:43 PM, Jack Bates wrote:
> What can sites like this do to help intermediate proxies like Apache
> Traffic Server make cache hits from requests for the same content from
> different mirrors?
>
> Are there any best practices?

Has there ever been any discussion or work on supporting RFC 6249 [1] or 
"Metalink" [2] in Apache Traffic Server?

I found this [3] mention of Metalink and intermediate caching proxies 
("Metalink integration with Proxy/Cache" heading)

[1] http://tools.ietf.org/html/rfc6249
[2] http://metalinker.org/
[3] http://sourceforge.net/apps/trac/metalinks/wiki/GsocIdeas