Posted to users@trafficserver.apache.org by J David <j....@gmail.com> on 2015/02/27 06:51:30 UTC

Migrating from squid

(Not sure if this is best for the user list or the dev list, as it's
development, but not of ATS per se.)

Squid offers a feature called url_rewrite_program that can be used to
change its behavior based on client IP, requested URL, and various
other criteria.  It runs a bunch of copies of an external program and
passes information to and from them over pipes.
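
For readers unfamiliar with the helper model described above, here is a
minimal sketch of such an external program, written in Python purely for
illustration. It assumes a simplified line protocol: one request per line
holding the URL and the client IP separated by whitespace, and exactly one
reply line per request. The real Squid helper protocol carries more fields
and its reply format differs between versions, so the line layout, the
"empty reply means no change" convention, and the rewrite() policy are all
hypothetical.

```python
import io


def rewrite(url: str, client_ip: str) -> str:
    """Hypothetical policy: send one client range to a different host."""
    if client_ip.startswith("10."):
        return url.replace("//www.example.com/", "//internal.example.com/", 1)
    return url


def serve(inp, out) -> None:
    """Read one request per line and write exactly one reply line for each."""
    for line in inp:
        fields = line.split()
        if len(fields) < 2:
            out.write("\n")  # assumed convention: empty reply = no change
        else:
            url, client_ip = fields[0], fields[1]
            out.write(rewrite(url, client_ip) + "\n")
        out.flush()  # the parent waits on the pipe, so never hold replies back


# Demo with in-memory streams; a real helper would call
# serve(sys.stdin, sys.stdout) and run until the parent closes the pipe.
demo_out = io.StringIO()
serve(io.StringIO("http://www.example.com/a 10.1.2.3\n"
                  "http://www.example.com/b 192.0.2.7\n"), demo_out)
print(demo_out.getvalue(), end="")
```

The flush after every reply matters: the parent process is blocked on the
pipe until the answer arrives, so a buffered reply stalls the request.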

Overall, ATS is a much better fit for our environment, but we are
heavily dependent on this feature of Squid.

The short version of what we need to do is this:

Based on a (ClientIP,RequestURL) pair, either choose a backend (by
name or IP) to handle the request *or* issue an arbitrary redirect
*or* return an HTTP error.
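
The three possible outcomes can be modelled as a single decision value
returned for each (ClientIP, RequestURL) pair. The sketch below is only an
illustration of that shape: the Decision type, the rule table, the networks
and the host names are all made up, and a static first-match table stands in
for the dynamic lookups our real program performs.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network
from typing import Optional


@dataclass(frozen=True)
class Decision:
    action: str                    # "backend", "redirect", or "error"
    target: Optional[str] = None   # backend host or redirect URL
    status: Optional[int] = None   # HTTP status for redirects and errors


# Hypothetical rules, checked in order: (client network, URL prefix, decision).
RULES = [
    (ip_network("203.0.113.0/24"), "http://example.com/",
     Decision("error", status=403)),
    (ip_network("0.0.0.0/0"), "http://example.com/old/",
     Decision("redirect", "http://example.com/new/", 301)),
    (ip_network("10.0.0.0/8"), "http://example.com/",
     Decision("backend", "backend-a.internal")),
]
DEFAULT = Decision("backend", "backend-default.internal")


def decide(client_ip: str, url: str) -> Decision:
    """Return the first rule matching both the client and the URL prefix."""
    addr = ip_address(client_ip)
    for network, prefix, decision in RULES:
        if addr in network and url.startswith(prefix):
            return decision
    return DEFAULT


print(decide("10.1.1.1", "http://example.com/index.html"))
```

Whatever mechanism ends up carrying the request to the external logic, this
is the whole contract it has to honour: two inputs in, one of three kinds of
answer out.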

According to the "squid config translation" docs, the equivalent for
url_rewrite_program is remap.config.  Since that does not actually
support external programs, it appears the intent is to use the
built-in remapping abilities to replicate the functionality of the
external program.  However, that is not always possible.

The messy part that prevents us from doing that or hacking up a module
is that the external program we use with squid is ridiculously
complicated, pulling together information from multiple sources, all
of them dynamic, and gets linked to all kinds of libraries like MySQL,
memcached, and distributed message passing stuff.

What is the best way to migrate this functionality to ATS?  The whole
big ball of wax seems like a really poor candidate for a built-in ATS
module due to all the external dependencies, many of which run
asynchronously and can block.  It seems really advantageous to seal
that off.  So what we probably really need is a pretty efficient way to
emulate squid's ability to call out to an external program.

If that's true, has anyone done something similar?  Which ATS module
hook(s) would be the best to use, and how would we handle the need to
yield while we wait for the external program to work?

Thanks for any advice!

Re: Migrating from squid

Posted by J David <j....@gmail.com>.
On Fri, Feb 27, 2015 at 10:50 AM, Leif Hedstrom <zw...@apache.org> wrote:
> Dealing with external squid helpers is a bit wonky, but you probably could implement something in a plugin that does it. The fact that you are doing so much weirdness (MySql, Memcached) makes it particularly tough, I’m not sure how Squid deals with that?

Squid prestarts a (configurable, large) number of external rewriters,
sufficient to cover the product of request rate and per-request delay.
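
As a worked example of that sizing rule (it is Little's law: requests in
flight = arrival rate * time each one spends in the helper), here is the
arithmetic in Python. The 5000 req/s figure is the rate mentioned elsewhere
in this thread; the latencies are assumed values chosen only for
illustration.

```python
def helpers_needed(requests_per_second: int, helper_latency_ms: int) -> int:
    """Minimum number of helpers so that no request waits for a free one."""
    # Integer ceiling division avoids floating-point rounding surprises.
    return -(-requests_per_second * helper_latency_ms // 1000)


print(helpers_needed(5000, 20))  # 100
print(helpers_needed(1200, 35))  # 42
```

This is a lower bound for steady traffic; bursts and slow lookups push the
real number higher, which is why the pool has to be large.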

On Fri, Feb 27, 2015 at 11:10 AM, Faysal Banna <de...@gmail.com> wrote:
> I have been doing this using nothing but lua injecting/retrieving  data from
> mysql, mongodb , sqlite ...

Can the lua approach hold resources open between requests, preferably
in some kind of managed resource pool, like the APR offers in Apache?
Forking a process from lua for every incoming request will lead to a
*tremendous* amount of context-switching overhead at high request
rates.  And that's assuming that we pare the process being exec'd down
to some sort of thin client, as the current rewriter takes several
seconds to spin up and assimilate initial data.  And that kind of
split is definitely reasonable & doable.

This *might* work if we can get all the info about the request at the
Lua level and dispatch all the various types of responses (backend
selection for existing URL, 301+new URL, 4XX/5XX error) from there.
But to get really good performance, the best approach would probably
be to maintain a persistent pool of open sockets to the external
logic.  Is that possible?
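
To make the question concrete, this is roughly what I mean by a persistent
pool, sketched in Python rather than in anything ATS-specific. SocketPool
and its connect callback are hypothetical names; the demo uses
socket.socketpair() as a stand-in for real connections to the external
logic, and a production pool would also need to detect and replace broken
sockets.

```python
import queue
import socket
from contextlib import contextmanager


class SocketPool:
    """Fixed-size pool of long-lived sockets, handed out one at a time."""

    def __init__(self, connect, size: int) -> None:
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())  # open every connection up front

    @contextmanager
    def borrow(self, timeout: float = 1.0):
        # Raises queue.Empty if every socket stays busy for `timeout` seconds.
        sock = self._idle.get(timeout=timeout)
        try:
            yield sock
        finally:
            self._idle.put(sock)  # always hand the socket back

    def close(self) -> None:
        while not self._idle.empty():
            self._idle.get_nowait().close()


# Demo: each "connection" is one end of a socketpair; the other end plays
# the part of the external logic.
peers = []


def connect() -> socket.socket:
    ours, theirs = socket.socketpair()
    peers.append(theirs)
    return ours


pool = SocketPool(connect, size=2)
with pool.borrow() as sock:
    sock.sendall(b"http://example.com/ 192.0.2.1\n")
print(peers[0].recv(4096))
```

The point is that connection setup happens once, at startup, and each
request only pays for a queue operation plus one round trip.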

Thanks!

Re: Migrating from squid

Posted by Faysal Banna <de...@gmail.com>.
I have been doing this using nothing but Lua: injecting and retrieving data
from MySQL, MongoDB, SQLite, and so on, and controlling the Linux firewall
and traffic shaper. Use os.execute if you just need to dispatch a command
to the shell, or io.popen if you need the result of a query.

So far ATS + Lua has been doing great with a reasonable request load
(5k req/s): no performance penalties, and I have not seen any drawbacks yet.

much regards

-- 
============================
         Faysal Banna
 Meteorological Services
Rafic Harriri International Airport
      Beirut - Lebanon
    Mob: +961-3-258043
=============================

Re: Migrating from squid

Posted by James Peach <jp...@apache.org>.
> On Feb 27, 2015, at 7:50 AM, Leif Hedstrom <zw...@apache.org> wrote:
> 
> Dealing with external squid helpers is a bit wonky, but you probably could implement something in a plugin that does it. The fact that you are doing so much weirdness (MySql, Memcached) makes it particularly tough, I’m not sure how Squid deals with that?
> 
> Dealing with synchronous APIs such as MySQL is tricky to say the least.
> 
> I’m not sure we have a good answer here, other than you probably need to try to implement this as a plugin.

You could do this as a server intercept plugin. You could exec the helper, wire up pipes to its standard I/O, then use TSVConnFdCreate to suck the response back into Traffic Server.

https://docs.trafficserver.apache.org/en/latest/reference/api/TSVConnFdCreate.en.html
https://github.com/apache/trafficserver/blob/master/example/intercept/intercept.cc
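
The process-and-pipes half of that suggestion can be sketched outside
Traffic Server. The Python below only illustrates starting a helper once and
talking to it over its standard input and output; it does not use any ATS
API. The Helper class is a hypothetical name, and the stand-in child simply
upper-cases each line where a real deployment would exec the existing
rewriter.

```python
import subprocess
import sys

# Stand-in helper program: echoes each request line back in upper case.
HELPER_SOURCE = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write(line.upper())\n"
    "    sys.stdout.flush()\n"
)


class Helper:
    """One long-lived child process, spoken to over its stdin/stdout pipes."""

    def __init__(self, argv) -> None:
        self.proc = subprocess.Popen(
            argv,
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line-buffered on the parent side
        )

    def ask(self, request: str) -> str:
        self.proc.stdin.write(request + "\n")
        self.proc.stdin.flush()
        # readline() blocks this thread until the helper replies.
        return self.proc.stdout.readline().rstrip("\n")

    def close(self) -> None:
        self.proc.stdin.close()  # EOF tells the helper to exit
        self.proc.stdout.close()
        self.proc.wait()


helper = Helper([sys.executable, "-c", HELPER_SOURCE])
print(helper.ask("http://example.com/a 192.0.2.1"))
print(helper.ask("http://example.com/b 192.0.2.2"))
helper.close()
```

Note that ask() blocks while it waits. Inside a plugin that wait would have
to be handed to the event system instead, which is presumably where wrapping
the pipe's file descriptor comes in.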

J

Re: Migrating from squid

Posted by Leif Hedstrom <zw...@apache.org>.
Dealing with external squid helpers is a bit wonky, but you probably could implement something in a plugin that does it. The fact that you are doing so much weirdness (MySql, Memcached) makes it particularly tough, I’m not sure how Squid deals with that?

Dealing with synchronous APIs such as MySQL is tricky to say the least.

I’m not sure we have a good answer here, other than you probably need to try to implement this as a plugin. Alternatively, I know some people have looked at ICAP (http://www.rfc-editor.org/rfc/rfc3507.txt), but I don’t know if that helps you at all either. And, we still don’t have that in our code, so whoever has ICAP implemented, please open source it :-).

— Leif


Re: Migrating from squid

Posted by Faysal Banna <de...@gmail.com>.
Sir,
you may want to have a look at https://issues.apache.org/jira/browse/TS-2643
to get an idea.

Regards


Re: Migrating from squid

Posted by Faysal Banna <de...@gmail.com>.
Sir,
the best way to migrate that feature is to use ts_lua. I have been doing
it for a while now and it works perfectly.

It has everything one needs, and you can at any time use os.execute from
inside a Lua script, or you can use

local handle = io.popen(command)
local result = handle:read("*a")
handle:close()

if you need a result. It is asynchronous with respect to the system, so it
won't block other operations in ATS.

It is fast and pretty good to use.

I have lately used it to rate-limit incoming connections from origin
servers and to save bandwidth on certain object downloads.

If you need any assistance, I may be able to help.

much regards

