Posted to users@httpd.apache.org by Marcos Mendez <ma...@gmail.com> on 2009/10/26 16:14:59 UTC

[users@httpd] ad-supported apache proxy

Does anyone have any suggestions about what is the best way to
implement an ad-supported proxy? I've got mod_substitute injecting
some content, but it only seems to work on simple websites. Should I
be looking at redirecting urls to a frame, and putting the ads there?
Is there any other way of doing it?

Regards,

Marcos
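
The framing idea asked about above can be sketched as a small wrapper page: one frame for the ad and one for the requested URL. This is only an illustration in Python, not something posted in the thread. The function name frame_page, the ad URL path, and the 90px ad height are hypothetical.

```python
from html import escape


def frame_page(target_url, ad_url, ad_height=90):
    """Return an HTML page that shows ad_url in a strip above target_url."""
    target = escape(target_url, quote=True)  # make the URL safe inside src="..."
    ad = escape(ad_url, quote=True)
    height = int(ad_height)
    return (
        "<!DOCTYPE html>\n"
        "<html><head><title>Proxied page</title></head>\n"
        '<body style="margin:0">\n'
        f'<iframe src="{ad}" style="width:100%;height:{height}px;border:0"></iframe>\n'
        f'<iframe src="{target}" style="width:100%;height:80vh;border:0"></iframe>\n'
        "</body></html>\n"
    )


if __name__ == "__main__":
    # Hypothetical URLs, for illustration only.
    print(frame_page("http://example.com/?a=1&b=2", "http://myadserver.com/ad.html"))
```

A CGI or handler behind the proxy could return such a page instead of rewriting the proxied body, so the original markup is never touched. Pages that use frame-busting scripts would still break out of the wrapper.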

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
Hi, after playing around and not liking the results, I'm now redirecting
to a CGI script that lets me insert content, and I'm getting my ads
now. I'm still using mod_proxy, but without substitutions.

Thanks for the help!

On Tue, Oct 27, 2009 at 1:07 PM, Marcos Mendez <ma...@gmail.com> wrote:
> On Tue, Oct 27, 2009 at 12:45 PM, Mike Cardwell
> <ap...@lists.grepular.com> wrote:
>> Marcos Mendez wrote:
>>
>>> Hi Mike, when I run google (yahoo, eweek etc) through the proxy (using
>>> mod_substitute) I do not see my modification in the page source, no
>>> matter where I try to insert it.
>>
>> Then why don't you start with something really basic like the following:
>>
>> Substitute "s|(<body[^>]*)>|$1>HELLO WORLD|iq"
>>
>> When you've got that inserting content after the body tag on every page by
>> checking the source, then you can worry about inserting more complicated
>> things than "HELLO WORLD"
>>
>> Disclaimer: I've not used "Substitute" before. I'm assuming the above is
>> sane considering what you originally posted.
>>
>> --
>> Mike Cardwell - IT Consultant and LAMP developer
>> Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
>> Technical Blog: https://secure.grepular.com/blog/
>>
>
> Hi Mike, I already posted an example of my proxy.conf and in fact that
> is what I'm doing now. But it does not work all the time, either
> because maybe I'm missing something with more complex sites, or
> they're doing something fancy that I can't follow.
>
> AddOutputFilterByType SUBSTITUTE text/html
> Substitute "s|<body?([^>]*?)>|<body$1><!-- my content -->|iq"
>
> This is why I'm trying to look for other options, or maybe it's just
> something I'm doing incorrectly. As I said, it does work with simple
> sites, but with google, eweek, yahoo, slashdot, etc. it just doesn't work.
> I've looked into mod_layout (not sure if it'll work through the proxy,
> can't find good documentation on 5.1), and mod_publisher seems like it
> would work, but I get a content encoding error in the browser just by
> enabling it on google, yahoo, etc.
>


Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
On Tue, Oct 27, 2009 at 12:45 PM, Mike Cardwell
<ap...@lists.grepular.com> wrote:
> Marcos Mendez wrote:
>
>> Hi Mike, when I run google (yahoo, eweek etc) through the proxy (using
>> mod_substitute) I do not see my modification in the page source, no
>> matter where I try to insert it.
>
> Then why don't you start with something really basic like the following:
>
> Substitute "s|(<body[^>]*)>|$1>HELLO WORLD|iq"
>
> When you've got that inserting content after the body tag on every page by
> checking the source, then you can worry about inserting more complicated
> things than "HELLO WORLD"
>
> Disclaimer: I've not used "Substitute" before. I'm assuming the above is
> sane considering what you originally posted.
>
> --
> Mike Cardwell - IT Consultant and LAMP developer
> Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
> Technical Blog: https://secure.grepular.com/blog/
>

Hi Mike, I already posted an example of my proxy.conf and in fact that
is what I'm doing now. But it does not work all the time, either
because maybe I'm missing something with more complex sites, or
they're doing something fancy that I can't follow.

AddOutputFilterByType SUBSTITUTE text/html
Substitute "s|<body?([^>]*?)>|<body$1><!-- my content -->|iq"

This is why I'm trying to look for other options, or maybe it's just
something I'm doing incorrectly. As I said, it does work with simple
sites, but with google, eweek, yahoo, slashdot, etc. it just doesn't work.
I've looked into mod_layout (not sure if it'll work through the proxy,
can't find good documentation on 5.1), and mod_publisher seems like it
would work, but I get a content encoding error in the browser just by
enabling it on google, yahoo, etc.


Re: [users@httpd] ad-supported apache proxy

Posted by Mike Cardwell <ap...@lists.grepular.com>.
Marcos Mendez wrote:

> Hi Mike, when I run google (yahoo, eweek etc) through the proxy (using
> mod_substitute) I do not see my modification in the page source, no
> matter where I try to insert it.

Then why don't you start with something really basic like the following:

Substitute "s|(<body[^>]*)>|$1>HELLO WORLD|iq"

When you've got that inserting content after the body tag on every page 
by checking the source, then you can worry about inserting more 
complicated things than "HELLO WORLD"

Disclaimer: I've not used "Substitute" before. I'm assuming the above is 
sane considering what you originally posted.

-- 
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Technical Blog: https://secure.grepular.com/blog/
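
The two patterns quoted in this thread can be compared offline before blaming either one. The sketch below rewrites them for Python's re module, which is an assumption: httpd uses its own regex engine, and only the simple constructs used here are expected to behave the same. The function names are hypothetical.

```python
import re

# The two patterns from the thread, translated to Python syntax ($1 becomes \1).
MIKE = re.compile(r"(<body[^>]*)>", re.IGNORECASE)
MARCOS = re.compile(r"<body?([^>]*?)>", re.IGNORECASE)


def apply_mike(html):
    # Equivalent of s|(<body[^>]*)>|$1>HELLO WORLD|i
    return MIKE.sub(r"\1>HELLO WORLD", html)


def apply_marcos(html):
    # Equivalent of s|<body?([^>]*?)>|<body$1><!-- my content -->|i
    return MARCOS.sub(r"<body\1><!-- my content -->", html)


if __name__ == "__main__":
    page = '<html><BODY class="home"><p>hi</p></BODY></html>'
    print(apply_mike(page))
    print(apply_marcos(page))
    # "body?" makes the final "y" optional, so this pattern also accepts "<bod>".
    print(apply_marcos("<bod>"))
```

On an ordinary single-line body tag both patterns insert their payload right after the tag, which suggests the pattern itself is an unlikely reason for the failures on larger sites.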


Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
Hi Mike, when I run google (yahoo, eweek etc) through the proxy (using
mod_substitute) I do not see my modification in the page source, no
matter where I try to insert it.


Re: [users@httpd] ad-supported apache proxy

Posted by Mike Cardwell <ap...@lists.grepular.com>.
Marcos Mendez wrote:

> Hi, I've tried this approach but it doesn't work reliably. I've tried
> replacing the start or end body, html and even head tags as I
> typically use a script hosted on my adserver to serve the ads. So I
> simply replaced the script with your suggestion. This approach does
> work for simple sites (and my script), but when I use it for eweek.com,
> google.com, or yahoo.com, the ad is nowhere to be seen. Am I missing
> something?
> 
> Here's my proxy.conf
> 
> <IfModule mod_proxy.c>
> 
>         <IfModule mod_ssl.c>
>                 SSLEngine on
>                 SSLProxyEngine on
>         </IfModule>
>         ProxyRequests On
>         NoProxy .myadserver.com myadserver.com 172.16.1.29
>         <Proxy *>
>                 AddOutputFilterByType SUBSTITUTE text/html
>                 Substitute "s|<body?([^>]*?)>|<body$1><div id=\"my_advert_box\" style=\"position:absolute;top:0;right:0;z-index:1000\"><a href=\"#\" onclick=\"document.getElementById('my_advert_box').style.display='none';\">[X]</a>Advert content</div>|iq"
>                 Order allow,deny
>                 Allow from all
>         </Proxy>
>         ProxyVia Off
> </IfModule>

When you go to the page through your proxy and look at the html source 
is the div actually being inserted correctly?

I just downloaded the google front page and manually inserted my div. I 
added a background color, some padding and a big red border. You can see 
it here:

https://secure.grepular.com/adtest/

All I did on that page was insert the div that you can see immediately 
after the opening body tag.

Looks like it displays properly to me. Although, google could 
technically be doing some clever stuff with javascript to remove 
unexpected html elements, but I doubt it...

Also, could you please avoid top-posting.

-- 
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Technical Blog: https://secure.grepular.com/blog/


Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
Had not heard of mod_publisher. I tried it out, and for my simple test
page it seems to work great. However when I visit other sites I'm
getting a content encoding error. I've disabled any content
modification, to test that just passing the pages through would work,
but it doesn't.

So I added a LoadModule line for mod_publisher and then AddOutputFilterByType
markup-publisher text/html. When I try to visit eweek.com, jigsaw.com, etc.,
I get a content encoding error in the browser. Any ideas?
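
The content encoding error reported above is not diagnosed anywhere in this thread. One possibility worth ruling out, and it is only an assumption, is that the larger sites answer with gzip-compressed bodies, in which case a text filter sees compressed bytes rather than markup. The Python sketch below shows the effect on a made-up page; body_tag_visible and the sample page are hypothetical.

```python
import gzip
import re

# Byte-level version of the "find the body tag" pattern discussed in the thread.
BODY = re.compile(rb"<body[^>]*>", re.IGNORECASE)


def body_tag_visible(payload):
    """Return True if a body start tag can be found in the raw bytes."""
    return BODY.search(payload) is not None


# A made-up page, repetitive enough that gzip really compresses it.
page = b"<html><body class='home'>" + b"<p>hello world</p>" * 50 + b"</body></html>"
compressed = gzip.compress(page)

if __name__ == "__main__":
    print("plain:", body_tag_visible(page))
    print("gzip-encoded:", body_tag_visible(compressed))
    print("after decompressing:", body_tag_visible(gzip.decompress(compressed)))
```

If compression does turn out to be the cause, the usual remedies are to strip the Accept-Encoding request header with mod_headers, or to decompress with mod_deflate's INFLATE filter ahead of the rewriting filter.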


Re: [users@httpd] ad-supported apache proxy

Posted by Nick Kew <ni...@webthing.com>.
Marcos Mendez wrote:
> Hi, I've tried this approach but it doesn't work reliably. I've tried
> replacing the start or end body, html and even head tags as I
> typically use a script hosted on my adserver to serve the ads. So I
> simply replaced the script with your suggestion. This approach does
> work for simple sites (and my script), but when I use it for eweek.com,
> google.com, or yahoo.com, the ad is nowhere to be seen. Am I missing
> something?

If you want to transform markup, best to use a markup-aware filter.
mod_publisher does this kind of thing.

The alternative is to invert the task: your server serves a
standard template page, but most of that is an inclusion which
is the page originally requested.  Again, you'd want to hack
the included content (remove the <head> section), which makes
it better-suited to your own or syndicated contents rather than
general proxied contents.

-- 
Nick Kew
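
To illustrate what "markup-aware" buys over a plain regex, here is a minimal sketch using Python's standard html.parser. It is not mod_publisher and says nothing about how that module works. The names BodyFinder and insert_after_body are hypothetical. The parser locates the real body start tag even when it spans lines, and it does not mistake text inside a script element for a tag.

```python
from html.parser import HTMLParser


class BodyFinder(HTMLParser):
    """Remember where the first real <body ...> start tag sits in the input."""

    def __init__(self):
        super().__init__()
        self.found = None  # (line, column, length of the raw start tag)

    def handle_starttag(self, tag, attrs):
        if tag == "body" and self.found is None:
            line, col = self.getpos()  # position of the "<" that opens the tag
            self.found = (line, col, len(self.get_starttag_text()))


def insert_after_body(html, payload):
    """Return html with payload placed right after the body start tag."""
    finder = BodyFinder()
    finder.feed(html)
    finder.close()
    if finder.found is None:
        return html  # no body tag: leave the page untouched
    line, col, length = finder.found
    lines = html.split("\n")
    # Convert the 1-based line number and 0-based column to an absolute index.
    start = sum(len(text) + 1 for text in lines[: line - 1]) + col
    end = start + length
    return html[:end] + payload + html[end:]


if __name__ == "__main__":
    sample = '<html><head><script>var s = "<body>";</script></head>\n<body\n class="x">\n<p>hi</p></body></html>'
    print(insert_after_body(sample, "<!-- my content -->"))
```

A regex such as <body[^>]*> would fire on the string inside the script first, which is one way pattern-based injection can go wrong on script-heavy pages.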


Re: [users@httpd] ad-supported apache proxy

Posted by "Mark H. Wood" <mw...@IUPUI.Edu>.
I wouldn't be surprised if there are teams of sharp developers at
those *advertising-supported* organizations tasked with making it
difficult for you to compete with them.

-- 
Mark H. Wood, Lead System Programmer   mwood@IUPUI.Edu
Friends don't let friends publish revisable-form documents.

Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
Hi, I've tried this approach but it doesn't work reliably. I've tried
replacing the start or end body, html and even head tags as I
typically use a script hosted on my adserver to serve the ads. So I
simply replaced the script with your suggestion. This approach does
work for simple sites (and my script), but when I use it for eweek.com,
google.com, or yahoo.com, the ad is nowhere to be seen. Am I missing
something?

Here's my proxy.conf

<IfModule mod_proxy.c>

        <IfModule mod_ssl.c>
                SSLEngine on
                SSLProxyEngine on
        </IfModule>
        ProxyRequests On
        NoProxy .myadserver.com myadserver.com 172.16.1.29
        <Proxy *>
                AddOutputFilterByType SUBSTITUTE text/html
                Substitute "s|<body?([^>]*?)>|<body$1><div id=\"my_advert_box\" style=\"position:absolute;top:0;right:0;z-index:1000\"><a href=\"#\" onclick=\"document.getElementById('my_advert_box').style.display='none';\">[X]</a>Advert content</div>|iq"
                Order allow,deny
                Allow from all
        </Proxy>
        ProxyVia Off
</IfModule>


On Tue, Oct 27, 2009 at 7:25 AM, Mike Cardwell
<ap...@lists.grepular.com> wrote:
> Marcos Mendez wrote:
>>
>> Thanks for the response. I tried compiling it and ran into issues.
>> However, I am not sure that fundamentally this is the best approach.
>> Replacing or injecting content into the proxied site content will
>> work. I've tried modifying the body, head, html, script tags. It just
>> hasn't worked reliably for me... for example with google.com or
>> yahoo.com. I was wondering if doing an internal redirect, framing the
>> url (so that it renders as normal) and placing my ads outside would be
>> better.
>
> How are you inserting the ads? I would probably approach this by inserting
> the ad immediately after the body tag and setting the position to absolute,
> with a high z-index so it's at the front. I would also add a small icon to
> close the advert in case it causes a problem by hovering over the top of
> important content.
>
> Something like this perhaps (untested)
>
> <body>
>   <div id="my_advert_box" style="position:absolute;top:0;right:0;z-index:1000">
>      <a href="#" onclick="document.getElementById('my_advert_box').style.display='none';">[X]</a>
>      Advert content
>   </div>
>
> --
> Mike Cardwell - IT Consultant and LAMP developer
> Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
> Technical Blog: https://secure.grepular.com/blog/
>


Re: [users@httpd] ad-supported apache proxy

Posted by Mike Cardwell <ap...@lists.grepular.com>.
Marcos Mendez wrote:
> Thanks for the response. I tried compiling it and ran into issues.
> However, I am not sure that fundamentally this is the best approach.
> Replacing or injecting content into the proxied site content will
> work. I've tried modifying the body, head, html, script tags. It just
> hasn't worked reliably for me... for example with google.com or
> yahoo.com. I was wondering if doing an internal redirect, framing the
> url (so that it renders as normal) and placing my ads outside would be
> better.

How are you inserting the ads? I would probably approach this by 
inserting the ad immediately after the body tag and setting the position 
to absolute, with a high z-index so it's at the front. I would also add 
a small icon to close the advert in case it causes a problem by hovering 
over the top of important content.

Something like this perhaps (untested)

<body>
    <div id="my_advert_box" style="position:absolute;top:0;right:0;z-index:1000">
       <a href="#" onclick="document.getElementById('my_advert_box').style.display='none';">[X]</a>
       Advert content
    </div>

-- 
Mike Cardwell - IT Consultant and LAMP developer
Cardwell IT Ltd. (UK Reg'd Company #06920226) http://cardwellit.com/
Technical Blog: https://secure.grepular.com/blog/


Re: [users@httpd] ad-supported apache proxy

Posted by Marcos Mendez <ma...@gmail.com>.
Thanks for the response. I tried compiling it and ran into issues.
However, I am not sure that fundamentally this is the best approach.
Replacing or injecting content into the proxied site content will
work. I've tried modifying the body, head, html, script tags. It just
hasn't worked reliably for me... for example with google.com or
yahoo.com. I was wondering if doing an internal redirect, framing the
url (so that it renders as normal) and placing my ads outside would be
better.

On Mon, Oct 26, 2009 at 2:59 PM, William A. Rowe, Jr.
<wr...@rowe-clan.net> wrote:
> Marcos Mendez wrote:
>> Does anyone have any suggestions about what is the best way to
>> implement an ad-supported proxy? I've got mod_substitute injecting
>> some content, but it only seems to work on simple websites. Should I
>> be looking at redirecting urls to a frame, and putting the ads there?
>> Is there any other way of doing it?
>
> Take a look at mod_sed in trunk/; it's a -full- implementation
> of sed rather than just substitute-lines, so you can do line-oriented
> insert/delete/buffer merge etc etc etc.
>
> mod_substitute is replaced with mod_sed in the next major httpd release
> but you should be able to compile in that module without too much trouble
> into 2.2.
>


Re: [users@httpd] ad-supported apache proxy

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
Marcos Mendez wrote:
> Does anyone have any suggestions about what is the best way to
> implement an ad-supported proxy? I've got mod_substitute injecting
> some content, but it only seems to work on simple websites. Should I
> be looking at redirecting urls to a frame, and putting the ads there?
> Is there any other way of doing it?

Take a look at mod_sed in trunk/; it's a -full- implementation
of sed rather than just substitute-lines, so you can do line-oriented
insert/delete/buffer merge etc etc etc.

mod_substitute is replaced with mod_sed in the next major httpd release
but you should be able to compile in that module without too much trouble
into 2.2.
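
The "substitute-lines" remark hints at one way a per-line substitution can miss a tag: if the body start tag is split across lines, no single line contains the whole match. The Python sketch below models that under the assumption that the pattern is applied to one line at a time. It does not reproduce either module's actual code, and the function names are hypothetical.

```python
import re

BODY = re.compile(r"(<body[^>]*)>", re.IGNORECASE)


def _inject(match, payload):
    # Keep the original tag text and append the payload after its closing ">".
    return match.group(1) + ">" + payload


def substitute_per_line(html, payload):
    """Apply the pattern to each line separately (assumed line-at-a-time model)."""
    return "\n".join(
        BODY.sub(lambda m: _inject(m, payload), line) for line in html.split("\n")
    )


def substitute_whole(html, payload):
    """Apply the pattern to the whole document at once."""
    return BODY.sub(lambda m: _inject(m, payload), html)


if __name__ == "__main__":
    split_tag = '<body\n  class="x">\n<p>hi</p>'
    print(substitute_per_line(split_tag, "HELLO WORLD"))
    print(substitute_whole(split_tag, "HELLO WORLD"))
```

With a multi-line tag the per-line version leaves the page unchanged, while matching against the merged text succeeds, which is the kind of case where being able to merge lines into a buffer would help.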
