Posted to users@httpd.apache.org by Brian Kim <09...@gmail.com> on 2009/08/25 23:54:40 UTC

[users@httpd] How to distinguish the first web page?

Hi.

I am currently using the mod_proxy_http module with Apache httpd.
I would like to know how to identify the very first page (text/html type)
among a series of returned pages.

For example, the following is the HTML of a site, www.foo.com. It contains
two iframes.

<html>
<iframe src="www.foo1.com".... >
............
<iframe src="www.foo2.com".... >

</html>

We get the HTML of www.foo.com, then the HTML of www.foo1.com, and then the
HTML of www.foo2.com, in this order.
All of these pages have the text/html type that I am looking for.
My proxy should modify only the very first page, which is the HTML
of www.foo.com in the above example.
Is there any way to distinguish the main page from the other pages
that are requested by the main page?

Thanks.

---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] How to distinguish the first web page?

Posted by Matus UHLAR - fantomas <uh...@fantomas.sk>.
On 26.08.09 09:40, Brian Kim wrote:
> I have used the Referer field of the HTTP header. The problem is as follows.
> I also need to keep track of where users go.

need or want?

> In other words, if I only use the Referer field of the HTTP header, I cannot
> distinguish an iframe request from the case where a user clicks one of the
> hyperlinks.

When a user enters your site directly from the browser's URL line or from a
bookmark, no Referer is set. If the user arrives by clicking a link on another
page, Referer is set to that page, unless the user has a browser plugin or
proxy that prevents the Referer from being sent (yes, that happens).

That is nearly everything you can get at this level. You can also set some
cookies and track them, but disabling cookies (for your site) is even easier
than disabling the Referer.
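
As an illustration of the cookie idea only, here is a minimal Python sketch
using the standard library. It is not an Apache or mod_proxy_http API; the
cookie name "visitor_id" and the function name are hypothetical, and a browser
that refuses cookies will simply receive a new ID on every request.

```python
import uuid
from http.cookies import SimpleCookie

COOKIE_NAME = "visitor_id"  # hypothetical name, chosen for this sketch


def ensure_visitor_id(cookie_header):
    """Return (visitor_id, set_cookie_value).

    set_cookie_value is None when the browser already sent an ID,
    otherwise it is the value to place in a Set-Cookie response header.
    """
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if COOKIE_NAME in jar:
        # The browser returned our cookie, so reuse the existing ID.
        return jar[COOKIE_NAME].value, None

    # No cookie yet (first visit, or cookies disabled): issue a fresh ID.
    visitor_id = uuid.uuid4().hex
    fresh = SimpleCookie()
    fresh[COOKIE_NAME] = visitor_id
    fresh[COOKIE_NAME]["path"] = "/"
    return visitor_id, fresh[COOKIE_NAME].OutputString()
```

The ID only groups requests by browser. It still does not say which request
was the main page and which came from an iframe.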

> For this, I have used a time check, which may be weak.
> I am looking for a better way than this Referer and time check.
>
> Is there a concept of levels in Apache? For me, the main page is the top
> level, and the iframe links of the main page are at a lower level.
> Or is there a concept of an ID for each page? I mean that the
> main page and the iframe links from the main page seem to belong to
> the same page, that is, the main page. If they shared a globally unique
> ID identifying the packets for that page, it would be helpful.
>
> These two are imaginary features that I would like from Apache. Is there
> something like that in Apache?

There is nothing like that in the whole HTTP protocol, and therefore nothing
like that in Apache. HTTP is designed for transferring HTTP objects, and it
does not support sessions, pages, etc. It is the browser that requests content
and may provide information such as the Referer, cookies, and User-Agent to the
server.

Think carefully about what you expect from your users and their browsers:
by implementing useless requirements you can even lose them.
-- 
Matus UHLAR - fantomas, uhlar@fantomas.sk ; http://www.fantomas.sk/
Warning: I wish NOT to receive e-mail advertising to this address.
Varovanie: na tuto adresu chcem NEDOSTAVAT akukolvek reklamnu postu.
Windows 2000: 640 MB ought to be enough for anybody



Re: [users@httpd] How to distinguish the first web page?

Posted by Brian Kim <09...@gmail.com>.
Thanks, Krist van Besien.

I have used the Referer field of the HTTP header. The problem is as follows.
I also need to keep track of where users go. In other words,
if I only use the Referer field of the HTTP header,
I cannot distinguish an iframe request from the case where a user clicks one
of the hyperlinks. For this, I have used a time check, which may be weak.
I am looking for a better way than this Referer and time check.
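
For readers following along, the Referer and time check described above might
look roughly like the Python sketch below. This is a guess at the heuristic,
not the code actually in use: the class name, the two-second window, and the
"main"/"embedded" labels are all assumptions made for illustration.

```python
class MainPageTracker:
    """Heuristic: a request whose Referer is a page that was served to the
    same client within a short window is treated as embedded (e.g. an iframe);
    anything else is treated as a main page."""

    def __init__(self, window_seconds=2.0):
        self.window = window_seconds  # arbitrary threshold (assumption)
        self._served = {}             # (client, url) -> time it was served

    def classify(self, client, url, referer, now):
        # Look up when the referring page was last served as a main page.
        last = self._served.get((client, referer)) if referer else None
        if last is not None and now - last <= self.window:
            return "embedded"
        # Otherwise record this URL as a main page for later lookups.
        self._served[(client, url)] = now
        return "main"


tracker = MainPageTracker(window_seconds=2.0)
print(tracker.classify("10.0.0.1", "http://www.foo.com/", None, 0.0))
print(tracker.classify("10.0.0.1", "http://www.foo1.com/", "http://www.foo.com/", 0.5))
print(tracker.classify("10.0.0.1", "http://www.foo.com/next", "http://www.foo.com/", 10.0))
```

The weakness is visible in the sketch: a very fast click is misread as an
iframe, and a slow-loading iframe is misread as a new main page.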

Is there a concept of levels in Apache? For me, the main page is the top
level, and the iframe links of the main page are at a lower level.
Or is there a concept of an ID for each page? I mean that the
main page and the iframe links from the main page seem to belong to
the same page, that is, the main page. If they shared a globally unique
ID identifying the packets for that page, it would be helpful.

These two are imaginary features that I would like from Apache. Is there
something like that in Apache?

Or any other suggestions?







Re: [users@httpd] How to distinguish the first web page?

Posted by Krist van Besien <kr...@gmail.com>.
On Tue, Aug 25, 2009 at 11:54 PM, Brian Kim<09...@gmail.com> wrote:
> Hi.
>
> I am currently using the mod_proxy_http module with Apache httpd.
> I would like to know how to identify the very first page (text/html type)
> among a series of returned pages.
>
> For example, the following is the HTML of a site, www.foo.com. It contains
> two iframes.
>
> <html>
> <iframe src="www.foo1.com".... >
> ............
> <iframe src="www.foo2.com".... >
>
> </html>
>
> We get the HTML of www.foo.com, then the HTML of www.foo1.com, and then the
> HTML of www.foo2.com, in this order.
> All of these pages have the text/html type that I am looking for.
> My proxy should modify only the very first page, which is the HTML
> of www.foo.com in the above example.
> Is there any way to distinguish the main page from the other pages
> that are requested by the main page?

You can look at the "Referer" request header. It contains the URL of the
page on which the currently requested URL was found.
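
A minimal Python sketch of that check, assuming the request headers are
available as a plain dict (the function name and the "direct"/"referred"
labels are made up for this example, not part of Apache):

```python
from urllib.parse import urlsplit


def classify_by_referer(headers):
    """Return ("direct", None) when no Referer was sent,
    otherwise ("referred", host of the referring page)."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    referer = lowered.get("referer", "").strip()
    if not referer:
        return ("direct", None)
    return ("referred", urlsplit(referer).hostname)


print(classify_by_referer({}))
print(classify_by_referer({"Referer": "http://www.foo.com/index.html"}))
```

Note that the header name is spelled "Referer" in HTTP, and that it is set
both for iframe requests and for ordinary link clicks.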

Krist


-- 
krist.vanbesien@gmail.com
krist@vanbesien.org
Bremgarten b. Bern, Switzerland
--
A: It reverses the normal flow of conversation.
Q: What's wrong with top-posting?
A: Top-posting.
Q: What's the biggest scourge on plain text email discussions?
