Posted to user@velocity.apache.org by johne <je...@yahoo.com> on 2008/06/24 04:08:26 UTC

Not understanding something

I have been using Velocity for a long time.  I am seeing something odd.  I
was trying to remove jsession ids from non-user accounts (bots) by removing
the URL rewriting in a Tomcat 6.0 filter.  I then just had this filter print
out debug information for bots coming in and the URL they see.  I see
something to the effect of this in my debug log:

INFO  [080620 23:47:13] Spider User Agent (msnbot/1.1
(+http://search.msn.com/msnbot.htm)) looking at page:
http://mydomain.com/action/pub/my-listing
INFO  [080620 23:47:13] Spider User Agent (msnbot/1.1
(+http://search.msn.com/msnbot.htm)) looking at page:
http://mydomain.com/WEB-INF/templates/common/myMainTemplate.vm


I have only one debug statement, so I was expecting to see just the first
message.  The second message is very unexpected.  It is the template that
was being used to render the page the bot was looking at.

Why would this message show up if using Velocity along with Tiles/Struts?

I even have access control blocking this template directory, so this is
very weird.  I must be missing something.

All the pages look great, I just would not want bots to pick up on things
they should not see.

Regards,

John


-----
JohnE

http://www.jobbank.com jobbank.com 
-- 
View this message in context: http://www.nabble.com/Not-understanding-something-tp18082309p18082309.html
Sent from the Velocity - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@velocity.apache.org
For additional commands, e-mail: user-help@velocity.apache.org


Re: Not understanding something

Posted by johne <je...@yahoo.com>.
That was a great answer.  I learned a lot.  Thank you.



Re: Not understanding something

Posted by Christopher Schultz <ch...@christopherschultz.net>.
John,

johne wrote:
> I had recently upgraded to Tomcat 6 and am using the following on each of my
> filter mappings (I have several chained filters).  It is the first time I
> had chosen to use this, so maybe I am doing something wrong.
> 
>     <filter-mapping>
>         <filter-name>User Agent Filter</filter-name>
>         <url-pattern>/*</url-pattern>
>         <dispatcher>REQUEST</dispatcher>
>         <dispatcher>FORWARD</dispatcher>
>     </filter-mapping>

This configures the filter to run on every request that is either the 
original request (REQUEST) or a forwarded request (FORWARD) from 
RequestDispatcher.forward(). That means it will /not/ run when 
RequestDispatcher.include() is used, for example.
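
As a rough illustration of that rule, the mapping can be modelled as a set
of dispatch types, with REQUEST as the default when no <dispatcher> element
is present. This is a toy model and not container code; the class and method
names are made up for the sketch.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.Set;

public class FilterMappingModel {
    /** Dispatch types a <dispatcher> element may name in Servlet 2.4/2.5. */
    enum Dispatch { REQUEST, FORWARD, INCLUDE, ERROR }

    /** Builds the configured set; no <dispatcher> elements means REQUEST only. */
    static Set<Dispatch> configured(Dispatch... dispatchers) {
        if (dispatchers.length == 0) {
            return EnumSet.of(Dispatch.REQUEST);
        }
        return EnumSet.copyOf(Arrays.asList(dispatchers));
    }

    /** True if a filter with this mapping runs for the given kind of dispatch. */
    static boolean filterRuns(Set<Dispatch> mapping, Dispatch current) {
        return mapping.contains(current);
    }

    public static void main(String[] args) {
        // The mapping quoted above: REQUEST and FORWARD.
        Set<Dispatch> mapping = configured(Dispatch.REQUEST, Dispatch.FORWARD);
        for (Dispatch d : Dispatch.values()) {
            System.out.println(d + " -> " + filterRuns(mapping, d));
        }
    }
}
```

With the quoted mapping, the model says the filter runs for REQUEST and
FORWARD and is skipped for INCLUDE and ERROR.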

> Would this allow the search engine to see the *.vm somehow?

I'm not sure what you mean by "see the *.vm". Do you mean "can a crawler 
request the .vm file directly"? That depends on your server 
configuration. Can /you/ request the .vm directly? If so, then the 
crawler can, too (although it probably won't, given that it probably 
doesn't know the paths to any of the templates).

I suspect that when RequestDispatcher.forward() is called, the original 
request is used (plus a few bits of info -- I'll get to those later) and 
the same headers are available -- such as the User-Agent. In this case, 
your User Agent Filter intercepts the request and logs a request for 
that resource.

During a forward, the container should set the following request attributes:

javax.servlet.forward.request_uri
javax.servlet.forward.context_path
javax.servlet.forward.servlet_path
javax.servlet.forward.path_info
javax.servlet.forward.query_string

These attributes contain the original values for each attribute (for 
instance, /foo forwarding to /bar would have "/foo" as the value for 
javax.servlet.forward.request_uri). See section 8.4.2 of the servlet 
specification (and section 8.3.1 for includes) for more details.

You should have your filter check to see if those values match the 
request you /think/ you are processing. I suspect that the User Agent 
here is not actually requesting those resources... it's a forward or an 
include that is being handled by your filter as if it were the original 
request.
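
A minimal sketch of such a check follows. It assumes the filter passes in an
attribute lookup such as request::getAttribute; a plain Function stands in
for the request so the snippet runs without the servlet API. The class and
method names are hypothetical. The two attribute names come from the servlet
specification.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class DispatchCheck {
    // Set by the container on the target of RequestDispatcher.forward().
    static final String FORWARD_URI = "javax.servlet.forward.request_uri";
    // Set by the container on the target of RequestDispatcher.include().
    static final String INCLUDE_URI = "javax.servlet.include.request_uri";

    /** True if the current invocation is a forward or an include. */
    static boolean isInternalDispatch(Function<String, Object> attrs) {
        return attrs.apply(FORWARD_URI) != null || attrs.apply(INCLUDE_URI) != null;
    }

    /**
     * Returns the URI the client asked for. During a forward the current
     * request URI is the forward target, so the original comes from the
     * forward attribute; otherwise the current URI is already the original.
     */
    static String originalUri(Function<String, Object> attrs, String currentUri) {
        Object forwardedFrom = attrs.apply(FORWARD_URI);
        return forwardedFrom != null ? forwardedFrom.toString() : currentUri;
    }

    public static void main(String[] args) {
        // Simulated attributes for the forwarded template request in the log.
        Map<String, Object> forwarded = new HashMap<>();
        forwarded.put(FORWARD_URI, "/action/pub/my-listing");
        String current = "/WEB-INF/templates/common/myMainTemplate.vm";

        if (isInternalDispatch(forwarded::get)) {
            System.out.println("internal dispatch; client asked for "
                    + originalUri(forwarded::get, current));
        } else {
            System.out.println("client request for " + current);
        }
    }
}
```

A logging filter built along these lines could skip internal dispatches, or
log the original URI in place of the template path.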

-chris


Re: Not understanding something

Posted by johne <je...@yahoo.com>.
Maybe I have missed something.

I had recently upgraded to Tomcat 6 and am using the following on each of my
filter mappings (I have several chained filters).  It is the first time I
had chosen to use this, so maybe I am doing something wrong.

    <filter-mapping>
        <filter-name>User Agent Filter</filter-name>
        <url-pattern>/*</url-pattern>
        <dispatcher>REQUEST</dispatcher>
        <dispatcher>FORWARD</dispatcher>
    </filter-mapping>

Would this allow the search engine to see the *.vm somehow?



Re: Not understanding something

Posted by Christopher Schultz <ch...@christopherschultz.net>.
John,

johne wrote:
> 
> Thanks for the response Nathan.
> 
> I am using TilesTool.  I didn't realize that is how it worked.  It is odd
> that it does show, in my filter, that the spider seems to be requesting the
> *.vm file as the second part of the request.  I guess it just sees the
> initial request URL.  I had to do a double take, though.  I thought all of
> my internal code was exposed.

How is your filter configured, and what version of the servlet spec are 
you running under? I think 2.4 (or 2.5?) introduced the ability to run 
filters on a larger subset of the types of requests that are possible 
(normal, include, forward, etc.).

Perhaps you just want to alter the configuration of your filter so that 
it only reports on the original request, and not internal, loopback 
requests made by the server back to itself.

-chris



Re: Not understanding something

Posted by johne <je...@yahoo.com>.

Thanks for the response Nathan.

I am using TilesTool.  I didn't realize that is how it worked.  It is odd
that it does show, in my filter, that the spider seems to be requesting the
*.vm file as the second part of the request.  I guess it just sees the
initial request URL.  I had to do a double take, though.  I thought all of
my internal code was exposed.




Re: Not understanding something

Posted by Nathan Bubna <nb...@gmail.com>.
On Mon, Jun 23, 2008 at 7:37 PM, Nathan Bubna <nb...@gmail.com> wrote:
> Are you using ImportTool or TilesTool?  These do internal request
> forwarding.  I'm no expert on all that happens when servlets do
> request forwarding (see the ImportSupport class they both extend for
> implementation details), so i don't know offhand if these would be
> visible to the bot or if they just show up in Tomcat's logs.  My guess
> is that the bot wouldn't see these, as they are forwards, not
> redirects.

Excuse me, I believe the term is "includes", not "forwards".  It's
done using the javax.servlet.RequestDispatcher interface's include(request,
response) method, where the response is wrapped to grab the content.
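
The general idea of wrapping the response to grab the content can be
sketched as below. This is not the ImportSupport source: the Response and
Includable interfaces are hypothetical stand-ins for HttpServletResponse and
RequestDispatcher.include(), used so the snippet runs without the servlet
API.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class CaptureSketch {
    /** Hypothetical stand-in for the part of HttpServletResponse used here. */
    interface Response {
        PrintWriter getWriter();
    }

    /** Hypothetical stand-in for a resource reached via an include. */
    interface Includable {
        void include(Response response);
    }

    /** Gives the included resource a writer backed by an in-memory buffer. */
    static class CapturingResponse implements Response {
        private final StringWriter buffer = new StringWriter();
        private final PrintWriter writer = new PrintWriter(buffer);

        @Override
        public PrintWriter getWriter() {
            return writer;
        }

        /** Flushes the writer and returns everything written so far. */
        String captured() {
            writer.flush();
            return buffer.toString();
        }
    }

    /** Runs the include against the wrapper and returns the output as a String. */
    static String importContent(Includable resource) {
        CapturingResponse wrapped = new CapturingResponse();
        resource.include(wrapped);
        return wrapped.captured();
    }

    public static void main(String[] args) {
        String html = importContent(r -> r.getWriter().print("<p>tile body</p>"));
        System.out.println(html);
    }
}
```

Because the included output lands in the buffer and is handed back to the
template, the client only ever receives the single response to its original
request.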



Re: Not understanding something

Posted by Nathan Bubna <nb...@gmail.com>.
Are you using ImportTool or TilesTool?  These do internal request
forwarding.  I'm no expert on all that happens when servlets do
request forwarding (see the ImportSupport class they both extend for
implementation details), so i don't know offhand if these would be
visible to the bot or if they just show up in Tomcat's logs.  My guess
is that the bot wouldn't see these, as they are forwards, not
redirects.
