Posted to users@tapestry.apache.org by sunilmanu <su...@gmail.com> on 2007/04/12 20:59:54 UTC

Incomplete URL requests resulting in Exceptions

Hello Everyone,

We have deployed our Tapestry application to production. Since the
deployment we have been getting exceptions whenever search engine spiders
access our domains with incomplete or invalid URLs.
For example, below is a valid URL for one of the public pages of our
website. If the last parameter is chopped off, completely or even
partially, we get exceptions.
http://www.testsite.com/external.svc;jsessionid=DD4CD6B63F6C5D7C8DCEE358701C2F48?page=ArticleListPage&sp=l3

We also tried adding a robots.txt to the application, but the
requests keep coming in and generating exceptions.

What configuration are we missing here? Does anyone have any hints or pointers?

Thanks,
Sunil M
-- 
View this message in context: http://www.nabble.com/Incomplete-URL-requests-resulting-in-Excceptions-tf3567555.html#a9966053
Sent from the Tapestry - User mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tapestry.apache.org
For additional commands, e-mail: users-help@tapestry.apache.org


Re: Incomplete URL requests resulting in Exceptions

Posted by Yann Ramin <at...@stackworks.net>.
Not all of them are going to be well-behaved web spiders. Exploit
scanners tend to hit specific URL suffixes to feed in their exploit code
(looking for vulnerable phpBB, PHP-Nuke, etc.), and they don't respond to
robots.txt ;)

It clogs up traditional Apache error logs as well.

I would suggest simply filtering the error emails.
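
One way to filter before the mail is sent, as a minimal plain-Java sketch
rather than anything Tapestry provides. The class name ErrorMailFilter, its
methods, and both marker lists are hypothetical, and matching on the simple
name PageNotFoundException (the exception mentioned in this thread) is an
assumption about what shows up in your logs.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

/** Decides whether an exception report is worth an email. Hypothetical sketch. */
final class ErrorMailFilter {

    // Substrings of User-Agent headers treated as crawlers (illustrative, incomplete).
    private static final List<String> BOT_MARKERS =
            Arrays.asList("bot", "spider", "crawler", "slurp");

    // Simple class names treated as noise caused by bad URLs (assumption; adjust to your logs).
    private static final List<String> NOISY_EXCEPTIONS =
            Arrays.asList("PageNotFoundException");

    /** Returns true if the User-Agent is missing or contains a known crawler marker. */
    static boolean looksLikeBot(String userAgent) {
        if (userAgent == null || userAgent.trim().isEmpty()) {
            return true; // assumption: a missing User-Agent is treated as automated traffic
        }
        String ua = userAgent.toLowerCase(Locale.ROOT);
        return BOT_MARKERS.stream().anyMatch(ua::contains);
    }

    /** Returns true only for errors that are neither a noisy type nor caused by a bot. */
    static boolean shouldSendMail(Throwable error, String userAgent) {
        boolean noisyType = NOISY_EXCEPTIONS.contains(error.getClass().getSimpleName());
        return !(noisyType || looksLikeBot(userAgent));
    }
}
```

The exception handler would call
ErrorMailFilter.shouldSendMail(error, request.getHeader("User-Agent")) and
still write every exception to the log, so a real bug that happens to be
triggered by a crawler is not lost.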

jake123 wrote:
> Hi,
> we have a similar problem... We host approximately 300 websites that
> use our Tapestry application, in which all the content is read from the
> database and built up on the fly. We also get a lot of 'ghost' exceptions
> when search engine spiders and robots try to access our application. Our
> application sends us an error email every time an exception occurs, which
> means roughly 100 or more emails a day.
>
> I also noticed that we get a lot of pageNotFoundException errors for page
> names that do not exist in our application namespace... Is this normal?
>
> How do you prevent the search engines from doing this?
>
> Thanks in advance for any help,
> Jacob
> 




Re: Incomplete URL requests resulting in Exceptions

Posted by jake123 <ja...@gmail.com>.
So this is happening to you too?
If this is normal behavior for search engines then I guess we can
filter the results, but we only want to do that if it really is
normal... otherwise we want to fix it the 'right' way.

Thanks again,
Jacob
-- 
View this message in context: http://www.nabble.com/Incomplete-URL-requests-resulting-in-Excceptions-tf3567555.html#a9986177
Sent from the Tapestry - User mailing list archive at Nabble.com.




Re: Incomplete URL requests resulting in Exceptions

Posted by Jesse Kuhnert <jk...@gmail.com>.
I'm not sure about all of the methods for preventing search engine
spidering on certain parts of your app, but I have built a lot of
application (and other) monitoring software in the past, and you almost
always want these kinds of alerts to go through some kind of intelligent
filter that prevents 60 emails going out about the exact same non-issue.
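
A rough sketch of the kind of filter described above, assuming that "the
same issue" can be identified by exception class plus top stack frame.
AlertThrottle and its methods are hypothetical names, not part of Tapestry
or any monitoring library.

```java
import java.util.HashMap;
import java.util.Map;

/** Suppresses repeat alerts for the same problem within a time window. Hypothetical sketch. */
final class AlertThrottle {

    private final long windowMillis;
    // Signature of an issue -> time (ms) at which an alert for it was last sent.
    private final Map<String, Long> lastSent = new HashMap<>();

    AlertThrottle(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    /** Builds a key for "the same issue": exception class name plus top stack frame. */
    static String signature(Throwable error) {
        StackTraceElement[] frames = error.getStackTrace();
        String where = frames.length > 0 ? frames[0].toString() : "unknown";
        return error.getClass().getName() + "@" + where;
    }

    /** Returns true if an alert for this signature should go out at time nowMillis. */
    synchronized boolean shouldAlert(String signature, long nowMillis) {
        Long previous = lastSent.get(signature);
        if (previous != null && nowMillis - previous < windowMillis) {
            return false; // the same issue was already reported inside the window
        }
        lastSent.put(signature, nowMillis);
        return true;
    }
}
```

With a window of one hour, new AlertThrottle(3_600_000L) lets the first
report of each signature through and drops repeats until the hour has
passed. The map grows with the number of distinct signatures, so a
long-running application would want to evict old entries.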

On 4/13/07, jake123 <ja...@gmail.com> wrote:
>
>
> Hi,
> we have a similar problem... We host approximately 300 websites that
> use our Tapestry application, in which all the content is read from the
> database and built up on the fly. We also get a lot of 'ghost' exceptions
> when search engine spiders and robots try to access our application. Our
> application sends us an error email every time an exception occurs, which
> means roughly 100 or more emails a day.
>
> I also noticed that we get a lot of pageNotFoundException errors for page
> names that do not exist in our application namespace... Is this normal?
>
> How do you prevent the search engines from doing this?
>
> Thanks in advance for any help,
> Jacob


-- 
Jesse Kuhnert
Tapestry/Dojo team member/developer

Open source based consulting work centered around
dojo/tapestry/tacos/hivemind. http://blog.opencomponentry.com

Re: Incomplete URL requests resulting in Exceptions

Posted by jake123 <ja...@gmail.com>.
Hi,
we have a similar problem... We host approximately 300 websites that
use our Tapestry application, in which all the content is read from the
database and built up on the fly. We also get a lot of 'ghost' exceptions
when search engine spiders and robots try to access our application. Our
application sends us an error email every time an exception occurs, which
means roughly 100 or more emails a day.

I also noticed that we get a lot of pageNotFoundException errors for page
names that do not exist in our application namespace... Is this normal?

How do you prevent the search engines from doing this?

Thanks in advance for any help,
Jacob

-- 
View this message in context: http://www.nabble.com/Incomplete-URL-requests-resulting-in-Excceptions-tf3567555.html#a9984430
Sent from the Tapestry - User mailing list archive at Nabble.com.

