Posted to users@httpd.apache.org by Charles Michener <mi...@yahoo.com> on 2007/12/15 21:57:17 UTC

[users@httpd] How to rid a pest?

I have a couple of spider bots hitting my server that I do not wish to have access to my pages - they ignore robots.txt, so I finally put them on my 'deny from xxxxx' list. This does deny them access, but they persist in trying, requesting each page address at least 30 times at several hits per second. Is there a standard method to forward them to some black hole, or the FBI, or ...?

Charles

       
---------------------------------
Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.
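The 'deny from' list described above would look something like this in httpd.conf, in Apache 2.2 syntax. The directory path and the addresses are placeholders, not values from the thread:

```apache
# Block the misbehaving spiders at the Apache level.
# Apache still accepts each connection and answers 403 Forbidden,
# which is why the bots keep showing up in the access log.
<Directory "/var/www/html">
    Order Allow,Deny
    Allow from all
    Deny from 203.0.113.50
    Deny from 198.51.100.0/24
</Directory>
```

Because the connection is still accepted and answered, a persistent client sees only a 403 and is free to try again immediately.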

Re: [users@httpd] How to rid a pest?

Posted by Dragon <dr...@crimson-dragon.com>.
Charles Michener wrote:
>I have a couple of spider bots hitting my server that I do not wish 
>to have access to my pages - they ignore robots.txt, so I finally 
>put them on my 'deny from xxxxx' list. This does deny them access 
>but they persist to keep trying - trying each page address at least 
>30 times - several hits per second .  Is there a standard method to 
>forward them to some black hole or the FBI or ...?
---------------- End original message. ---------------------

This is the kind of thing a router/firewall will handle for you.

Stopping these requests before they reach your machine is the best 
way to handle them. Otherwise, sending a forbidden response back to 
the offenders doesn't really have much impact on the server's 
performance. Yeah, it takes a little bit of processing, but it is 
pretty insignificant per request.

Hopefully they will eventually give up but if they don't, look into 
using a firewall to deny at the edge of your network.
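A host-level firewall drops the traffic before httpd ever sees it. With iptables on Linux, for example, a single rule per offender is enough (the address is a placeholder):

```
# Drop all packets from the offending address; the bot's connection
# attempts simply time out and never reach Apache.
iptables -A INPUT -s 203.0.113.50 -j DROP
```

A dedicated firewall or router in front of the server achieves the same thing for the whole network, which is what "deny at the edge" means here.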

Dragon

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  Venimus, Saltavimus, Bibimus (et naribus canium capti sumus)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


---------------------------------------------------------------------
The official User-To-User support forum of the Apache HTTP Server Project.
See <URL:http://httpd.apache.org/userslist.html> for more info.
To unsubscribe, e-mail: users-unsubscribe@httpd.apache.org
   "   from the digest: users-digest-unsubscribe@httpd.apache.org
For additional commands, e-mail: users-help@httpd.apache.org


Re: [users@httpd] How to rid a pest?

Posted by Christian Folini <ch...@post.ch>.
You are stopping them inside Apache now. The next obvious step is a
firewall, either on the server itself or on a dedicated box in front
of it.

regs,

Christian

On Sat, Dec 15, 2007 at 12:57:17PM -0800, Charles Michener wrote:
> I have a couple of spider bots hitting my server that I do not wish to have access to my pages - they ignore robots.txt, so I finally put them on my 'deny from xxxxx' list. This does deny them access but they persist to keep trying - trying each page address at least 30 times - several hits per second .  Is there a standard method to forward them to some black hole or the FBI or ...?
> 
> Charles
> 
>        
> ---------------------------------
> Be a better friend, newshound, and know-it-all with Yahoo! Mobile.  Try it now.



Re: [users@httpd] How to rid a pest?

Posted by "S.A. Birl" <sb...@temple.edu>.
On Dec 15, 2007, Charles Michener (nospam-micheck123@yahoo.com.ns) typed:

Charles:  I have a couple of spider bots hitting my server that I do
Charles:  not wish to have access to my pages - they ignore
Charles:  robots.txt, so I finally put them on my 'deny from xxxxx'
Charles:  list. This does deny them access but they persist to keep
Charles:  trying - trying each page address at least 30 times -
Charles:  several hits per second .  Is there a standard method to
Charles:  forward them to some black hole or the FBI or ...?



I've been through that.  I just Deny them and eventually learn to
ignore the log entries.
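If the log entries get too noisy to ignore, a short script can summarize which addresses keep hammering the Deny rule. This is only a sketch: the log lines, addresses, and threshold below are made up, and it assumes the Common Log Format that Apache writes by default:

```python
import re
from collections import Counter

# Hypothetical access-log excerpt in Common Log Format; the IPs and
# paths are placeholders for illustration.
LOG_LINES = [
    '203.0.113.50 - - [15/Dec/2007:12:57:17 -0800] "GET /page1 HTTP/1.0" 403 214',
    '203.0.113.50 - - [15/Dec/2007:12:57:18 -0800] "GET /page1 HTTP/1.0" 403 214',
    '203.0.113.50 - - [15/Dec/2007:12:57:18 -0800] "GET /page2 HTTP/1.0" 403 214',
    '198.51.100.7 - - [15/Dec/2007:13:01:02 -0800] "GET /index.html HTTP/1.0" 200 1043',
]

# client IP ... [timestamp] "request" status
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (\d{3}) ')

def denied_counts(lines):
    """Count 403 (Forbidden) responses per client IP."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(2) == "403":
            counts[m.group(1)] += 1
    return counts

def repeat_offenders(lines, threshold=3):
    """Return IPs denied at least `threshold` times, sorted."""
    return sorted(ip for ip, n in denied_counts(lines).items() if n >= threshold)

if __name__ == "__main__":
    print(repeat_offenders(LOG_LINES))  # -> ['203.0.113.50']
```

The resulting list of addresses is exactly what you would feed to a firewall rule or a hosts.deny entry.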

You could wrap httpd with TCP Wrappers or the like.  Or, if you have
control over the network traffic, drop it at the router level.
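For completeness, a TCP Wrappers rule lives in /etc/hosts.deny and would look like the fragment below. Note the caveat: stock Apache builds are generally not linked against libwrap, so this only applies if your httpd (or a wrapper in front of it) actually consults hosts.deny; the daemon name and address are placeholders:

```
# /etc/hosts.deny -- refuse connections from the offending address
# before the request is processed (placeholder address).
httpd: 203.0.113.50
```

In practice a firewall rule is the more reliable of the two options.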

I seriously doubt the authorities will get involved; it's not like
the spiders are cracking you.  It sounds like they might be
misconfigured if they ignore robots.txt.

Hope that helps.


Thanks
 Birl

Please do not CC me responses to my own posts.
I'll read the responses on the list.

Archives   http://mail-archives.apache.org/mod_mbox/httpd-users/
