Posted to dev@nutch.apache.org by Massimo Miccoli <mm...@iltrovatore.it> on 2005/09/06 16:31:58 UTC

Help for regex

Hi,

I need some help skipping host names like:
http://new-orleans-gay-men-bondage.xx.toplesss.info

Or

http://www.new-orleans.super-hotels.net/

Can someone help me write a regex that removes hosts (not the URL path) with
multiple hyphens?

Thanks,

Massimo






Re: Help for regex

Posted by Fredrik Andersson <fi...@gmail.com>.
Hello Massimo.

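".*-.*-.*-.*" would match anything with three or more dashes in it, for
instance. A neater way would be to use something like
".*(-.*){a,b}",
which will match anything with at least a dashes. Note that the upper bound b
is not actually enforced, because ".*" can itself match dashes; when matching
the whole string, write "[^-]*" in place of ".*" to get
a <= number of dashes <= b.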

Fredrik
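
The patterns above count dashes anywhere in the URL, while the question asks
about the host part only. Below is a minimal sketch in Python of one way to
restrict the check to the host. The names MAX_HYPHENS, MULTI_HYPHEN,
URL_PATTERN and has_multi_hyphen_host are made up for illustration, and the
threshold of 2 is an assumption chosen so that both example URLs are rejected.

```python
import re
from urllib.parse import urlsplit

# Hypothetical threshold: hosts with this many hyphens or more are rejected.
MAX_HYPHENS = 2

# "[^-]*" cannot consume a hyphen, so the {N,} quantifier really counts them.
MULTI_HYPHEN = re.compile(r"[^-]*(?:-[^-]*){%d,}" % MAX_HYPHENS)

# Single-pattern variant for the whole URL: "[^/]*" stops at the first "/"
# after the scheme, so hyphens in the path are never counted.
URL_PATTERN = re.compile(r"^[a-z]+://[^/]*-[^/]*-")


def has_multi_hyphen_host(url):
    """Return True if the host part of url has MAX_HYPHENS or more hyphens."""
    host = urlsplit(url).hostname or ""
    return MULTI_HYPHEN.fullmatch(host) is not None


if __name__ == "__main__":
    for url in (
        "http://new-orleans-gay-men-bondage.xx.toplesss.info",
        "http://www.new-orleans.super-hotels.net/",
        "http://example.com/a-b-c-d",
    ):
        print(url, has_multi_hyphen_host(url), bool(URL_PATTERN.match(url)))
```

The single-pattern variant is probably the easier one to drop into a
regex-based URL filter, since it needs no URL parsing step; it should carry
over to other regex flavours, but it has only been sketched here in Python.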

On 9/6/05, Massimo Miccoli <mm...@iltrovatore.it> wrote:
> 
> Hi,
> 
> I need some help skipping host names like:
> http://new-orleans-gay-men-bondage.xx.toplesss.info
> 
> Or
> 
> http://www.new-orleans.super-hotels.net/
> 
> Can someone help me write a regex that removes hosts (not the URL path) with
> multiple hyphens?
> 
> Thanks,
> 
> Massimo