Posted to dev@nutch.apache.org by Fredrik Andersson <fi...@gmail.com> on 2005/09/06 20:47:06 UTC

Re: Help for regex

Hello Massimo.

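".*-.*-.*-.*" would match anything with three or more dashes in it, for
instance. A neater way would be to use something like
"[^-]*(-[^-]*){a,b}",
which, when matched against the whole host name, accepts anything with
between a and b dashes (inclusive). Note that with ".*" in place of "[^-]*"
the upper bound b would have no effect, since "." also matches a dash.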
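
As a rough sketch of the dash-counting idea above, here it is in Python
(chosen only for illustration). The helper name should_skip and the
min_hyphens threshold are hypothetical, not anything from Nutch. It pulls
the host out of the URL first, so hyphens in the path are ignored, and it
uses "[^-]*" instead of ".*" so that the repetition count really is the
number of dashes.

```python
import re
from urllib.parse import urlsplit


def should_skip(url, min_hyphens=3):
    """Return True if the host part of url has min_hyphens or more hyphens.

    Both the function name and the default threshold are illustrative
    assumptions; pick whatever threshold suits your crawl.
    """
    # Only the host is inspected, never the path or query string.
    host = urlsplit(url).hostname or ""
    # Runs of non-dash characters separated by at least min_hyphens dashes.
    pattern = r"[^-]*(-[^-]*){%d,}" % min_hyphens
    return re.fullmatch(pattern, host) is not None


if __name__ == "__main__":
    for u in (
        "http://new-orleans-gay-men-bondage.xx.toplesss.info",
        "http://www.new-orleans.super-hotels.net/",
    ):
        print(u, should_skip(u))
```

With the default threshold of three, only the first host is skipped; calling
should_skip(url, min_hyphens=2) would skip the second one as well.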

Fredrik

On 9/6/05, Massimo Miccoli <mm...@iltrovatore.it> wrote:
> 
> Hi,
> 
> I need some help skipping host names like:
> http://new-orleans-gay-men-bondage.xx.toplesss.info
> 
> Or
> 
> http://www.new-orleans.super-hotels.net/
> 
> Can you help me write a regex that removes hosts (not the rest of the URL)
> with multiple hyphens?
> 
> Thanks,
> 
> Massimo