Posted to dev@nutch.apache.org by Fredrik Andersson <fi...@gmail.com> on 2005/09/06 20:47:06 UTC
Re: Help for regex
Hello Massimo.
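".*-.*-.*-.*" would match anything with three or more dashes in it, for
instance (note the leading ".": a pattern cannot start with a bare "*").
A neater way would be to use something like
"[^-]*(-[^-]*){a,b}",
which, when matched against the whole string, accepts anything whose number
of dashes n satisfies a <= n <= b. With ".*" in place of "[^-]*" the upper
bound would have no effect, because ".*" can itself match dashes.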
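
Since the question is about the host rather than the rest of the URL, here is a
rough sketch in Java of one way to apply such a pattern to the host part only.
The class and method names are made up for illustration, and the threshold of
two hyphens is an assumption, chosen so that both hosts in the question are
caught; this is not taken from Nutch itself.

```java
import java.net.URI;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch.
class HyphenHostFilter {
    // Assumption: "multiple" means two or more hyphens in the host name.
    // [^-]* cannot consume a hyphen, so the group counts hyphens exactly.
    private static final Pattern MULTI_HYPHEN =
            Pattern.compile("[^-]*(-[^-]*){2,}");

    // Returns true if the host part of the URL has two or more hyphens.
    static boolean hasMultiHyphenHost(String url) {
        String host;
        try {
            host = URI.create(url).getHost();
        } catch (IllegalArgumentException e) {
            return false; // unparsable URL: left for other filters to handle
        }
        // matches() tests the whole host, not a substring of it.
        return host != null && MULTI_HYPHEN.matcher(host).matches();
    }

    public static void main(String[] args) {
        System.out.println(hasMultiHyphenHost(
                "http://new-orleans-gay-men-bondage.xx.toplesss.info"));
        System.out.println(hasMultiHyphenHost(
                "http://www.new-orleans.super-hotels.net/"));
        // Hyphens in the path only: the host itself has none.
        System.out.println(hasMultiHyphenHost("http://example.com/a-b-c-d"));
    }
}
```

Extracting the host first keeps hyphens in the path or query string from
triggering the filter, which a pattern run against the full URL would not
guarantee.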
Fredrik
On 9/6/05, Massimo Miccoli <mm...@iltrovatore.it> wrote:
>
> Hi,
>
> I need some help skipping host names like:
> http://new-orleans-gay-men-bondage.xx.toplesss.info
>
> Or
>
> http://www.new-orleans.super-hotels.net/
>
> Can you help me make a regex that removes hosts (not the rest of the
> URL) with multiple hyphens?
>
> Thanks,
>
> Massimo