Posted to dev@nutch.apache.org by Massimo Miccoli <mm...@iltrovatore.it> on 2005/09/06 16:31:58 UTC
Help for regex
Hi,
I need some help skipping host names like:
http://new-orleans-gay-men-bondage.xx.toplesss.info
Or
http://www.new-orleans.super-hotels.net/
Can anyone help me write a regex that removes hosts (matching on the host
name only, not the rest of the URL) with multiple hyphens?
Thanks,
Massimo
Re: Help for regex
Posted by Fredrik Andersson <fi...@gmail.com>.
Hello Massimo.
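".*-.*-.*-.*" would match anything with three or more dashes in it, for
instance. A tidier way would be to use something like
".*(-.*){a,b}",
which will match anything with at least a dashes. Note that b does not cap
the count here, because ".*" can itself match dashes. To match only strings
with between a and b dashes (inclusive), use "[^-]*(-[^-]*){a,b}" against
the whole host name.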
Fredrik
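
A minimal sketch of the idea above, applied to the host name only. It is written in Python for brevity, but the patterns use syntax that java.util.regex also accepts. The threshold MIN_HYPHENS and the helper has_hyphen_heavy_host are hypothetical names chosen for illustration, and the cutoff of two hyphens is an assumption based on the second example URL, not something stated in the thread.

```python
import re
from urllib.parse import urlsplit

# Assumed threshold: treat a host with this many hyphens or more as unwanted.
MIN_HYPHENS = 2

# [^-]* cannot absorb a hyphen, so each repetition consumes exactly one.
# Anchored at the start of the host via match(); no upper bound is needed.
MANY_HYPHENS = re.compile(r"(?:[^-]*-){%d,}" % MIN_HYPHENS)


def has_hyphen_heavy_host(url):
    """Return True if the host part of url has MIN_HYPHENS or more hyphens."""
    host = urlsplit(url).hostname or ""  # ignore path, query and fragment
    return MANY_HYPHENS.match(host) is not None


# Why the upper bound in ".*(-.*){a,b}" has no effect: ".*" may swallow
# extra dashes, so a string with four dashes still matches {1,2}.
loose = re.fullmatch(r".*(-.*){1,2}", "a-b-c-d-e") is not None
strict = re.fullmatch(r"[^-]*(-[^-]*){1,2}", "a-b-c-d-e") is not None
```

With these definitions, both hosts from the original question are flagged, while a URL that has hyphens only in its path is not. The loose pattern accepts the four-dash string and the strict one rejects it.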