Posted to user@nutch.apache.org by "Ralf R. Kotowski" <rr...@enlle.com> on 2013/11/02 18:08:58 UTC
RE: Language based outlink filtering
Has anyone done something like this and is willing to share some sample
code?
Thanks
-----Original Message-----
From: Julien Nioche [mailto:lists.digitalpebble@gmail.com]
Sent: Wednesday, October 02, 2013 1:00 PM
To: user@nutch.apache.org
Subject: Re: Language based outlink filtering
Hi,
You can do that by activating the language-identifier plugin and then writing
a custom ScoringFilter that removes the outlinks in its
distributeScoreToOutlinks() method.
HTH
Julien
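A minimal, stand-alone sketch of the approach described above, for readers
looking for sample code. It does not use the Nutch classes: the class
LanguageOutlinkFilter, the method filterOutlinks(), and the metadata key
passed in are all hypothetical names chosen for illustration. The assumption
is that the language-identifier plugin has stored the detected page language
in the parse metadata, and that the collection of outlinks handed to
distributeScoreToOutlinks() can be emptied in place.

```java
import java.util.Collection;
import java.util.Map;
import java.util.Map.Entry;

/**
 * Hypothetical sketch: drops all outlinks of a page whose detected
 * language is not the wanted one. Plain Java types stand in for the
 * Nutch types (URL text, crawl datum, parse metadata).
 */
class LanguageOutlinkFilter {

    private final String wantedLanguage;

    LanguageOutlinkFilter(String wantedLanguage) {
        this.wantedLanguage = wantedLanguage;
    }

    /** True if the detected language matches the wanted one (case-insensitive). */
    boolean accepts(String detectedLanguage) {
        return detectedLanguage != null
                && wantedLanguage.equalsIgnoreCase(detectedLanguage.trim());
    }

    /**
     * Plays the role that distributeScoreToOutlinks() would play in a
     * custom ScoringFilter: "targets" is the mutable outlink collection.
     * The metadata key is an assumption and is passed in by the caller.
     */
    void filterOutlinks(Map<String, String> parseMeta,
                        String languageKey,
                        Collection<Entry<String, Float>> targets) {
        // A missing language value is treated as "not the wanted language".
        if (!accepts(parseMeta.get(languageKey))) {
            targets.clear();
        }
    }
}
```

In a real plugin the same check would presumably sit at the top of
distributeScoreToOutlinks(), with the remaining ScoringFilter methods
left as pass-through implementations.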
On 30 September 2013 11:41, ilhami Kalkan <il...@agmlab.com> wrote:
> Hi,
> I want to extract outlinks from a webpage with a specific language. Any
> ideas about how I can do this?
> Thanks
>
>
--
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble