Posted to user@nutch.apache.org by BlackIce <bl...@gmail.com> on 2014/03/21 21:21:01 UTC

Correct syntax for language-identifier plugin?

Hi,

What is the correct syntax for the language-identifier plugin?

I have this in my nutch-site.xml:

<property>
<name>plugin.includes</name>
<value>protocol-http|urlfilter-regex|parse-(html|tika|text)|index-(basic|anchor|more)|query-(basic|site|url)|response-(json|xml)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)|language-identifier</value>
</property>

Do I need something else to get it to work?

Thanks

Re: Correct syntax for language-identifier plugin?

Posted by ilhami Kalkan <il...@agmlab.com>.
Hi BlackIce,

Yes, adding language-identifier to plugin.includes is enough to enable
the plugin. Also check the lang.extraction.policy and
lang.identification.only.certain properties in nutch-default.xml.
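
A minimal sketch of overriding those two properties in nutch-site.xml
follows. The values shown are assumed to match the defaults in
nutch-default.xml, so verify them against your own copy before relying
on them.

```xml
<!-- Fragment only: place these inside the <configuration> element of nutch-site.xml. -->

<property>
  <name>lang.extraction.policy</name>
  <!-- Assumed default. "detect" reads the language from HTTP headers and
       page metadata; "identify" falls back to statistical identification
       of the page content. The order of the two values sets the order tried. -->
  <value>detect,identify</value>
</property>

<property>
  <name>lang.identification.only.certain</name>
  <!-- Assumed default. When true, an identified language is assigned only
       if the identifier reports it as certain. -->
  <value>false</value>
</property>
```

Overrides belong in nutch-site.xml rather than nutch-default.xml, so the shipped defaults stay intact.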

-- 
İlhami KALKAN
Software Developer
(+90) 543 810 0885
ilhami.kalkan@agmlab.com

AGMLab Bilişim Teknolojileri