Posted to dev@nutch.apache.org by Ken Krugler <kk...@transpac.com> on 2005/08/12 23:47:48 UTC

Language detection

Given the recent discussion regarding charset/language detection on
this list, people might find this IBM research paper interesting:

<ftp://ftp.software.ibm.com/software/globalization/documents/linguini.pdf>

     Linguini: Language Identification for Multilingual Documents
     John M. Prager

-- Ken
-- 
Ken Krugler
TransPac Software, Inc.
<http://www.transpac.com>
+1 530-470-9200