Posted to announce@apache.org by William Colen <co...@apache.org> on 2017/11/02 10:55:48 UTC
[ANNOUNCE] Language Detector Model 1.8.3 Release for Apache OpenNLP
The Apache OpenNLP library is a machine learning based toolkit for the
processing of natural language text.
The Apache OpenNLP team is pleased to announce the release of Language
Detector Model 1.8.3 for Apache OpenNLP 1.8.3.
The Language Detector Model can detect 103 languages and outputs ISO 639-3
codes.
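Since the model outputs three-letter ISO 639-3 codes, downstream tools that expect two-letter ISO 639-1 codes may need a conversion step. The following is a minimal Python sketch of such a step, not part of OpenNLP. The table ISO_639_3_TO_1 and the helper to_iso_639_1 are hypothetical names. This announcement does not list the 103 languages, so the codes below were chosen only for illustration.

```python
from typing import Optional

# Partial, hand-written mapping from ISO 639-3 to ISO 639-1 codes.
# Assumption: the languages listed here are illustrative only; they are
# not taken from the model's documentation.
ISO_639_3_TO_1 = {
    "deu": "de",  # German
    "eng": "en",  # English
    "fra": "fr",  # French
    "ita": "it",  # Italian
    "nld": "nl",  # Dutch
    "por": "pt",  # Portuguese
    "rus": "ru",  # Russian
    "spa": "es",  # Spanish
}


def to_iso_639_1(code: str, default: Optional[str] = None) -> Optional[str]:
    """Map a three-letter ISO 639-3 code to its two-letter ISO 639-1 code.

    Returns `default` when the code is not in the (partial) table.
    """
    return ISO_639_3_TO_1.get(code.strip().lower(), default)


if __name__ == "__main__":
    for detected in ("por", "eng", "xyz"):
        print(detected, "->", to_iso_639_1(detected, default="(unmapped)"))
```

Many ISO 639-3 languages have no two-letter equivalent, so callers should handle the unmapped case explicitly rather than assume every code converts.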
The Apache OpenNLP model and reports are available for download from our
model download page:
http://opennlp.apache.org/models.html
This is the first release of the Language Detector Model. It is compatible
with Apache OpenNLP 1.8.3 or later.
Note that this model is trained for, and works well with, longer texts
that contain at least 2 sentences written in a single language.
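One way to act on this note is to check the input length before trusting a detection result. The following is a minimal Python sketch of such a gate, not part of OpenNLP. The names count_sentences and is_long_enough are hypothetical, and the regex-based sentence split is a naive assumption that will miscount abbreviations such as "e.g." and text without terminal punctuation.

```python
import re

# Naive sentence boundary: one or more of . ! ? followed by whitespace
# or the end of the text. Illustrative heuristic only; this is not how
# OpenNLP segments sentences.
_SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")

# Threshold taken from the announcement: at least 2 sentences.
MIN_SENTENCES = 2


def count_sentences(text: str) -> int:
    """Roughly count sentences by splitting on terminal punctuation."""
    parts = _SENTENCE_END.split(text.strip())
    return sum(1 for part in parts if part.strip())


def is_long_enough(text: str, min_sentences: int = MIN_SENTENCES) -> bool:
    """Return True if the text appears to hold at least `min_sentences`."""
    return count_sentences(text) >= min_sentences


if __name__ == "__main__":
    for sample in ("Bom dia", "Hello world. How are you today?"):
        print(repr(sample), "long enough:", is_long_enough(sample))
```

A caller could skip detection, or treat the result as low confidence, when is_long_enough returns False. The check says nothing about whether the text is in a single language.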
More information about this release can be found in the README.txt at:
https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/README.txt
Details about this model's effectiveness can be found in the following report:
https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/langdetect-183.bin.report.txt
--The Apache OpenNLP Team
Re: [ANNOUNCE] Language Detector Model 1.8.3 Release for Apache OpenNLP
Posted by Alex Ott <al...@gmail.com>.
Hi all
Maybe you have seen this already, but back in October I evaluated several language detection libraries (CLD, langid.py, fastText). The results are available at http://alexott.blogspot.de/2017/10/evaluating-fasttexts-models-for.html. I have recently updated the post with data for the OpenNLP-based language detector.
The test data I used was collected during my experiments with classifying web pages. There is a link to the data, which you can download (~18 MB).
William Colen at "Thu, 2 Nov 2017 08:55:48 -0200" wrote:
WC> The Apache OpenNLP library is a machine learning based toolkit for the
WC> processing of natural language text.
WC> The Apache OpenNLP team is pleased to announce the release of Language
WC> Detector Model 1.8.3 for Apache OpenNLP 1.8.3.
WC> The Language Detector Model can detect 103 languages and outputs ISO 639-3
WC> codes.
WC> Apache OpenNLP model and reports are available for download from our model
WC> download page:
WC> http://opennlp.apache.org/models.html
WC> This is the first release of the Language Detector Model. It is compatible
WC> with Apache OpenNLP 1.8.3 or better.
WC> It is important to note that this model is trained for and works well with
WC> longer texts that have at least 2 sentences (or more) from a single
WC> language.
WC> More information about this release can be found in the README.txt at:
WC> https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/README.txt
WC> Details about this model effectiveness can be found in the following report:
WC> https://www.apache.org/dist/opennlp/models/langdetect/1.8.3/langdetect-183.bin.report.txt
WC> --The Apache OpenNLP Team
--
With best wishes, Alex Ott
http://alexott.blogspot.com/ http://alexott.net/
http://alexott-ru.blogspot.com/
Skype: alex.ott