Posted to dev@tika.apache.org by Omid Pourhadi <om...@gmail.com> on 2014/06/15 07:31:46 UTC
Tika Language Detection
Hi,
I'm using Tika for language detection, but it cannot identify Persian text.
I'd like to add this language, and I noticed that Tika uses .ngp files. What is
this file, and how can I add a new one, for example fa.ngp?
Re: Tika Language Detection
Posted by Chris Mattmann <ma...@apache.org>.
Dear Omid,
Looks like you got it added correctly :)
Thanks for your question and for your Github pull request.
I've filed a JIRA issue for you:
https://issues.apache.org/jira/browse/TIKA-1337
I will get your patch into the sources and I sincerely appreciate it.
In the future, please feel free to sign up for an account on our JIRA
issue tracking system, https://issues.apache.org/jira/browse/TIKA, and
file a JIRA ticket there for any feature request. We use those tickets
to track what to work on in Tika.
I will of course properly credit your contribution and I thank you
very much, again!
Cheers,
Chris
P.S. Since Apache is a meritocracy, we encourage contributors
and give them credit for all the work they are doing. So thanks again!
-----Original Message-----
From: Omid Pourhadi <om...@gmail.com>
Date: Saturday, June 14, 2014 10:31 PM
To: <de...@tika.apache.org>
Subject: Tika Language Detection
>Hi,
>I'm using Tika for language detection, but it cannot identify Persian
>text. I'd like to add this language, and I noticed that Tika uses .ngp files.
>What is this file, and how can I add a new one, for example fa.ngp?
>