Posted to dev@tika.apache.org by Omid Pourhadi <om...@gmail.com> on 2014/06/15 07:31:46 UTC

Tika Language Detection

Hi,

I'm using Tika for language detection, but it cannot identify Persian text.
I'm willing to add this language, and I noticed that Tika uses .ngp files.
What is this file, and how can I add a new one, for example fa.ngp?

Re: Tika Language Detection

Posted by Chris Mattmann <ma...@apache.org>.
Dear Omid,

Looks like you got it added correctly :)

Thanks for your question and for your GitHub pull request.
I've filed a JIRA issue for you:

https://issues.apache.org/jira/browse/TIKA-1337

I will get your patch into the sources, and I sincerely appreciate it.
In the future, please feel free to sign up for an account on our JIRA
issue tracker, https://issues.apache.org/jira/browse/TIKA, and file a
ticket there for any feature request. We use those tickets to track
what to work on in Tika.

I will of course properly credit your contribution and I thank you
very much, again!

Cheers,
Chris

P.S. Since Apache is a meritocracy, we encourage contributors
and give them credit for all the work they are doing. So thanks again!
