Posted to dev@tika.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2021/04/03 08:55:00 UTC

[jira] [Commented] (TIKA-3329) RTG Translator with many-to-eng translation

    [ https://issues.apache.org/jira/browse/TIKA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17314220#comment-17314220 ] 

ASF GitHub Bot commented on TIKA-3329:
--------------------------------------

thammegowda commented on pull request #419:
URL: https://github.com/apache/tika/pull/419#issuecomment-812836701


   Wiki page created:  https://cwiki.apache.org/confluence/display/TIKA/NMT-RTG


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


> RTG Translator with many-to-eng translation
> -------------------------------------------
>
>                 Key: TIKA-3329
>                 URL: https://issues.apache.org/jira/browse/TIKA-3329
>             Project: Tika
>          Issue Type: Improvement
>          Components: translation
>            Reporter: Thamme Gowda
>            Assignee: Thamme Gowda
>            Priority: Major
>
> The existing translation services in tika-translate are either commercial/paid engines (e.g. Google, Microsoft) or not state of the art (such as Joshua and Moses).
> Reader Translator Generator (RTG) is a neural machine translation toolkit [https://isi-nlp.github.io/rtg/]
>  that implements the Transformer NMT model (the current state of the art).
> It also has a massively multilingual pretrained NMT model (many-to-English translation direction) [https://hub.docker.com/repository/docker/tgowda/rtg-model]
> in which about 500 source languages are represented, of which at least ~300 have good enough quality (for comparison, Google Translate has ~106 languages and Microsoft has about 80).
> This issue is for integrating RTG Translator into tika-translate.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)