Posted to issues@opennlp.apache.org by "Mark Giaconia (JIRA)" <ji...@apache.org> on 2013/12/11 14:44:07 UTC

[jira] [Created] (OPENNLP-626) GeoEntityLinker should use proper Lucene language Analyzers on NGA Geonames

Mark Giaconia created OPENNLP-626:
-------------------------------------

             Summary: GeoEntityLinker should use proper Lucene language Analyzers on NGA Geonames
                 Key: OPENNLP-626
                 URL: https://issues.apache.org/jira/browse/OPENNLP-626
             Project: OpenNLP
          Issue Type: Sub-task
          Components: Entity Linker
    Affects Versions: 1.6.0
            Reporter: Mark Giaconia
            Assignee: Mark Giaconia
            Priority: Minor


Currently, the standard Lucene analyzer is used to index all rows in the NGA Geonames dataset. The dataset contains language codes, and many names are written in native-language characters. Lucene provides many language-specific analyzers, so the code should apply them at both index and query time to better support non-English text.
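
As a rough illustration of the idea (not the actual GeoEntityLinker code), the sketch below maps a language code to the fully qualified name of a Lucene analyzer class and falls back to the standard analyzer for unknown codes. The AnalyzerSelector class, its method name, and the use of three-letter language codes are assumptions made for this example. The analyzer class names are those shipped in Lucene's analyzers-common module. The same lookup would have to be used when indexing a row and when building a query for it, so that both sides tokenize the same way.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/**
 * Hypothetical helper (not part of OpenNLP): chooses a Lucene analyzer
 * class name for a Geonames language code.
 */
final class AnalyzerSelector {

    /** Fallback used when a row has no language code or an unmapped one. */
    static final String DEFAULT_ANALYZER =
            "org.apache.lucene.analysis.standard.StandardAnalyzer";

    // Assumption: language codes are three-letter codes such as "fra".
    // Only a few languages are listed here for illustration.
    private static final Map<String, String> BY_LANGUAGE = new HashMap<>();
    static {
        BY_LANGUAGE.put("ara", "org.apache.lucene.analysis.ar.ArabicAnalyzer");
        BY_LANGUAGE.put("deu", "org.apache.lucene.analysis.de.GermanAnalyzer");
        BY_LANGUAGE.put("fra", "org.apache.lucene.analysis.fr.FrenchAnalyzer");
        BY_LANGUAGE.put("rus", "org.apache.lucene.analysis.ru.RussianAnalyzer");
        BY_LANGUAGE.put("zho", "org.apache.lucene.analysis.cjk.CJKAnalyzer");
    }

    private AnalyzerSelector() {
    }

    /**
     * Returns the analyzer class name for the given language code,
     * ignoring case and surrounding whitespace.
     */
    static String classNameFor(String languageCode) {
        if (languageCode == null) {
            return DEFAULT_ANALYZER;
        }
        String key = languageCode.trim().toLowerCase(Locale.ROOT);
        return BY_LANGUAGE.getOrDefault(key, DEFAULT_ANALYZER);
    }
}
```

With Lucene on the classpath, the returned name could be instantiated reflectively, or the map could hold Analyzer instances directly. Class names are used here only to keep the sketch free of external dependencies.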


