Posted to issues@opennlp.apache.org by "Mark Giaconia (JIRA)" <ji...@apache.org> on 2014/01/12 15:42:50 UTC

[jira] [Resolved] (OPENNLP-626) GeoEntityLinker should use proper Lucene language Analyzers on NGA Geonames

     [ https://issues.apache.org/jira/browse/OPENNLP-626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Giaconia resolved OPENNLP-626.
-----------------------------------

    Resolution: Implemented

Implemented a map of language analyzers; the Geonames language code is used to look up the correct analyzer for each language.
The analyzers are applied at index time only; they are not used at query time (yet).
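
For illustration, the lookup described above could be sketched roughly as follows. This is not the committed code: TextAnalyzer is a hypothetical stand-in for Lucene's Analyzer so the sketch compiles with the JDK alone, AnalyzerRegistry is an invented name, and the language codes and analyzer names are assumed examples rather than values taken from the Geonames data.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for org.apache.lucene.analysis.Analyzer, so that the
// sketch has no dependency beyond the JDK.
interface TextAnalyzer {
    String name();
}

// Hypothetical registry: maps a language code to the analyzer for that language.
class AnalyzerRegistry {
    // Fallback for rows whose language code is missing or has no mapping.
    private static final TextAnalyzer DEFAULT = () -> "standard";

    private final Map<String, TextAnalyzer> byLanguage = new HashMap<>();

    AnalyzerRegistry() {
        // Assumed example codes; the real set depends on the dataset and on
        // which language analyzers are available.
        register("fra", () -> "french");
        register("deu", () -> "german");
        register("ara", () -> "arabic");
    }

    void register(String languageCode, TextAnalyzer analyzer) {
        byLanguage.put(normalize(languageCode), analyzer);
    }

    // Returns the analyzer for the given code, or the default if none is mapped.
    TextAnalyzer forLanguage(String languageCode) {
        if (languageCode == null) {
            return DEFAULT;
        }
        return byLanguage.getOrDefault(normalize(languageCode), DEFAULT);
    }

    private static String normalize(String code) {
        return code.trim().toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        AnalyzerRegistry registry = new AnalyzerRegistry();
        System.out.println(registry.forLanguage("fra").name());
        System.out.println(registry.forLanguage("zzz").name());
    }
}
```

Falling back to a default analyzer keeps rows with an unmapped or missing language code indexable, which matches the previous behaviour of using the standard analyzer for everything.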

> GeoEntityLinker should use proper Lucene language Analyzers on NGA Geonames
> ---------------------------------------------------------------------------
>
>                 Key: OPENNLP-626
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-626
>             Project: OpenNLP
>          Issue Type: Sub-task
>          Components: Entity Linker
>    Affects Versions: 1.6.0
>            Reporter: Mark Giaconia
>            Assignee: Mark Giaconia
>            Priority: Minor
>             Fix For: 1.6.0
>
>
> Currently the standard Lucene analyzer is used to index all rows in the NGA Geonames dataset. The dataset contains language codes, and many names are written in native-language characters. Lucene provides many language-specific analyzers for this, so the code should use them at both index and query time to better support non-English text.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)