Posted to dev@tika.apache.org by "Madhav Sharan (JIRA)" <ji...@apache.org> on 2015/11/26 00:12:11 UTC
[jira] [Updated] (TIKA-1797) Pick best location by analysing context in GeoTopic parser
[ https://issues.apache.org/jira/browse/TIKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Madhav Sharan updated TIKA-1797:
--------------------------------
Issue Type: Sub-task (was: Improvement)
Parent: TIKA-1802
> Pick best location by analysing context in GeoTopic parser
> -----------------------------------------------------------
>
> Key: TIKA-1797
> URL: https://issues.apache.org/jira/browse/TIKA-1797
> Project: Tika
> Issue Type: Sub-task
> Components: parser
> Environment: ALL
> Reporter: Madhav Sharan
> Original Estimate: 336h
> Remaining Estimate: 336h
>
> lucene-geo-gazetteer has been enhanced to return multiple results for a string.
> The GeoTopic parser should start using the new country code and admin code fields to determine the best possible locations within the context of a document.
> Proposal:
> 1. Get the top 3 results for each location string
> 2. Group all locations by country code, admin code1, and admin code2
> 3. Give priority to the most frequently occurring country code, admin code1, and admin code2
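
The three steps of the proposal could be sketched roughly as below. This is only an illustration in Python, not the GeoTopic parser's or lucene-geo-gazetteer's actual code or API: the `Candidate` type, its field names, and `pick_best_locations` are hypothetical, and reading "give priority" as a lexicographic comparison of (country count, admin code1 count, admin code2 count) is an assumption, since the issue does not spell out the exact ranking rule.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Candidate(NamedTuple):
    """Hypothetical shape of one gazetteer result for a location string."""
    name: str
    country_code: str
    admin1_code: str
    admin2_code: str


def pick_best_locations(
    results: Dict[str, List[Candidate]], top_n: int = 3
) -> Dict[str, Candidate]:
    # Step 1: keep only the top N results per location string
    # (assumes each list is already ordered by gazetteer rank).
    top = {text: cands[:top_n] for text, cands in results.items() if cands}

    # Step 2: group candidates by country, then admin code1, then admin code2,
    # counting how often each group occurs across the whole document.
    country: Counter = Counter()
    admin1: Counter = Counter()
    admin2: Counter = Counter()
    for cands in top.values():
        for c in cands:
            country[c.country_code] += 1
            admin1[(c.country_code, c.admin1_code)] += 1
            admin2[(c.country_code, c.admin1_code, c.admin2_code)] += 1

    # Step 3: prefer the candidate in the most frequent country, breaking
    # ties by admin code1 frequency, then admin code2 frequency.
    def score(c: Candidate):
        return (
            country[c.country_code],
            admin1[(c.country_code, c.admin1_code)],
            admin2[(c.country_code, c.admin1_code, c.admin2_code)],
        )

    # max() returns the first maximal item, so the original gazetteer
    # order acts as the final tie-breaker.
    return {text: max(cands, key=score) for text, cands in top.items()}


# Made-up example data: "Paris" is ambiguous, but the rest of the
# document mentions only places in Texas.
results = {
    "Paris": [
        Candidate("Paris", "FR", "11", "75"),
        Candidate("Paris", "US", "TX", "277"),
    ],
    "Austin": [Candidate("Austin", "US", "TX", "453")],
    "Dallas": [Candidate("Dallas", "US", "TX", "113")],
}
best = pick_best_locations(results)
print(best["Paris"])
```

In this made-up document the US/TX group outnumbers the FR group, so the Texas candidate is chosen for "Paris" even though the French one is ranked first by the gazetteer. One design question the sketch leaves open is whether several candidates for the same string should each count toward the group totals, as they do here, or only once per string.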
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)