Posted to dev@tika.apache.org by "Madhav Sharan (JIRA)" <ji...@apache.org> on 2015/11/26 00:12:11 UTC

[jira] [Updated] (TIKA-1797) Pick best location by analysing context in GeoTopic parser

     [ https://issues.apache.org/jira/browse/TIKA-1797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Madhav Sharan updated TIKA-1797:
--------------------------------
    Issue Type: Sub-task  (was: Improvement)
        Parent: TIKA-1802

> Pick best location by analysing context in GeoTopic parser 
> -----------------------------------------------------------
>
>                 Key: TIKA-1797
>                 URL: https://issues.apache.org/jira/browse/TIKA-1797
>             Project: Tika
>          Issue Type: Sub-task
>          Components: parser
>         Environment: ALL
>            Reporter: Madhav Sharan
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> lucene-geo-gazetteer has been enhanced to return multiple results for a location string.
> The GeoTopic parser should start using the new country code and admin code fields to determine the best possible locations within the context of a document.
> Proposal:
> 1. Get the top 3 results for each location string.
> 2. Group all locations by country code, admin code 1, and admin code 2.
> 3. Give priority to the most frequently occurring country code, admin code 1, and admin code 2.
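
The three proposal steps can be sketched as follows. This is only an illustration of the idea, not the GeoTopic parser's code: the Candidate record, its field names, the pick_best_locations function, and the sample data are all hypothetical, and the scoring rule (country frequency first, then admin code 1, then admin code 2, with the gazetteer's own ranking as tie-breaker) is one possible reading of "give priority".

```python
from collections import Counter
from typing import NamedTuple


class Candidate(NamedTuple):
    # Hypothetical record; field names are illustrative, not the gazetteer's schema.
    name: str
    country_code: str
    admin1_code: str
    admin2_code: str


def pick_best_locations(candidates_by_string, top_n=3):
    """Pick one candidate per location string using document-wide context."""
    # Step 1: keep only the top-N gazetteer results for each location string.
    top = {s: list(c[:top_n]) for s, c in candidates_by_string.items()}

    # Step 2: count how often each country / admin1 / admin2 grouping occurs
    # across all retained candidates in the document.
    country, admin1, admin2 = Counter(), Counter(), Counter()
    for cands in top.values():
        for c in cands:
            country[c.country_code] += 1
            admin1[(c.country_code, c.admin1_code)] += 1
            admin2[(c.country_code, c.admin1_code, c.admin2_code)] += 1

    def score(item):
        rank, c = item
        # Higher counts win; a lower gazetteer rank breaks ties.
        return (
            country[c.country_code],
            admin1[(c.country_code, c.admin1_code)],
            admin2[(c.country_code, c.admin1_code, c.admin2_code)],
            -rank,
        )

    # Step 3: for each string, choose the candidate from the most frequent groups.
    return {s: max(enumerate(cands), key=score)[1] for s, cands in top.items() if cands}


# Illustrative sample data only; the codes are placeholders, not gazetteer output.
doc_candidates = {
    "Pasadena": [
        Candidate("Pasadena", "US", "TX", "201"),
        Candidate("Pasadena", "US", "CA", "037"),
        Candidate("Pasadena", "CA", "05", ""),
    ],
    "Los Angeles": [
        Candidate("Los Angeles", "US", "CA", "037"),
        Candidate("Los Angeles", "CL", "08", ""),
    ],
    "Glendale": [
        Candidate("Glendale", "US", "AZ", "013"),
        Candidate("Glendale", "US", "CA", "037"),
    ],
}

best = pick_best_locations(doc_candidates)
for name, chosen in best.items():
    print(name, chosen.country_code, chosen.admin1_code, chosen.admin2_code)
```

In this sample the gazetteer ranks the Texas "Pasadena" first, but the other location strings in the document pull the choice toward the California candidate. One caveat of this simple counting: a string whose top results all fall in the same country contributes several votes for that country, so a real implementation might prefer to count each location string at most once per group.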



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)