Posted to issues@opennlp.apache.org by "Mark Giaconia (JIRA)" <ji...@apache.org> on 2013/06/08 21:10:20 UTC

[jira] [Comment Edited] (OPENNLP-579) Framework to support Gazateer search in concert with NER for location entities.

    [ https://issues.apache.org/jira/browse/OPENNLP-579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13678818#comment-13678818 ] 

Mark Giaconia edited comment on OPENNLP-579 at 6/8/13 7:09 PM:
---------------------------------------------------------------

Please take a close look at the EntityLinker framework (attached entitylinker_8Jun2013 file); it needs scrutiny.
It consists of two packages and a properties file.
Drop the folder into the tools project and debug the Example class's main method, which has three example methods. The first example requires no dependencies, so you should be able to step through everything.

The other two examples require PostGIS and MySQL, with the USGS and Geonames gazetteers "installed" on each. The scripts to do that are in the entitylinker package, and you will need to put the correct password in the properties file.

Thoughts:
- The properties object should be passed all the way through to the implementing Linkable so that it can look up arbitrary properties (for DB connections, etc.). I think this would be helpful.
- I think it would benefit from some base classes that implement some of the basics.
- The factory should pool objects, because there is a lot of unnecessary instantiation at this point (or the way the factories are called needs to be managed better). This becomes difficult when Span arrays can have multiple types of spans.
- The Find method that uses the Document object is purely experimental, but let me know what you think.
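
To make the first and third points concrete, here is a minimal sketch of passing the properties object through to each linker and reusing instances instead of creating them on every call. It is not the attached code: Linker, EchoLinker, CachingLinkerFactory, and the "example.prefix" key are all hypothetical names chosen for illustration, and the factory shown caches one instance per entity type rather than maintaining a true object pool.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical stand-in for the framework's linker contract (not the attached code).
interface Linker {
    // Receives the full Properties object so an implementation can read
    // whatever keys it needs, for example database connection settings.
    void init(Properties props);

    String link(String entityText);
}

// Example implementation that reads a made-up property key.
class EchoLinker implements Linker {
    private String prefix = "";

    @Override
    public void init(Properties props) {
        prefix = props.getProperty("example.prefix", "");
    }

    @Override
    public String link(String entityText) {
        return prefix + entityText;
    }
}

// Keeps one initialized linker per entity type instead of building a new
// instance on every call. Not thread-safe; a real pool would need more care.
class CachingLinkerFactory {
    private final Properties props;
    private final Map<String, Linker> cache = new HashMap<>();

    CachingLinkerFactory(Properties props) {
        this.props = props;
    }

    Linker getLinker(String entityType) {
        return cache.computeIfAbsent(entityType, type -> {
            Linker linker = new EchoLinker(); // a real factory would choose by type
            linker.init(props);
            return linker;
        });
    }
}
```

Keying the cache by entity type is one possible way to cope with Span arrays that mix several span types: each span's type selects its linker, and repeated types reuse the same instance.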


Thanks!
MG
                
      was (Author: giaconia_mark):
    Please take a close look at the EntityLinker framework (attached entitylinker_8Jun2013 file); it needs scrutiny.
It consists of two packages and a properties file.
Drop the folder into the tools project and debug the Example class's main method, which has three example methods. The first example requires no dependencies, so you should be able to step through everything.

The other two examples require PostGIS and MySQL, with the USGS and Geonames gazetteers "installed" on each. The scripts to do that are in the entitylinker package, and you will need to put the correct password in the properties file.

Thoughts:
- The properties object should be passed all the way through to the implementing Linkable so that it can look up arbitrary properties (for DB connections, etc.). I think this would be helpful.
- I think it would benefit from some base classes that implement some of the basics.
- The factory should pool objects, because there is a lot of unnecessary instantiation at this point (or the way the factories are called needs to be managed better).
- The Find method that uses the Document object is purely experimental, but let me know what you think.


Thanks!
MG
                  
> Framework to support Gazateer search in concert with NER for location entities.
> -------------------------------------------------------------------------------
>
>                 Key: OPENNLP-579
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-579
>             Project: OpenNLP
>          Issue Type: Wish
>          Components: Name Finder
>    Affects Versions: 1.6.0
>         Environment: Any
>            Reporter: Mark Giaconia
>            Priority: Minor
>              Labels: features
>             Fix For: 1.6.0
>
>         Attachments: EntityLinker_30may2013.zip, entitylinker_8Jun2013.zip, entitylinkerFramework.zip, geonamefinder.properties, geonamefind.zip
>
>   Original Estimate: 672h
>  Remaining Estimate: 672h
>
> An interface for defining a gazetteer and the methods to search it, an extended Span object, and a name finder that encapsulates a TokenNameFinder for locations. Commercial applications that do this are extremely expensive, and there are many free gazetteers one could use to create a solution with OpenNLP. The capability should provide a simple default implementation using the most popular open source geospatial database, PostGIS.
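
As a rough illustration of the shape described above, the sketch below pairs a gazetteer search interface with a span that carries its gazetteer matches. Every type here (GazetteerEntry, Gazetteer, LocationSpan, InMemoryGazetteer) is a hypothetical stand-in invented for this example; none of them is an OpenNLP class or taken from the attachments, and an in-memory list takes the place of PostGIS.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical gazetteer record; the field names are assumptions.
final class GazetteerEntry {
    final String name;
    final double latitude;
    final double longitude;

    GazetteerEntry(String name, double latitude, double longitude) {
        this.name = name;
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

// Hypothetical search contract for a gazetteer.
interface Gazetteer {
    List<GazetteerEntry> find(String placeName);
}

// Stand-in for a name finder span (token offsets plus a type),
// extended with the gazetteer entries that matched its text.
final class LocationSpan {
    final int start;
    final int end;
    final String type;
    final List<GazetteerEntry> matches;

    LocationSpan(int start, int end, String type, List<GazetteerEntry> matches) {
        this.start = start;
        this.end = end;
        this.type = type;
        this.matches = matches;
    }
}

// In-memory implementation using an exact, case-insensitive name match.
final class InMemoryGazetteer implements Gazetteer {
    private final List<GazetteerEntry> entries;

    InMemoryGazetteer(List<GazetteerEntry> entries) {
        this.entries = new ArrayList<>(entries);
    }

    @Override
    public List<GazetteerEntry> find(String placeName) {
        List<GazetteerEntry> hits = new ArrayList<>();
        for (GazetteerEntry entry : entries) {
            if (entry.name.equalsIgnoreCase(placeName)) {
                hits.add(entry);
            }
        }
        return hits;
    }
}
```

A database-backed implementation would only need to satisfy the same find contract, so callers that build LocationSpan objects would not have to change.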

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira