Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2014/02/19 16:34:20 UTC
[jira] [Created] (STANBOL-1285) FST Linking Engine / Linkable Token Filter should consider Chunks
Rupert Westenthaler created STANBOL-1285:
--------------------------------------------
Summary: FST Linking Engine / Linkable Token Filter should consider Chunks
Key: STANBOL-1285
URL: https://issues.apache.org/jira/browse/STANBOL-1285
Project: Stanbol
Issue Type: Improvement
Components: Enhancement Engines
Affects Versions: 0.12.0
Reporter: Rupert Westenthaler
Assignee: Rupert Westenthaler
Priority: Minor
Fix For: 1.0.0, 0.12.1
The LinkableTokenFilter, a Solr TokenFilter, is used by the FST linking engine to add the TaggingAttribute (supported by the Solr Text Tagger library) to tokens that should be looked up in the FST (the vocabulary).
This implementation can be improved by taking into consideration chunks that
* represent named entities
* are processable (typically noun phrases, but not verb phrases ...) AND
* have a linkable token in the chunk OR
* have two or more matchable tokens in the chunk
All tokens in such chunks should be classified as taggable by setting the TaggingAttribute to true.
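
The rule above can be sketched as a small predicate. This is only an illustration and not the actual LinkableTokenFilter code: the Token and Chunk classes and the allTokensTaggable method are hypothetical stand-ins, and the grouping of the conditions is an assumed reading of the list, namely "named entity, or (processable and (at least one linkable token or at least two matchable tokens))".

```java
import java.util.Arrays;
import java.util.List;

public class ChunkTaggingSketch {

    /** Hypothetical token stand-in; not the Lucene or Stanbol token type. */
    public static final class Token {
        final String text;
        final boolean linkable;   // token would be looked up on its own
        final boolean matchable;  // token may take part in a multi-token match

        public Token(String text, boolean linkable, boolean matchable) {
            this.text = text;
            this.linkable = linkable;
            this.matchable = matchable;
        }
    }

    /** Hypothetical chunk stand-in carrying the flags named in the issue. */
    public static final class Chunk {
        final List<Token> tokens;
        final boolean namedEntity;
        final boolean processable; // e.g. a noun phrase, not a verb phrase

        public Chunk(List<Token> tokens, boolean namedEntity, boolean processable) {
            this.tokens = tokens;
            this.namedEntity = namedEntity;
            this.processable = processable;
        }
    }

    /**
     * Returns true if every token of the chunk should be marked as taggable.
     * Assumed grouping: namedEntity OR (processable AND (linkable OR >= 2 matchable)).
     */
    public static boolean allTokensTaggable(Chunk chunk) {
        if (chunk.namedEntity) {
            return true;
        }
        if (!chunk.processable) {
            return false;
        }
        int matchableCount = 0;
        for (Token token : chunk.tokens) {
            if (token.linkable) {
                return true; // one linkable token is enough
            }
            if (token.matchable) {
                matchableCount++;
            }
        }
        return matchableCount >= 2;
    }

    public static void main(String[] args) {
        // A processable chunk with two matchable tokens and no linkable token.
        Chunk nounPhrase = new Chunk(Arrays.asList(
                new Token("Technical", false, true),
                new Token("University", false, true)), false, true);
        System.out.println(allTokensTaggable(nounPhrase));
    }
}
```

In a real TokenFilter the result of such a check would decide whether the TaggingAttribute is set to true for each token of the chunk. How chunk and token flags are obtained is not covered by this sketch.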