Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2014/01/22 10:57:19 UTC
[jira] [Resolved] (STANBOL-1251) Pos tag based Phrase extraction Engine
[ https://issues.apache.org/jira/browse/STANBOL-1251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rupert Westenthaler resolved STANBOL-1251.
------------------------------------------
Resolution: Fixed
Fix Version/s: 0.12.0
A first working version of the Engine is available in trunk (1.0.0-SNAPSHOT) and the 0.12 branch. Further improvements (see TODO comments in the engine) should be done in their own issues.
> Pos tag based Phrase extraction Engine
> --------------------------------------
>
> Key: STANBOL-1251
> URL: https://issues.apache.org/jira/browse/STANBOL-1251
> Project: Stanbol
> Issue Type: New Feature
> Components: Enhancement Engines
> Reporter: Rupert Westenthaler
> Assignee: Rupert Westenthaler
> Fix For: 0.12.0
>
>
> Implement an Enhancement Engine that uses POS tags to extract Noun and Verb Phrases.
> In Stanbol, POS annotations can be aligned to concepts of the OLIA ontology (see documentation at [1] for detailed information). This alignment allows engines to determine the lexical categories of tokens in the text in a language-independent way.
> The Pos-Chunker Engine will use those lexical categories of tokens to extract Noun and Verb phrases by using the following rules:
> ### Noun Phrases
> * start: noun, pronoun, determiners, adjectives
> * continuation: nouns, adpositions, adjectives, punctuation
> * end: noun, pronoun, determiners, adjectives
> * required: noun
> ### Verb Phrases
> * start: verb, adverb
> * continuation: verb, adverb, punctuation
> * end: verb, adverb
> * required: verb
> This engine will allow the processed languages to be configured (e.g. to deactivate it for languages where other chunkers are available).
> The EnhancementEngine ordering will be ServiceProperties.ORDERING_NLP_CHUNK.
> The current plan is to make this engine also available in the 0.12 branch.
> [1] http://stanbol.staging.apache.org/docs/trunk/components/enhancer/nlp/nlpannotations
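
For illustration only, the start/continuation/end/required rules quoted above can be sketched as a small greedy chunker. This is not the engine's code and does not use any Stanbol API: it is written in Python for brevity, the category names (NOUN, PRON, DET, ADJ, ADP, ADV, VERB, PUNCT) are hypothetical stand-ins for the OLIA lexical categories, and the left-to-right greedy matching and the acceptance of single-token phrases are assumptions that the issue text does not specify.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence, Tuple


@dataclass(frozen=True)
class PhraseRule:
    """One rule set from the issue: which categories may start, continue
    and end a phrase, and which category must occur at least once."""
    start: FrozenSet[str]
    continuation: FrozenSet[str]
    end: FrozenSet[str]
    required: FrozenSet[str]


# Category names are illustrative placeholders, not OLIA identifiers.
NOUN_PHRASE = PhraseRule(
    start=frozenset({"NOUN", "PRON", "DET", "ADJ"}),
    continuation=frozenset({"NOUN", "ADP", "ADJ", "PUNCT"}),
    end=frozenset({"NOUN", "PRON", "DET", "ADJ"}),
    required=frozenset({"NOUN"}),
)
VERB_PHRASE = PhraseRule(
    start=frozenset({"VERB", "ADV"}),
    continuation=frozenset({"VERB", "ADV", "PUNCT"}),
    end=frozenset({"VERB", "ADV"}),
    required=frozenset({"VERB"}),
)


def chunk(categories: Sequence[str], rule: PhraseRule) -> List[Tuple[int, int]]:
    """Return (start, end) token spans, end exclusive, that satisfy `rule`.

    Assumption: greedy, non-overlapping, left-to-right matching.
    """
    spans: List[Tuple[int, int]] = []
    i, n = 0, len(categories)
    while i < n:
        if categories[i] not in rule.start:
            i += 1
            continue
        # Extend the candidate phrase over continuation tokens.
        j = i + 1
        while j < n and categories[j] in rule.continuation:
            j += 1
        # Trim trailing tokens that may not end a phrase (e.g. punctuation).
        while j > i + 1 and categories[j - 1] not in rule.end:
            j -= 1
        ends_ok = categories[j - 1] in rule.end
        has_required = any(c in rule.required for c in categories[i:j])
        if ends_ok and has_required:
            spans.append((i, j))
            i = j
        else:
            i += 1
    return spans


# "The quick fox quickly jumps over the lazy dog ."
example = ["DET", "ADJ", "NOUN", "ADV", "VERB", "ADP", "DET", "ADJ", "NOUN", "PUNCT"]
print(chunk(example, NOUN_PHRASE))  # [(0, 3), (6, 9)]
print(chunk(example, VERB_PHRASE))  # [(3, 5)]
```

In this sketch the trailing full stop is accepted as a continuation token but trimmed again because punctuation is not an allowed end category, and a lone determiner yields no phrase because the required noun is missing.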