Posted to user@mahout.apache.org by turnguard <ts...@yahoo.com> on 2009/09/20 16:21:00 UTC
newbie intro
hi!
could someone please shed some light on the following newbie questions?
i'd definitely like to use mahout (currently i'm using kea) for keyphrase
extraction based on a controlled vocabulary, so here's my situation:
1. i'm using a modified version of kea (http://www.nzdl.org/Kea/) that is
capable of getting its controlled vocabulary from any SAIL-RDF-Repository
(http://www.openrdf.org). kea has two modes to extract keyphrases from a
text document: a free one and a controlled one (which checks keyphrase
candidates against a skos thesaurus).
2. it's possible to train such an extraction model on the fly via a web
interface. (i should admit that i'm not convinced that "training" is the
correct term for giving an extraction model new input data - people would
assume it's getting better and better, but in most cases it's only getting
different.)
3. what i'd also like to achieve is this: if someone creates a new
skos:Concept with some skos:prefLabel, i'd like to suggest where to place
this new concept in the thesaurus (i.e. suggest what its skos:narrower or
skos:broader concepts could be). currently i'm doing this via the bridge of
indexed documents. example: a new skos:Concept gets the skos:prefLabel
"house"; i search for all documents containing "house", count the already
existing skos:Concepts these documents are tagged with, and print out the
list of concepts with the number of their occurrences.
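to make the approach in point 3 concrete, here is a minimal sketch in python.
it is not the actual kea/sesame code: the function name, the (text, concepts)
pairs and the sample data are all made up for illustration, and the plain
substring match stands in for a real index lookup.

```python
from collections import Counter


def suggest_related_concepts(label, documents, top_n=5):
    """Count the concepts tagged on documents whose text mentions `label`.

    `documents` is a hypothetical iterable of (text, concepts) pairs,
    standing in for a real document index plus its concept tags.
    """
    needle = label.lower()
    counts = Counter()
    for text, concepts in documents:
        # Naive substring match; a real index would tokenize and stem.
        if needle in text.lower():
            # Count each concept at most once per document.
            counts.update(set(concepts))
    return counts.most_common(top_n)


# Made-up sample data, for illustration only.
docs = [
    ("a house with a garden", ["building", "garden"]),
    ("house prices are rising", ["building", "economy"]),
    ("xml schema basics", ["xml"]),
]

suggestions = suggest_related_concepts("house", docs)
print(suggestions)
```

the concepts at the top of the list are the candidates i would offer as
possible broader/narrower neighbours of the new concept. note that the count
only says "often tagged together"; it says nothing about the direction of the
relation.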
- does anyone have experience with extracting keyphrases from a document
using mahout?
- does anyone have experience with extracting keyphrases from a document
based on a controlled vocabulary using mahout?
- does anyone have experience with making thesaurus suggestions? e.g. in my
thesaurus there's a concept with prefLabel "xml", and someone enters a new
concept "extensible markup language": how could i suggest not creating a
new concept, but using "extensible markup language" as an altLabel for "xml"
instead?
- could someone point me in the right direction for a basic intro to
keyphrase extraction using mahout?
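to illustrate the kind of altLabel suggestion meant in the "xml" question,
here is a naive sketch in python. it is only a heuristic i made up for this
example, not something mahout or kea provides: an existing prefLabel counts
as an abbreviation of the new label if it has one letter per word and each
letter occurs within the first two characters of the corresponding word (so
the "x" of "extensible" is found). both function names are hypothetical.

```python
def looks_like_abbreviation(short, phrase):
    """Heuristic: one letter of `short` per word of `phrase`, each letter
    found within the first two characters of its word."""
    short = short.lower()
    words = phrase.lower().split()
    if len(short) != len(words):
        return False
    return all(ch in word[:2] for ch, word in zip(short, words))


def suggest_alt_label_targets(new_label, existing_pref_labels):
    """Return existing prefLabels that the new label may just spell out."""
    return [
        pref for pref in existing_pref_labels
        if looks_like_abbreviation(pref, new_label)
    ]


targets = suggest_alt_label_targets(
    "extensible markup language", ["xml", "html", "rdf"]
)
print(targets)  # prints ['xml']
```

this obviously only covers abbreviations; synonyms that share no letters
would need some other similarity measure, which is the part i'm hoping
someone can give advice on.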
any help or comments are really appreciated
wkr www.turnguard.com