Posted to dev@stanbol.apache.org by Rajan Shah <ra...@gmail.com> on 2015/05/30 04:52:18 UTC
Stanbol NER
Hi,
NER does not seem to work, so I would appreciate help identifying
whether the cause is a
a. data issue
b. setup issue
c. missing step
With best regards,
Rajan
*1. Input triple set*
<http://dbpedia.org/resource/AAPL>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.omg.org/spec/FIGI/SecurityTypes/CommonStock> ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/EquityMarketSector> "Consumer and electronics" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/isConstituentOf> "DJIA" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/FinancialInstrumentName> "Equity" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker> "AAPL" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/ExchangeCode> "NASDAQ" ;
    <http://xmlns.com/foaf/corp#Company> "Apple Inc." .
<http://dbpedia.org/resource/MSFT>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.omg.org/spec/FIGI/SecurityTypes/CommonStock> ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/EquityMarketSector> "Information Technology" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/isConstituentOf> "NASDAQ" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/FinancialInstrumentName> "Equity" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker> "MSFT" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/ExchangeCode> "NASDAQ" ;
    <http://xmlns.com/foaf/corp#Company> "Microsoft Corp." .
*2. mappings.txt*
figigii:*
figigii:EquityMarketSector
figigii:isConstituentOf
figigii:FinancialInstrumentName
figigii:Ticker
figigii:ExchangeCode
foaf:*
*3. query failure*
After creating the referenced site, the following query does not return
any results.
curl -X POST -H "Content-Type:application/json" --data "@fieldQuery1.json" \
  http://localhost:9099/entityhub/site/securitymaster/query
fieldQuery1.json
{
  "selected": [
    "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
    "http:\/\/xmlns.com\/foaf\/corp#Company"
  ],
  "offset": "0",
  "limit": "3",
  "constraints": [{
    "type": "value",
    "value": "AAPL",
    "field": "http:\/\/www.omg.org\/spec\/FIGI\/GlobalInstrumentIdentifiers\/Ticker",
    "datatype": "xsd:string"
  }]
}
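Because the "field" IRI above is long enough to get wrapped when pasted, one way to rule out a malformed query file is to generate it rather than hand-edit it. The following is only a minimal sketch under that assumption: the helper names build_field_query and write_field_query are hypothetical, the IRIs and values are copied from the query above, and nothing here contacts the Entityhub.

```python
import json

# IRIs copied from the post; each is kept as a single unbroken string.
TICKER = "http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
COMPANY = "http://xmlns.com/foaf/corp#Company"


def build_field_query(ticker, limit=3, offset=0):
    """Return the field query from the post as a dict (hypothetical helper)."""
    return {
        "selected": [RDFS_LABEL, COMPANY],
        # offset and limit are sent as strings, mirroring the original file.
        "offset": str(offset),
        "limit": str(limit),
        "constraints": [{
            "type": "value",
            "value": ticker,
            "field": TICKER,
            "datatype": "xsd:string",
        }],
    }


def write_field_query(path, ticker):
    """Write the query to a file that curl can send with --data "@path"."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(build_field_query(ticker), fh, indent=2)
```

Calling write_field_query("fieldQuery1.json", "AAPL") would produce a file usable with the curl command above. The json module emits plain "/" rather than "\/"; both forms decode to the same string.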
*4. Entity Linking or Keyword Linking*
Keyword Linking setup
a. Keyword Tokenizer - checked
b. None of the following type mapping variants helps: leaving "Type
Mappings" empty (on the assumption that the ones from mappings.txt are in
effect), entering the exact same mappings in "Type Mappings", or mapping
all of the above to skos:Concept or even rdfs:label.
From the logs, it appears that the securitymaster Solr index core returns
0 hits.
Entity Linking setup
a. Left at default settings
b. Tried alternate type mappings
*5. Chain Execution*
The chain execution order is as follows:
Name: sample-chain
Engine: tika;optional, langdetect, opennlp-sentence, opennlp-token,
securitymaster-linking, securitymaster-keyword-linking, opennlp-ner
In the result, almost all entities were recognized by the creator
org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine.