Posted to dev@stanbol.apache.org by Rajan Shah <ra...@gmail.com> on 2015/05/30 04:52:18 UTC

Stanbol NER

Hi,

NER does not seem to work, so I would appreciate help identifying whether
this is a

a. data issue
b. setup issue
c. missing step

With best regards,
Rajan



*1. Input triple set*

<http://dbpedia.org/resource/AAPL>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.omg.org/spec/FIGI/SecurityTypes/CommonStock> ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/EquityMarketSector> "Consumer and electronics" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/isConstituentOf> "DJIA" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/FinancialInstrumentName> "Equity" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker> "AAPL" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/ExchangeCode> "NASDAQ" ;
    <http://xmlns.com/foaf/corp#Company> "Apple Inc." .

<http://dbpedia.org/resource/MSFT>
    <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.omg.org/spec/FIGI/SecurityTypes/CommonStock> ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/EquityMarketSector> "Information Technology" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/isConstituentOf> "NASDAQ" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/FinancialInstrumentName> "Equity" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker> "MSFT" ;
    <http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/ExchangeCode> "NASDAQ" ;
    <http://xmlns.com/foaf/corp#Company> "Microsoft Corp." .


*2. mappings.txt*

figigii:*
figigii:EquityMarketSector
figigii:isConstituentOf
figigii:FinancialInstrumentName
figigii:Ticker
figigii:ExchangeCode

foaf:*
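
One way to sanity-check the mappings against the data is to expand each
prefixed entry to a full IRI and see which predicates from the triples it
would match. The following is a minimal sketch, not the indexing tool's
actual matching logic. The "figigii" namespace is an assumption (the prefix
declaration is not shown here); "foaf" is taken to be the standard FOAF
namespace. The helper names expand, is_covered and uncovered are hypothetical.

```python
# Prefix table. "foaf" is the standard FOAF namespace; the "figigii"
# expansion is an ASSUMPTION, since its declaration is not shown in the post.
PREFIXES = {
    "figigii": "http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Entries from mappings.txt, as posted.
MAPPINGS = [
    "figigii:*",
    "figigii:EquityMarketSector",
    "figigii:isConstituentOf",
    "figigii:FinancialInstrumentName",
    "figigii:Ticker",
    "figigii:ExchangeCode",
    "foaf:*",
]

# Predicates used in the input triple set.
FIGI = "http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/"
PREDICATES = [
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
    FIGI + "EquityMarketSector",
    FIGI + "isConstituentOf",
    FIGI + "FinancialInstrumentName",
    FIGI + "Ticker",
    FIGI + "ExchangeCode",
    "http://xmlns.com/foaf/corp#Company",
]


def expand(mapping, prefixes):
    """Expand 'prefix:local' to a full IRI pattern; a trailing '*' is kept."""
    prefix, _, local = mapping.partition(":")
    return prefixes[prefix] + local


def is_covered(predicate, mappings, prefixes):
    """True if any mapping matches the predicate exactly or by wildcard."""
    for mapping in mappings:
        pattern = expand(mapping, prefixes)
        if pattern.endswith("*"):
            if predicate.startswith(pattern[:-1]):
                return True
        elif predicate == pattern:
            return True
    return False


def uncovered(predicates=PREDICATES, mappings=MAPPINGS, prefixes=PREFIXES):
    """Return the predicates that no mapping entry matches."""
    return [p for p in predicates if not is_covered(p, mappings, prefixes)]


if __name__ == "__main__":
    for p in uncovered():
        print("not covered by mappings.txt:", p)
```

Under these assumptions the check flags rdf:type and
http://xmlns.com/foaf/corp#Company, because foaf:* expands to the
http://xmlns.com/foaf/0.1/ namespace. Whether the indexing tool resolves the
prefixes the same way is not confirmed here.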

*3. query failure*

After creating the referenced site, the following query does not yield any
results.

 curl -X POST -H "Content-Type:application/json" --data "@fieldQuery1.json" \
   http://localhost:9099/entityhub/site/securitymaster/query

 fieldQuery1.json

 {
   "selected": [
       "http:\/\/www.w3.org\/2000\/01\/rdf-schema#label",
       "http:\/\/xmlns.com\/foaf\/corp#Company"
   ],
   "offset": "0",
   "limit": "3",
   "constraints": [{
        "type": "value",
        "value": "AAPL",
        "field": "http:\/\/www.omg.org\/spec\/FIGI\/GlobalInstrumentIdentifiers\/Ticker",
        "datatype": "xsd:string"
    }]
}
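
To rule out a malformed request body (for example a line break inside the
field IRI), the same query can be built programmatically. This is a minimal
sketch using only the Python standard library. The endpoint URL and the query
values are taken from the curl command and JSON above; the function names
build_field_query, build_request and run_query are hypothetical.

```python
import json
import urllib.request

# Endpoint and field IRI as used in the curl command and fieldQuery1.json.
QUERY_URL = "http://localhost:9099/entityhub/site/securitymaster/query"
TICKER_FIELD = "http://www.omg.org/spec/FIGI/GlobalInstrumentIdentifiers/Ticker"


def build_field_query(ticker):
    """Build the same field query as fieldQuery1.json for a given ticker."""
    return {
        "selected": [
            "http://www.w3.org/2000/01/rdf-schema#label",
            "http://xmlns.com/foaf/corp#Company",
        ],
        "offset": "0",
        "limit": "3",
        "constraints": [
            {
                "type": "value",
                "value": ticker,
                "field": TICKER_FIELD,
                "datatype": "xsd:string",
            }
        ],
    }


def build_request(query, url=QUERY_URL):
    """Wrap the query in a POST request with a JSON content type."""
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_query(query, url=QUERY_URL):
    """Send the query; requires the server from the post to be running."""
    with urllib.request.urlopen(build_request(query, url)) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Print the body that would be sent, without contacting the server.
    print(json.dumps(build_field_query("AAPL"), indent=2))
```

json.dumps writes the IRIs with plain slashes, which is equivalent JSON to
the escaped "\/" form, and it guarantees the field value is one unbroken
string.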


*4. Entity Linking or Keyword Linking*

Keyword Linking setup

a. Keyword Tokenizer - checked
b. No type mappings (as the ones from mappings.txt should be in effect).
   Neither putting the exact same mappings in "Type Mappings" nor mapping
   all of the above to skos:Concept or even rdfs:label helps.

From the logs, it appears that the securitymaster Solr index core returns 0 hits.


Entity Linking setup

a. Left at default settings
b. Tried alternate type mappings


*5. Chain Execution*

The chain execution order is as follows:

Name: sample-chain
Engine: tika;optional, langdetect, opennlp-sentence, opennlp-token,
securitymaster-linking, securitymaster-keyword-linking, opennlp-ner

In the result, almost all entities are recognized by the creator
org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine.
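
To see at a glance which engines contributed enhancements, the creator class
names can be tallied from the raw enhancer response. The following is a
minimal sketch: it scans the response text for engine class names of the form
shown above, so it does not depend on the serialization format. The chain URL
assumes the enhancer's chain endpoint on the same host and port as the
entityhub query, and the function names are hypothetical.

```python
import re
import urllib.request
from collections import Counter

# ASSUMPTION: chain endpoint on the same host and port as the entityhub query.
CHAIN_URL = "http://localhost:9099/enhancer/chain/sample-chain"

# Matches fully qualified engine class names such as
# org.apache.stanbol.enhancer.engines.opennlp.impl.NamedEntityExtractionEnhancementEngine
ENGINE_CLASS = re.compile(
    r"org\.apache\.stanbol\.enhancer\.engines(?:\.[A-Za-z_][A-Za-z0-9_]*)+"
)


def count_creators(response_text):
    """Count how often each engine class name occurs in the response text."""
    return Counter(ENGINE_CLASS.findall(response_text))


def enhance(text, url=CHAIN_URL):
    """Post plain text to the chain; requires the server to be running."""
    req = urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    # Offline demonstration on a hand-written fragment, not real output.
    sample = (
        '"org.apache.stanbol.enhancer.engines.opennlp.impl.'
        'NamedEntityExtractionEnhancementEngine"'
    )
    for name, n in count_creators(sample).items():
        print(n, name)
```

If the two securitymaster linking engines never appear in the tally, they
produced no enhancements for that input.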