Posted to dev@stanbol.apache.org by "Sethi, Keval Krishna" <ks...@innodata.com> on 2013/07/12 15:54:13 UTC

[URGENT] Working with custom vocabulary

Hi,

I am using stanbol to extract entities by plugging in a custom vocabulary as
per http://stanbol.apache.org/docs/trunk/customvocabulary.html

Following are the steps I followed:

 - Configured a Clerezza Yard.
 - Configured a Managed Yard site.
 - Updated the site by plugging in an ontology (containing the custom
   entities).
 - Configured an Entityhub Linking Engine (*customLinkingEngine*) with the
   managed site.
 - Configured a *customChain* which uses the following engines:

   -  *langdetect*
   - *opennlp-sentence*
   - *opennlp-token*
   - *opennlp-pos*
   - *opennlp-chunker*
   - *customLinkingEngine*

Now, I am able to extract entities like Adidas using *customChain*.

However, I am facing an issue extracting entities which have a space in
between, for example "Tommy Hilfiger".
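
For reference, this is how I am invoking the chain (a sketch; it assumes a
default local launcher on port 8080, and the chain name *customChain* is
from my configuration above):

```shell
# POST plain text to the custom enhancement chain; assumes a local
# Stanbol launcher listening on port 8080 with a chain registered
# under the name "customChain".
curl -X POST \
  -H "Content-Type: text/plain" \
  -H "Accept: application/rdf+xml" \
  --data "Adidas and Tommy Hilfiger are apparel brands." \
  "http://localhost:8080/enhancer/chain/customChain"
```

The response is the enhancement graph in RDF/XML.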

A chain like *dbpedia-disambiguation* (which comes bundled with the stanbol
instance) rightly extracts entities like "Tommy Hilfiger".

I had tried configuring *customLinkingEngine* the same as
*dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but it
didn't extract the above entity.

I have invested more than a week now and am running out of options.

I request you to please provide help in resolving this issue.

-- 
Regards,
Keval Sethi

-- 

"This e-mail and any attachments transmitted with it are for the sole use 
of the intended recipient(s) and may contain confidential , proprietary or 
privileged information. If you are not the intended recipient, please 
contact the sender by reply e-mail and destroy all copies of the original 
message. Any unauthorized review, use, disclosure, dissemination, 
forwarding, printing or copying of this e-mail or any action taken in 
reliance on this e-mail is strictly prohibited and may be unlawful."

Re: Working with custom vocabulary

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi all,

Sorry I was offline the whole last week, otherwise I would have
answered earlier.

As Rafa already pointed out, the issue was caused by the ClerezzaYard
not returning multi-word literals for single-word queries. IMO
this is not a bug, but rather an issue caused by SPARQL not supporting
proper full text search features.

EntityLinking works on the "word" level. Therefore, if a text mentions
"University of Salzburg" it will create a query such as "university"
OR "salzburg". This is translated to SPARQL as a UNION over
"rdfs:label" values. However, as you might know, a SPARQL endpoint will
not answer such a query with an Entity that has the rdfs:label
"University of Salzburg".
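
Sketched as SPARQL, the generated lookup looks roughly like this
(illustrative only, not the literal query the ClerezzaYard builds):

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
# Word-level lookup for the mention "University of Salzburg":
SELECT ?entity WHERE {
  { ?entity rdfs:label "university"@en }
  UNION
  { ?entity rdfs:label "salzburg"@en }
}
# Plain literal matching is exact, so an entity whose only label is
# "University of Salzburg"@en is returned by neither branch.
```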

For LARQ and Virtuoso Stanbol is able to use the specific Full text
extensions. In such cases queries like the above might provide
expected results, but for the ClerezzaYard this is not possible (might
change with the introduction of FastLane).

In any case: For users that plan to use a ManagedSite for
EntityLinking it is strongly suggested to use the SolrYard
implementation!

best
Rupert



On Tue, Jul 16, 2013 at 2:31 PM, Sawhney, Tarandeep Singh
<ts...@innodata.com> wrote:
> Sure Rafa. I will add the new issue in Jira with the details.
>
> best regards
> tarandeep
> On Jul 16, 2013 5:59 PM, "Rafa Haro" <rh...@zaizi.com> wrote:
>
>> Hi Tarandeep,
>>
>> Happy to hear you finally solved your problem. Could you please add a new
>> issue in the Stanbol Jira explaining the error with your ClerezzaYard site?
>>
>> Thanks
>>
>> On 16/07/13 13:38, Sawhney, Tarandeep Singh wrote:
>>
>>> Hi Rafa
>>>
>>> I tried using SolrYard and it worked :-) So there seems to be a defect in
>>> ClerezzaYard
>>>
>>> thanks so much for pointing that out
>>>
>>> Do you have any information on when a new version of stanbol is planned
>>> to be
>>> released and what will be covered in that release (feature/bug list etc.)?
>>>
>>> Also, can I get some information on the stanbol roadmap ahead?
>>>
>>> thanks again for your help
>>>
>>> best regards
>>> tarandeep
>>>
>>>
>>> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
>>> tsawhney@innodata.com> wrote:
>>>
>>>  Thanks Rafa for your response
>>>>
>>>> I will try resolving this issue based on pointers you have provided and
>>>> will post the update accordingly.
>>>>
>>>> Best regards
>>>> tarandeep
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>>>
>>>>  Hi Tarandeep,
>>>>>
>>>>> As Sergio already pointed out, you can check some different Entity Linking
>>>>> engine configurations at the IKS development server:
>>>>> http://dev.iks-project.eu:8081/enhancer/chain.
>>>>> You can try to use the same configuration as some of the chains
>>>>> registered
>>>>> in this Stanbol instance. For that, just go through the Felix Console (
>>>>> http://dev.iks-project.eu:8081/system/console/configMgr/
>>>>> ) and take a look at the different EntityHubLinkingEngine
>>>>> configurations. You can also try to use a Keyword Linking engine
>>>>> instead of
>>>>> an EntityHub Linking engine.
>>>>>
>>>>> Anyway, all the sites configured in this server are SolrYard based, so
>>>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>>>> multi-word entity labels. We might need debug log messages in
>>>>> order to find out the problem.
>>>>>
>>>>> Regards
>>>>>
>>>>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>>  Hi Sergio
>>>>>>
>>>>>> This is exactly what I did, and I mentioned it in my last email:
>>>>>>
>>>>>> *"What I understand is to enable option "Link ProperNouns only" in
>>>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>>>> chain"*
>>>>>>
>>>>>>
>>>>>> I have already checked this option in my own entity hub linking engine
>>>>>>
>>>>>> By the way, did you get a chance to look at the files I have shared in
>>>>>> the google
>>>>>> drive folder? Did you notice any problems there?
>>>>>>
>>>>>> I think using a custom ontology with stanbol should be a very common use
>>>>>> case,
>>>>>> and if there are issues getting it working, either I am doing something
>>>>>> terribly wrong or there are some other reasons which I don't know.
>>>>>>
>>>>>> But anyway, I am persisting in solving this issue, and any help on this
>>>>>> from
>>>>>> the dev community will be much appreciated.
>>>>>>
>>>>>> best regards
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>>>> sergio.fernandez@salzburgresearch.at>
>>>>>> wrote:
>>>>>>
>>>>>>   http://{stanbol}/system/console/configMgr sorry
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>>>
>>>>>>>   Have you checked the following?
>>>>>>>
>>>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>>>
>>>>>>>>
>>>>>>>> 2) find your EntityHub Linking engine
>>>>>>>>
>>>>>>>> 3) and then "Link ProperNouns only"
>>>>>>>>
>>>>>>>> The documentation in that configuration is quite useful I think:
>>>>>>>>
>>>>>>>> "If activated only ProperNouns will be matched against the
>>>>>>>> Vocabulary. If deactivated any Noun will be matched. NOTE that this
>>>>>>>> parameter requires a tag of the POS TagSet to be mapped against
>>>>>>>> 'olia:ProperNoun'. Otherwise mapping will not work as expected.
>>>>>>>> (enhancer.engines.linking.properNounsState)"
>>>>>>>>
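>>>>>>>>
>>>>>>>> As an illustration, in a Felix config export this maps to properties
>>>>>>>> like the following (a sketch; only the properNounsState key is taken
>>>>>>>> from the documentation above, the engine name is your own
>>>>>>>> customLinkingEngine, and the name property key is an assumption):
>>>>>>>>
>>>>>>>> ```properties
>>>>>>>> # Sketch of an Entityhub Linking engine configuration; the
>>>>>>>> # properNounsState key is from the Stanbol docs, the rest is
>>>>>>>> # illustrative.
>>>>>>>> stanbol.enhancer.engine.name=customLinkingEngine
>>>>>>>> enhancer.engines.linking.properNounsState=true
>>>>>>>> ```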
>>>>>>>>
>>>>>>>> Hope this helps. You have to take into account that such kinds of
>>>>>>>> issues are not easy to solve by email.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>>>
>>>>>>>>   Thanks Sergio for your response
>>>>>>>>
>>>>>>>>> What I understand is to enable the option *"Link ProperNouns only"* in
>>>>>>>>> entityhub linking and also to use the "opennlp-pos" engine in my
>>>>>>>>> weighted
>>>>>>>>> chain
>>>>>>>>>
>>>>>>>>> I made these changes but am still unable to extract "University of Salzburg"
>>>>>>>>>
>>>>>>>>> Please find below the output RDF/XML from enhancer
>>>>>>>>>
>>>>>>>>> Please let me know if I did not understand your
>>>>>>>>> inputs
>>>>>>>>> correctly
>>>>>>>>>
>>>>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>>>>> entities
>>>>>>>>> other than people, places and organisations, for example
>>>>>>>>> belts,
>>>>>>>>> bags etc.
>>>>>>>>>
>>>>>>>>> best regards
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>> <rdf:RDF
>>>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>>>>   <rdf:Description rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>>>>     <j.1:extracted-from rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>>>>     <j.1:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>>>>     <j.0:language>en</j.0:language>
>>>>>>>>>     <j.0:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>>>>     <j.0:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>>>>   </rdf:Description>
>>>>>>>>> </rdf:RDF>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>
>>>>>>>>>    As I said: have you checked the proper noun detection and POS
>>>>>>>>> tagging in your chain?
>>>>>>>>>>
>>>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>>>> Salzburg,
>>>>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>>>>
>>>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>>> Here are the details:
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>>>
>>>>>>>>>>    Just to add to my previous email
>>>>>>>>>>
>>>>>>>>>>  If I add another individual "MyUniversity" in my ontology under
>>>>>>>>>>> class
>>>>>>>>>>> University
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>         <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>>>>         <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>>>>             <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>>>             <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>>>>         </owl:NamedIndividual>
>>>>>>>>>>>
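>>>>>>>>>>>
>>>>>>>>>>> For comparison, the multi-word individual we are trying to match
>>>>>>>>>>> would look like this under the same namespace (a sketch, not the
>>>>>>>>>>> exact file from the shared folder):
>>>>>>>>>>>
>>>>>>>>>>> ```xml
>>>>>>>>>>> <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#UniversityOfSalzburg">
>>>>>>>>>>>     <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>>>     <!-- multi-word label: the case that fails against the ClerezzaYard -->
>>>>>>>>>>>     <rdfs:label>University of Salzburg</rdfs:label>
>>>>>>>>>>> </owl:NamedIndividual>
>>>>>>>>>>> ```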
>>>>>>>>>>>
>>>>>>>>>>> So with all the configurations I have mentioned in the word document
>>>>>>>>>>> (in the google
>>>>>>>>>>> drive folder), when I pass text with "MyUniversity" in it, my
>>>>>>>>>>> enhancement
>>>>>>>>>>> chain is able to extract "MyUniversity" and link it with the
>>>>>>>>>>> "University" type.
>>>>>>>>>>>
>>>>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>>>>> "University of Salzburg".
>>>>>>>>>>>
>>>>>>>>>>> If any of you could help us find out what we are missing to be able
>>>>>>>>>>> to extract custom entities which have a space in between, it would
>>>>>>>>>>> be a great help
>>>>>>>>>>> as we proceed further on our journey of using and contributing to
>>>>>>>>>>> stanbol.
>>>>>>>>>>> with best regards,
>>>>>>>>>>> tarandeep
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>     Thanks Sergio and Dileepa for your responses
>>>>>>>>>>>
>>>>>>>>>>>   We haven't been able to resolve the issue. We therefore decided
>>>>>>>>>>> to
>>>>>>>>>>>
>>>>>>>>>>>> keep
>>>>>>>>>>>> just one class and one instance value "University of Salzburg" in
>>>>>>>>>>>> our
>>>>>>>>>>>> custom ontology and try to extract this entity and also link it
>>>>>>>>>>>> but we
>>>>>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>>>>>> configurations.
>>>>>>>>>>>>
>>>>>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>>>>>
>>>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>>>
>>>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) A word document which shows Felix snapshots of all the
>>>>>>>>>>>> configurations we did while configuring the Yard, Yard site,
>>>>>>>>>>>> entity linking engine and
>>>>>>>>>>>> weighted
>>>>>>>>>>>> chain
>>>>>>>>>>>> 2) our custom ontology
>>>>>>>>>>>> 3) the result of a SPARQL query against our graph URI using the
>>>>>>>>>>>> SPARQL endpoint
>>>>>>>>>>>>
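>>>>>>>>>>>>
>>>>>>>>>>>> A quick way to verify the labels really are in the graph is a
>>>>>>>>>>>> query along these lines (a sketch, assuming the standard rdfs
>>>>>>>>>>>> namespace; not the exact query from the shared results file):
>>>>>>>>>>>>
>>>>>>>>>>>> ```sparql
>>>>>>>>>>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>>>>>>>>>>> # List every entity and label stored in the managed site's graph
>>>>>>>>>>>> SELECT ?entity ?label WHERE {
>>>>>>>>>>>>   ?entity rdfs:label ?label .
>>>>>>>>>>>> }
>>>>>>>>>>>> ```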
>>>>>>>>>>>> May I request you all to please look at these files and let us
>>>>>>>>>>>> know
>>>>>>>>>>>> if we
>>>>>>>>>>>> are missing something in the configurations.
>>>>>>>>>>>>
>>>>>>>>>>>> We have referred to the web links below in order to configure
>>>>>>>>>>>> stanbol
>>>>>>>>>>>> to use our custom ontology for entity extraction and linking:
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards
>>>>>>>>>>>> tarandeep
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>>>> sergio.fernandez@salzburgresearch.at>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>     Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>   I'm not an expert on entity linking, but from my experience
>>>>>>>>>>>> such
>>>>>>>>>>>>
>>>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>>>> details
>>>>>>>>>>>>> at:
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>>>> In addition, I'd like to suggest you take a look at the
>>>>>>>>>>>>> netiquette of mailing lists. This is an open source community;
>>>>>>>>>>>>> messages starting with "URGENT" are not very polite, especially
>>>>>>>>>>>>> when sent on a Friday afternoon, when people may already be out
>>>>>>>>>>>>> for the weekend, or even on vacation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Sergio
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>     Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>>>>> vocabulary, as per
>>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>       Configured Clerezza Yard.
>>>>>>>>>>>>>>       Configured Managed Yard site.
>>>>>>>>>>>>>>       Updated the site by plugging ontology(containing custom
>>>>>>>>>>>>>> entities) .
>>>>>>>>>>>>>>       Configured Entity hub linking
>>>>>>>>>>>>>> Engine(*customLinkingEngine*)
>>>>>>>>>>>>>> with
>>>>>>>>>>>>>> managed
>>>>>>>>>>>>>> site.
>>>>>>>>>>>>>>       Configured a customChain which uses following engine
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         -  *langdetect*
>>>>>>>>>>>>>>         - *opennlp-sentence*
>>>>>>>>>>>>>>         - *opennlp-token*
>>>>>>>>>>>>>>         - *opennlp-pos*
>>>>>>>>>>>>>>         - *opennlp-chunker*
>>>>>>>>>>>>>>         - *customLinkingEngine*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now, I am able to extract entities like Adidas using
>>>>>>>>>>>>>> *customChain*.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, I am facing an issue extracting entities which have a
>>>>>>>>>>>>>> space in them, for example "Tommy Hilfiger".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>>>>> the Stanbol instance) correctly extracts entities like "Tommy
>>>>>>>>>>>>>> Hilfiger".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had tried configuring *customLinkingEngine* the same as
>>>>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>>>>> but it didn't extract the above entity.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have invested more than a week and am running out of options
>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I request you to please provide help in resolving this issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>     --
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>   Sergio Fernández
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Salzburg Research
>>>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>  --
>>>>>
>>>>> ------------------------------
>>>>> This message should be regarded as confidential. If you have received
>>>>> this email in error please notify the sender and destroy it immediately.
>>>>> Statements of intent shall only become binding when confirmed in hard
>>>>> copy
>>>>> by an authorised signatory.
>>>>>
>>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>>> number
>>>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>>>> London W6 7AN.
>>>>>
>>>>
>>>>
>>>>
>>
>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Sure Rafa, I will add the new issue in Jira with the details.

best regards
tarandeep
On Jul 16, 2013 5:59 PM, "Rafa Haro" <rh...@zaizi.com> wrote:

> Hi Tarandeep,
>
> Happy to hear you finally solved your problem. Could you please add a new
> issue in the Stanbol Jira explaining the error with your ClerezzaYard site?
>
> Thanks
>
> El 16/07/13 13:38, Sawhney, Tarandeep Singh escribió:
>
>> Hi Rafa
>>
>> I tried using SolrYard and it worked :-) So there seems to be a defect in
>> ClerezzaYard.
>>
>> Thanks so much for pointing that out.
>>
>> Do you have any information on when the next version of Stanbol is planned
>> to be released and what will be covered in that release (feature/bug list,
>> etc.)?
>>
>> Also, can I get some information on the Stanbol roadmap ahead?
>>
>> Thanks again for your help.
>>
>> best regards
>> tarandeep
>>
>>
>> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
>> tsawhney@innodata.com> wrote:
>>
>>> Thanks Rafa for your response.
>>>
>>> I will try resolving this issue based on the pointers you have provided
>>> and will post an update accordingly.
>>>
>>> Best regards
>>> tarandeep
>>>
>>>
>>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>>
>>>  Hi Tarandeep,
>>>>
>>>> As Sergio already pointed out, you can check several different Entity
>>>> Linking engine configurations at the IKS development server:
>>>> http://dev.iks-project.eu:8081/enhancer/chain
>>>> You can try to use the same configuration as some of the chains
>>>> registered in this Stanbol instance. For that, just go through the Felix
>>>> Console (http://dev.iks-project.eu:8081/system/console/configMgr/)
>>>> and take a look at the different EntityHubLinkingEngine configurations.
>>>> You can also try to use a Keyword Linking engine instead of an EntityHub
>>>> Linking engine.
>>>>
>>>> Anyway, all the sites configured in this server are SolrYard based, so
>>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>>> multi-word entity labels. We would need debug log messages in order to
>>>> find out the problem.
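[Editor's note: one way to narrow down whether the problem is in the ClerezzaYard search rather than in the chain configuration is to query the site's find endpoint directly, bypassing the enhancement chain entirely. This is only a sketch, and assumes a running Stanbol instance on localhost:8080 with a managed site named "customsite" (a placeholder); adjust host, port, and site name to your setup.]

```shell
# Query the managed site directly for the multi-word label;
# "customsite" is a placeholder for your managed site's name.
curl -X POST -d "name=Tommy Hilfiger" -d "limit=10" \
  "http://localhost:8080/entityhub/site/customsite/find"
```

If this returns no results while a single-word label such as "Adidas" does, the problem is in the yard's label search rather than in the engine or chain configuration.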
>>>>
>>>> Regards
>>>>
>>>> El 15/07/13 18:28, Sawhney, Tarandeep Singh escribió:
>>>>
>>>>  Hi Sergio
>>>>>
>>>>> This is exactly what I did, and I mentioned it in my last email:
>>>>>
>>>>> *"What I understand is to enable the option "Link ProperNouns only" in
>>>>> entityhub linking and also to use the "opennlp-pos" engine in my
>>>>> weighted chain"*
>>>>>
>>>>> I have already checked this option in my own entity hub linking engine.
>>>>>
>>>>> By the way, did you get a chance to look at the files I shared in the
>>>>> Google Drive folder? Did you notice any problems there?
>>>>>
>>>>> I think using a custom ontology with Stanbol should be a very common use
>>>>> case, and if there are issues getting it working, either I am doing
>>>>> something terribly wrong or there are other reasons which I don't know.
>>>>>
>>>>> But anyway, I am persisting with solving this issue, and any help on
>>>>> this from the dev community will be much appreciated.
>>>>>
>>>>> best regards
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>>> se...@salzburgresearch.at> wrote:
>>>>>
>>>>>> http://{stanbol}/system/console/configMgr sorry
>>>>>
>>>>>>
>>>>>>
>>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>>
>>>>>>> Have you checked the following?
>>>>>>>
>>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>>
>>>>>>> 2) find your EntityHub Linking engine
>>>>>>>
>>>>>>> 3) and then "Link ProperNouns only"
>>>>>>>
>>>>>>> The documentation for that configuration is quite useful, I think:
>>>>>>>
>>>>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>>>>>> Otherwise mapping will not work as expected.
>>>>>>> (enhancer.engines.linking.properNounsState)"
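[Editor's note: as a sketch, the boolean property named in the documentation quoted above would appear along these lines in a Felix `.config` file (e.g. one deployed via FileInstall). The `B"..."` syntax is the Felix config-file convention for typed boolean values; only the property actually named in the documentation is shown, other keys are omitted.]

```
enhancer.engines.linking.properNounsState=B"true"
```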
>>>>>>>
>>>>>>>
>>>>>>> Hope this helps. You have to take into account that such kinds of
>>>>>>> issues are not easy to solve by email.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>>
>>>>>>>> Thanks Sergio for your response.
>>>>>>>>
>>>>>>>> What I understand is to enable the option *"Link ProperNouns only"* in
>>>>>>>> entityhub linking and also to use the "opennlp-pos" engine in my
>>>>>>>> weighted chain.
>>>>>>>>
>>>>>>>> I did these changes but am unable to extract "University of Salzburg".
>>>>>>>>
>>>>>>>> Please find below the output RDF/XML from the enhancer.
>>>>>>>>
>>>>>>>> Request you to please let me know if I did not understand your inputs
>>>>>>>> correctly.
>>>>>>>>
>>>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>>>> entities other than people, places and organisations; for example,
>>>>>>>> belts, bags, etc.
>>>>>>>>
>>>>>>>> best regards
>>>>>>>> tarandeep
>>>>>>>>
>>>>>>>> <rdf:RDF
>>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>>>   <rdf:Description rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>>>     <j.1:extracted-from rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>>>     <j.1:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>>>     <j.0:language>en</j.0:language>
>>>>>>>>     <j.0:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>>>     <j.0:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>>>   </rdf:Description>
>>>>>>>> </rdf:RDF>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>>> se...@salzburgresearch.at> wrote:
>>>>>>>>
>>>>>>>>> As I said: have you checked the proper noun detection and POS
>>>>>>>>> tagging in your chain?
>>>>>>>>>
>>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>>> Salzburg, which is based in Austria" works at the demo server:
>>>>>>>>>
>>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>>
>>>>>>>>> Here are the details:
>>>>>>>>>
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>>
>>>>>>>>>> Just to add to my previous email:
>>>>>>>>>>
>>>>>>>>>> If I add another individual, "MyUniversity", under the class
>>>>>>>>>> University in my ontology:
>>>>>>>>>>
>>>>>>>>>>     <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>>>     <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>>>         <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>>         <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>>>     </owl:NamedIndividual>
>>>>>>>>>>
>>>>>>>>>> So, with all the configurations I have mentioned in the Word
>>>>>>>>>> document (in the Google Drive folder), when I pass text with
>>>>>>>>>> "MyUniversity" in it, my enhancement chain is able to extract
>>>>>>>>>> "MyUniversity" and link it with the "University" type.
>>>>>>>>>>
>>>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>>>> "University of Salzburg".
>>>>>>>>>>
>>>>>>>>>> If any of you could provide help on what we are missing to be able
>>>>>>>>>> to extract custom entities which have spaces in them, it would be a
>>>>>>>>>> great help for us to proceed further on our journey with using and
>>>>>>>>>> contributing to Stanbol.
>>>>>>>>>>
>>>>>>>>>> with best regards,
>>>>>>>>>> tarandeep
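[Editor's note: the symptom described above, single-word entities link while multi-word ones do not, matches a lookup that only considers one token at a time. The sketch below is an illustration of that failure mode in miniature, not Stanbol's actual implementation; all names in it are invented for the example.]

```python
# Illustration only: why token-at-a-time lookup misses multi-word labels.

vocabulary = {"Adidas", "Tommy Hilfiger"}

def link_single_token(tokens, vocab):
    """Match each token in isolation: finds 'Adidas', misses 'Tommy Hilfiger'."""
    return [t for t in tokens if t in vocab]

def link_multi_token(tokens, vocab, max_span=3):
    """Also try spans of up to max_span consecutive tokens."""
    hits = []
    for i in range(len(tokens)):
        for j in range(i + 1, min(i + max_span, len(tokens)) + 1):
            candidate = " ".join(tokens[i:j])
            if candidate in vocab:
                hits.append(candidate)
    return hits

tokens = "He wore Adidas shoes and a Tommy Hilfiger shirt".split()
print(link_single_token(tokens, vocabulary))  # ['Adidas']
print(link_multi_token(tokens, vocabulary))   # ['Adidas', 'Tommy Hilfiger']
```

A correctly linking engine therefore has to look up multi-token label candidates (or index labels in a tokenized form), which is what the SolrYard-backed lookup does and what appears to fail in the ClerezzaYard path discussed in this thread.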
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks Sergio and Dileepa for your responses.
>>>>>>>>>>>
>>>>>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>>>> keep just one class and one instance value, "University of
>>>>>>>>>>> Salzburg", in our custom ontology and try to extract and link this
>>>>>>>>>>> entity, but we could not get it working. I am sure we are missing
>>>>>>>>>>> some configuration.
>>>>>>>>>>> I am sharing a Google Drive folder at the link below:
>>>>>>>>>>>
>>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>>
>>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>>
>>>>>>>>>>> 1) A Word document with Felix snapshots of all the configurations
>>>>>>>>>>> we did while configuring the Yard, the Yard site, the entity
>>>>>>>>>>> linking engine and the weighted chain
>>>>>>>>>>> 2) our custom ontology
>>>>>>>>>>> 3) the result of a SPARQL query against our graph URI using the
>>>>>>>>>>> SPARQL endpoint
>>>>>>>>>>>
>>>>>>>>>>> May I request you all to please look at these files and let us
>>>>>>>>>>> know if we are missing something in the configurations.
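[Editor's note: as a sanity check on a SPARQL result like the one in file 3, a query along the following lines can confirm whether the multi-word label actually reached the store. This is a sketch: the `rdfs:label` property comes from the ontology shown earlier in the thread, while the label string is a placeholder for your own value, and SPARQL 1.0 `regex` is used for portability with older endpoints.]

```sparql
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?entity ?label
WHERE {
  ?entity rdfs:label ?label .
  FILTER(regex(str(?label), "University of Salzburg"))
}
```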
>>>>>>>>>>>
>>>>>>>>>>> We have referred to the web links below in order to configure
>>>>>>>>>>> Stanbol to use our custom ontology for entity extraction and
>>>>>>>>>>> linking:
>>>>>>>>>>>
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>> tarandeep
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>>> se...@salzburgresearch.at> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>>> details at:
>>>>>>>>>>>>
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>>>
>>>>>>>>>>>> In addition, I'd like to suggest you take a look at the netiquette
>>>>>>>>>>>> in mailing lists. This is an open source community; therefore
>>>>>>>>>>>> messages starting with "URGENT" are not very polite, especially
>>>>>>>>>>>> sending one on a Friday afternoon, when people could already be out
>>>>>>>>>>>> for the weekend, or even on vacation.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Sergio
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>     Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>>>> vocabulary as per
>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Following are the steps followed:
>>>>>>>>>>>>>
>>>>>>>>>>>>>       Configured a Clerezza Yard.
>>>>>>>>>>>>>       Configured a Managed Yard site.
>>>>>>>>>>>>>       Updated the site by plugging in an ontology (containing
>>>>>>>>>>>>> custom entities).
>>>>>>>>>>>>>       Configured an Entityhub Linking Engine (*customLinkingEngine*)
>>>>>>>>>>>>> with the managed site.
>>>>>>>>>>>>>       Configured a customChain which uses the following engines:
>>>>>>>>>>>>>
>>>>>>>>>>>>>         - *langdetect*
>>>>>>>>>>>>>         - *opennlp-sentence*
>>>>>>>>>>>>>         - *opennlp-token*
>>>>>>>>>>>>>         - *opennlp-pos*
>>>>>>>>>>>>>         - *opennlp-chunker*
>>>>>>>>>>>>>         - *customLinkingEngine*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now I am able to extract entities like Adidas using *customChain*.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, I am facing an issue extracting entities which have a
>>>>>>>>>>>>> space in between, for example "Tommy Hilfiger".
>>>>>>>>>>>>>
>>>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>>>> the Stanbol instance) rightly extracts entities like "Tommy
>>>>>>>>>>>>> Hilfiger".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had tried configuring *customLinkingEngine* the same as
>>>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>>>> but it didn't extract the above entity.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have invested more than a week and am running out of options.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I request you to please provide help in resolving this issue.
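For context on what the linking engine has to do here: matching "Tommy Hilfiger" means comparing a sequence of tokens against a multi-word label, not a single token. The following toy sketch of such a lookup is purely illustrative (the function name and matching strategy are made up, not Stanbol's actual implementation):

```python
def find_label_mentions(tokens, labels):
    """Greedy longest-match lookup of (possibly multi-word) labels in a token list."""
    label_tokens = [label.split() for label in labels]
    hits = []
    i = 0
    while i < len(tokens):
        best = None
        for lt in label_tokens:
            # A multi-word label only matches if ALL of its tokens appear consecutively.
            if tokens[i:i + len(lt)] == lt and (best is None or len(lt) > len(best)):
                best = lt
        if best:
            hits.append(" ".join(best))
            i += len(best)
        else:
            i += 1
    return hits

tokens = "I bought a Tommy Hilfiger belt and Adidas shoes".split()
print(find_label_mentions(tokens, ["Adidas", "Tommy Hilfiger"]))
# ['Tommy Hilfiger', 'Adidas']
```

A lookup restricted to single tokens would find "Adidas" but never "Tommy Hilfiger", which is exactly the failure pattern described above.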
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>     --
>>>>>>>>>>>>>
>>>>>>>>>>>>>   Sergio Fernández
>>>>>>>>>>>>>
>>>>>>>>>>>> Salzburg Research
>>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>  --
>>>>
>>>> ------------------------------
>>>> This message should be regarded as confidential. If you have received
>>>> this email in error please notify the sender and destroy it immediately.
>>>> Statements of intent shall only become binding when confirmed in hard
>>>> copy
>>>> by an authorised signatory.
>>>>
>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>> number
>>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>>> London W6 7AN.
>>>>
>>>
>>>
>>>
>


Re: Working with custom vocabulary

Posted by Rafa Haro <rh...@zaizi.com>.
Hi Tarandeep,

Happy to hear you finally solved your problem. Could you please add a new
issue in the Stanbol Jira explaining the error with your ClerezzaYard site?

Thanks

On 16/07/13 13:38, Sawhney, Tarandeep Singh wrote:
> Hi Rafa
>
> I tried using SolrYard and it worked :-) So there seems to be a defect in
> ClerezzaYard.
>
> Thanks so much for pointing that out.
>
> Do you have any information on when the new version of Stanbol is planned
> to be released and what will be covered in that release (feature/bug list
> etc.)?
>
> Also, can I get some information on the Stanbol roadmap ahead?
>
> Thanks again for your help
>
> best regards
> tarandeep
>
>
> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
> tsawhney@innodata.com> wrote:
>
>> Thanks Rafa for your response
>>
>> I will try resolving this issue based on pointers you have provided and
>> will post the update accordingly.
>>
>> Best regards
>> tarandeep
>>
>>
>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>
>>> Hi Tarandeep,
>>>
>>> As Sergio already pointed out, you can check some different Entity Linking
>>> engine configurations at the IKS development server:
>>> http://dev.iks-project.eu:8081/enhancer/chain
>>> You can try to use the same configuration as some of the chains registered
>>> in this Stanbol instance. For that, just go through the Felix Console
>>> (http://dev.iks-project.eu:8081/system/console/configMgr/) and take a look
>>> at the different EntityhubLinkingEngine configurations. You can also try
>>> to use a Keyword Linking engine instead of an EntityHub Linking engine.
>>>
>>> Anyway, all the sites configured on this server are SolrYard based, so
>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>> multi-word entity labels. We would need debug log messages in order to
>>> find out the problem.
>>>
>>> Regards
>>>
>>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Hi Sergio
>>>>
>>>> This is exactly what I did, as I mentioned in my last email:
>>>>
>>>> *"What i understand is to enable option "Link ProperNouns only" in
>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>> chain"*
>>>>
>>>> I have already checked this option in my own entityhub linking engine.
>>>>
>>>> By the way, did you get a chance to look at the files I have shared in
>>>> the Google Drive folder? Did you notice any problems there?
>>>>
>>>> I think using a custom ontology with Stanbol should be a very common use
>>>> case, and if there are issues getting it working, either I am doing
>>>> something terribly wrong or there are some other reasons which I don't
>>>> know.
>>>>
>>>> But anyway, I am persisting to solve this issue, and any help on this
>>>> from the dev community will be much appreciated.
>>>>
>>>> best regards
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>
>>>>>   http://{stanbol}/system/console/configMgr sorry
>>>>>
>>>>>
>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>
>>>>>> Have you checked the following?
>>>>>>
>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>
>>>>>> 2) find your EntityHub Linking engine
>>>>>>
>>>>>> 3) and then "Link ProperNouns only"
>>>>>>
>>>>>> The documentation in that configuration is quite useful, I think:
>>>>>>
>>>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>>>>> Otherwise mapping will not work as expected.
>>>>>> (enhancer.engines.linking.properNounsState)"
>>>>>>
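The effect of that switch can be pictured as a simple filter over POS-tagged tokens. This is a toy illustration using Penn Treebank-style tags, not Stanbol's code:

```python
# Toy illustration: with "Link ProperNouns only" enabled, only tokens tagged
# as proper nouns (NNP/NNPS) remain candidates for entity linking.
def linkable(tagged_tokens, proper_nouns_only=True):
    allowed = {"NNP", "NNPS"} if proper_nouns_only else {"NNP", "NNPS", "NN", "NNS"}
    return [word for word, pos in tagged_tokens if pos in allowed]

tagged = [("I", "PRP"), ("studied", "VBD"), ("at", "IN"), ("the", "DT"),
          ("University", "NNP"), ("of", "IN"), ("Salzburg", "NNP")]

print(linkable(tagged))  # ['University', 'Salzburg']
```

If the POS tag set in use is not mapped to olia:ProperNoun, as the quoted documentation warns, the set of allowed tokens is effectively empty and nothing gets linked.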
>>>>>>
>>>>>> Hope this helps. You have to take into account that these kinds of
>>>>>> issues are not easy to solve by email.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>
>>>>>>> Thanks Sergio for your response.
>>>>>>>
>>>>>>> What I understand is to enable the option *"Link ProperNouns only"* in
>>>>>>> entityhub linking and also to use the "opennlp-pos" engine in my
>>>>>>> weighted chain.
>>>>>>>
>>>>>>> I made these changes but am unable to extract "University of Salzburg".
>>>>>>>
>>>>>>> Please find below the output RDF/XML from the enhancer.
>>>>>>>
>>>>>>> Request you to please let me know if I did not understand your inputs
>>>>>>> correctly.
>>>>>>>
>>>>>>> One more thing: in our ontology (yet to be built) we will have entities
>>>>>>> other than people, places and organisations, for example belts, bags
>>>>>>> etc.
>>>>>>>
>>>>>>> best regards
>>>>>>> tarandeep
>>>>>>>
>>>>>>> <rdf:RDF
>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>>   <rdf:Description rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>>     <j.1:extracted-from rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>>     <j.1:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>>     <j.0:language>en</j.0:language>
>>>>>>>     <j.0:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>>     <j.0:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>>   </rdf:Description>
>>>>>>> </rdf:RDF>
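One way to read output like the above is to count which fise annotation types the enhancer actually emitted; here only the langdetect TextAnnotation is present and there is no EntityAnnotation for the expected entity. A small sketch using only the standard library (the helper name is made up):

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FISE = "http://fise.iks-project.eu/ontology/"

def annotation_types(rdf_xml):
    """Count enhancer annotations per fise ontology type in RDF/XML output."""
    counts = {}
    for desc in ET.fromstring(rdf_xml).findall(f"{{{RDF}}}Description"):
        for t in desc.findall(f"{{{RDF}}}type"):
            uri = t.get(f"{{{RDF}}}resource", "")
            if uri.startswith(FISE):
                name = uri[len(FISE):]
                counts[name] = counts.get(name, 0) + 1
    return counts

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <rdf:Description rdf:about="urn:enhancement-1">
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
  </rdf:Description>
</rdf:RDF>"""

print(annotation_types(sample))  # {'Enhancement': 1, 'TextAnnotation': 1}
```

If a chain is working end to end, EntityAnnotation (with fise:entity-label and fise:entity-reference) should also show up in the counts.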
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>
>>>>>>>> As I said: have you checked the proper noun detection and POS tagging
>>>>>>>> in your chain?
>>>>>>>>
>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>> Salzburg, which is based in Austria" works at the demo server:
>>>>>>>>
>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>
>>>>>>>> Here are the details:
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>
>>>>>>>>> Just to add to my previous email:
>>>>>>>>>
>>>>>>>>> If I add another individual "MyUniversity" in my ontology under the
>>>>>>>>> class University:
>>>>>>>>>
>>>>>>>>>     <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>>     <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>>         <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>         <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>>     </owl:NamedIndividual>
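A quick sanity check before debugging the engine itself is confirming that the multi-word labels really are present in the ontology file. A sketch with the standard library; the URI below is a placeholder, not the actual ontology from the Drive folder:

```python
import xml.etree.ElementTree as ET

RDFS = "http://www.w3.org/2000/01/rdf-schema#"

owl_xml = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                      xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
                      xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:NamedIndividual rdf:about="http://example.org/onto#UniversityOfSalzburg">
    <rdfs:label>University of Salzburg</rdfs:label>
  </owl:NamedIndividual>
</rdf:RDF>"""

# Collect every rdfs:label and flag the multi-word ones the thread is about.
labels = [el.text for el in ET.fromstring(owl_xml).iter(f"{{{RDFS}}}label")]
multi_word = [label for label in labels if " " in label]
print(multi_word)  # ['University of Salzburg']
```

If the multi-word labels are present in the file but never matched, the problem is in the yard or the linking configuration rather than the vocabulary itself.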
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So with all the configurations I have mentioned in the word document
>>>>>>>>> (in the Google Drive folder), when I pass text with "MyUniversity" in
>>>>>>>>> it, my enhancement chain is able to extract "MyUniversity" and link
>>>>>>>>> it with the "University" type.
>>>>>>>>>
>>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>>> "University of Salzburg".
>>>>>>>>>
>>>>>>>>> If any of you could please provide help on what we are missing to be
>>>>>>>>> able to extract custom entities which have a space in between, it
>>>>>>>>> would be a great help to proceed further on our journey with using
>>>>>>>>> and contributing to Stanbol.
>>>>>>>>>
>>>>>>>>> with best regards,
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks Sergio and Dileepa for your responses.
>>>>>>>>>>
>>>>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>>> keep just one class and one instance value "University of Salzburg"
>>>>>>>>>> in our custom ontology and try to extract this entity and also link
>>>>>>>>>> it, but we could not get this running. I am sure we are missing some
>>>>>>>>>> configurations.
>>>>>>>>>>
>>>>>>>>>> I am sharing a Google Drive folder at the link below:
>>>>>>>>>>
>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>
>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>
>>>>>>>>>> 1) A word document which shows Felix snapshots of all the
>>>>>>>>>> configurations we did while configuring the yard, yard site, entity
>>>>>>>>>> linking engine and weighted chain
>>>>>>>>>> 2) our custom ontology
>>>>>>>>>> 3) the result of a SPARQL query against our graph URI using the
>>>>>>>>>> SPARQL endpoint
>>>>>>>>>>
>>>>>>>>>> May I request you all to please look at these files and let us know
>>>>>>>>>> if we are missing something in the configurations.
>>>>>>>>>>
>>>>>>>>>> We have referred to the web links below in order to configure
>>>>>>>>>> Stanbol for using our custom ontology for entity extraction and
>>>>>>>>>> linking:
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>
>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> tarandeep
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>> sergio.fernandez@****salzburgres**earch.at <
>>>>>>>>>> http://salzburgresearch.at>
>>>>>>>>>> <sergio.fernandez@**salzburgre**search.at<http://salzburgresearch.at>
>>>>>>>>>> <se...@salzburgresearch.at>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>     Hi,
>>>>>>>>>>
>>>>>>>>>>   I'm not an expert on entity linking, but from my experience such
>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>> details
>>>>>>>>>>> at:
>>>>>>>>>>>
>>>>>>>>>>> http://stanbol.apache.org/********docs/trunk/components/**<http://stanbol.apache.org/******docs/trunk/components/**>
>>>>>>>>>>> <htt**p://stanbol.apache.org/******docs/trunk/components/**<http://stanbol.apache.org/****docs/trunk/components/**>
>>>>>>>>>>> <http:**//stanbol.apache.org/****docs/**trunk/components/**<http://stanbol.apache.org/**docs/**trunk/components/**>
>>>>>>>>>>> <ht**tp://stanbol.apache.org/****docs/trunk/components/**<http://stanbol.apache.org/**docs/trunk/components/**>
>>>>>>>>>>> enhancer/engines/******entitylinking<http://stanbol.******
>>>>>>>>>>> apache.org/docs/trunk/******components/enhancer/engines/******<http://apache.org/docs/trunk/****components/enhancer/engines/****>
>>>>>>>>>>> entitylinking<http://apache.**org/docs/trunk/**components/**
>>>>>>>>>>> enhancer/engines/****entitylinking<http://apache.org/docs/trunk/**components/enhancer/engines/**entitylinking>
>>>>>>>>>>> <http://stanbol.**apache.org/**docs/trunk/**<http://apache.org/docs/trunk/**>
>>>>>>>>>>> components/enhancer/engines/****entitylinking<http://stanbol.**
>>>>>>>>>>> apache.org/docs/trunk/**components/enhancer/engines/**
>>>>>>>>>>> entitylinking<http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking>
>>>>>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>>>>>> netiquette in
>>>>>>>>>>> mailing lists. This is an open source community; therefore
>>>>>>>>>>> messages
>>>>>>>>>>> starting with "URGENT" are not very polite. Specially sending it
>>>>>>>>>>> on
>>>>>>>>>>> Friday
>>>>>>>>>>> afternoon, when people could be already out for weekend, or even
>>>>>>>>>>> on
>>>>>>>>>>> vacations.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Sergio
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>
>>>>>>>>>>>     Hi,
>>>>>>>>>>>
>>>>>>>>>>>   I am using stanbol to extract entitiies by plugging custom
>>>>>>>>>>>> vocabulary
>>>>>>>>>>>> as
>>>>>>>>>>>> per
>>>>>>>>>>>> http://stanbol.apache.org/********docs/trunk/customvocabulary.**
>>>>>>>>>>>> ****<http://stanbol.apache.org/******docs/trunk/customvocabulary.****>
>>>>>>>>>>>> **html<http://stanbol.apache.**org/****docs/trunk/**
>>>>>>>>>>>> customvocabulary.****html<http://stanbol.apache.org/****docs/trunk/customvocabulary.****html>
>>>>>>>>>>>> <http://stanbol.apache.**org/****docs/trunk/****
>>>>>>>>>>>> customvocabulary.**html<http:/**/stanbol.apache.org/**docs/**
>>>>>>>>>>>> trunk/customvocabulary.**html<http://stanbol.apache.org/**docs/trunk/customvocabulary.**html>
>>>>>>>>>>>> <http://stanbol.apache.**org/****docs/trunk/****
>>>>>>>>>>>> customvocabulary.**
>>>>>>>>>>>> html<http://stanbol.apache.****org/docs/trunk/****
>>>>>>>>>>>> customvocabulary.html<http://**stanbol.apache.org/docs/trunk/**
>>>>>>>>>>>> customvocabulary.html<http://stanbol.apache.org/docs/trunk/customvocabulary.html>
>>>>>>>>>>>>
>>>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>>>
>>>>>>>>>>>>       Configured Clerezza Yard.
>>>>>>>>>>>>       Configured Managed Yard site.
>>>>>>>>>>>>       Updated the site by plugging ontology(containing custom
>>>>>>>>>>>> entities) .
>>>>>>>>>>>>       Configured Entity hub linking Engine(*customLinkingEngine*)
>>>>>>>>>>>> with
>>>>>>>>>>>> managed
>>>>>>>>>>>> site.
>>>>>>>>>>>>       Configured a customChain which uses following engine
>>>>>>>>>>>>
>>>>>>>>>>>>         -  *langdetect*
>>>>>>>>>>>>         - *opennlp-sentence*
>>>>>>>>>>>>         - *opennlp-token*
>>>>>>>>>>>>         - *opennlp-pos*
>>>>>>>>>>>>         - *opennlp-chunker*
>>>>>>>>>>>>         - *customLinkingEngine*
>>>>>>>>>>>>
>>>>>>>>>>>> Now, i am able to extract entities like Adidas using
>>>>>>>>>>>> *customChain*.
>>>>>>>>>>>>
>>>>>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>>>>>> space in
>>>>>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>>>>>
>>>>>>>>>>>> Chain like *dbpedia-disambiguation *(which comes bundeled with
>>>>>>>>>>>> stanbol
>>>>>>>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>>>>>>>
>>>>>>>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>>>>>> but
>>>>>>>>>>>> it
>>>>>>>>>>>> didn't work to extract above entity.
>>>>>>>>>>>>
>>>>>>>>>>>> I have invested more than a week now and running out of options
>>>>>>>>>>>> now
>>>>>>>>>>>>
>>>>>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>     --
>>>>>>>>>>>>
>>>>>>>>>>>>   Sergio Fernández
>>>>>>>>>>> Salzburg Research
>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>> --
>>>
>>> ------------------------------
>>> This message should be regarded as confidential. If you have received
>>> this email in error please notify the sender and destroy it immediately.
>>> Statements of intent shall only become binding when confirmed in hard copy
>>> by an authorised signatory.
>>>
>>> Zaizi Ltd is registered in England and Wales with the registration number
>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>> London W6 7AN.

Re: Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Hi Rafa,

I tried using SolrYard and it worked :-) So there seems to be a defect in
ClerezzaYard.

Thanks so much for pointing that out.

Do you have any information on when the next version of Stanbol is planned
to be released, and what will be covered in that release (feature/bug list,
etc.)?

Also, can I get some information on the Stanbol roadmap ahead?

Thanks again for your help.

Best regards
tarandeep
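The difference between single-token labels ("Adidas") and multi-token labels ("Tommy Hilfiger") that this thread keeps running into can be sketched in a few lines. This is an illustration of the general technique only (index labels by first token, then verify the remaining tokens in the text), with a made-up vocabulary and sentence; it is not Stanbol's actual SolrYard or ClerezzaYard lookup code:

```python
# Sketch: multi-word entity labels need more than a per-token lookup.
# Labels are indexed by their first token; a match is only reported when
# the following tokens in the text complete the whole label.
# Illustration only -- not Stanbol's actual yard implementation.

def build_index(labels):
    """Index entity labels by their lower-cased first token."""
    index = {}
    for label in labels:
        tokens = label.split()
        index.setdefault(tokens[0].lower(), []).append(tokens)
    return index

def link_entities(text, index):
    """Greedy left-to-right lookup of (possibly multi-token) labels."""
    tokens = text.split()
    found = []
    i = 0
    while i < len(tokens):
        matched = None
        for label_tokens in index.get(tokens[i].lower().strip(',.'), []):
            candidate = tokens[i:i + len(label_tokens)]
            if [t.lower().strip(',.') for t in candidate] == \
               [t.lower() for t in label_tokens]:
                matched = label_tokens
                break
        if matched:
            found.append(" ".join(matched))
            i += len(matched)
        else:
            i += 1
    return found

vocabulary = ["Adidas", "Tommy Hilfiger"]
index = build_index(vocabulary)
print(link_entities("She wore Adidas shoes and a Tommy Hilfiger belt.", index))
# -> ['Adidas', 'Tommy Hilfiger']
```

A yard whose query layer only ever resolves the token under the cursor (which is what the suspected ClerezzaYard defect would amount to) finds "Adidas" but can never complete the two-token label.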


On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
tsawhney@innodata.com> wrote:

> Thanks Rafa for your response
>
> I will try resolving this issue based on pointers you have provided and
> will post the update accordingly.
>
> Best regards
> tarandeep
>
>
> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>
>> Hi Tarandeep,
>>
>> As Sergio already pointed out, you can check several different Entity
>> Linking engine configurations at the IKS development server:
>> http://dev.iks-project.eu:8081/enhancer/chain
>> You can try to use the same configuration as some of the chains registered
>> in this Stanbol instance. For that, just go through the Felix Console (
>> http://dev.iks-project.eu:8081/system/console/configMgr/
>> ) and take a look at the different EntityHubLinkingEngine
>> configurations. You can also try to use a Keyword Linking engine instead
>> of an EntityHub Linking engine.
>>
>> Anyway, all the sites configured on this server are SolrYard based, so
>> perhaps there is a bug in the ClerezzaYard entity search process for
>> multi-word entity labels. We would need debug log messages in order to
>> find out the problem.
>>
>> Regards
>>
>> El 15/07/13 18:28, Sawhney, Tarandeep Singh escribió:
>>
>>> Hi Sergio
>>>
>>> This is exactly what I did and mentioned in my last email:
>>>
>>> *"What i understand is to enable option "Link ProperNouns only" in
>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>> chain"*
>>>
>>> I have already checked this option in my own EntityHub linking engine.
>>>
>>> By the way, did you get a chance to look at the files I shared in the
>>> Google Drive folder? Did you notice any problems there?
>>>
>>> I think using a custom ontology with Stanbol should be a very common use
>>> case, and if there are issues getting it working, either I am doing
>>> something terribly wrong or there are other reasons I don't know about.
>>>
>>> But anyway, I am persisting in solving this issue, and any help on this
>>> from the dev community will be much appreciated.
>>>
>>> best regards
>>> tarandeep
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>
>>>  http://{stanbol}/system/console/configMgr sorry
>>>>
>>>>
>>>>
>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>
>>>>  Have you checked the following:
>>>>>
>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>
>>>>>
>>>>> 2) find your EntityHub Linking engine
>>>>>
>>>>> 3) and then "Link ProperNouns only"
>>>>>
>>>>> The documentation in that configuration is quite useful I think:
>>>>>
>>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>>>> Otherwise mapping will not work as expected.
>>>>> (enhancer.engines.linking.properNounsState)"
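The effect of that properNounsState switch can be sketched as a simple POS filter in front of the linker. The Penn Treebank tags below are hard-coded assumptions for illustration; in Stanbol the tags come from the opennlp-pos engine and only count as proper nouns if the TagSet maps them to olia:ProperNoun:

```python
# Sketch of the "Link ProperNouns only" behaviour: with the flag on, only
# tokens whose POS tag is a proper-noun tag are handed to entity linking.
# Tags are hard-coded for illustration; Stanbol derives them from the
# opennlp-pos engine and its TagSet-to-olia:ProperNoun mapping.

def linkable_tokens(tagged_tokens, proper_nouns_only):
    proper_tags = {"NNP", "NNPS"}          # Penn Treebank proper-noun tags
    noun_tags = proper_tags | {"NN", "NNS"}
    wanted = proper_tags if proper_nouns_only else noun_tags
    return [tok for tok, tag in tagged_tokens if tag in wanted]

sentence = [("I", "PRP"), ("studied", "VBD"), ("at", "IN"), ("the", "DT"),
            ("University", "NNP"), ("of", "IN"), ("Salzburg", "NNP"),
            ("which", "WDT"), ("sells", "VBZ"), ("belts", "NNS")]

print(linkable_tokens(sentence, proper_nouns_only=True))
# -> ['University', 'Salzburg']
print(linkable_tokens(sentence, proper_nouns_only=False))
# -> ['University', 'Salzburg', 'belts']
```

This also matters for the plan mentioned later in the thread to link non-name entities such as belts and bags: with the flag enabled, common nouns like "belts" are never offered to the linking engine.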
>>>>>
>>>>>
>>>>> Hope this helps. You have to take into account that such issues are
>>>>> not easy to solve by email.
>>>>>
>>>>> Cheers,
>>>>>
>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>>  Thanks Sergio for your response
>>>>>>
>>>>>> What I understand is to enable the option *"Link ProperNouns only"*
>>>>>> in EntityHub linking and also to use the "opennlp-pos" engine in my
>>>>>> weighted chain.
>>>>>>
>>>>>> I made these changes but am unable to extract "University of Salzburg".
>>>>>>
>>>>>> Please find below the output RDF/XML from the enhancer.
>>>>>>
>>>>>> Request you to please let me know if I did not understand your inputs
>>>>>> correctly.
>>>>>>
>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>> entities other than people, places, and organisations, for example
>>>>>> belts, bags, etc.
>>>>>>
>>>>>> best regards
>>>>>> tarandeep
>>>>>>
>>>>>> <rdf:RDF
>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>   <rdf:Description rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>     <j.1:extracted-from rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>     <j.1:confidence rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>     <j.0:language>en</j.0:language>
>>>>>>     <j.0:created rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>     <j.0:creator rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>   </rdf:Description>
>>>>>> </rdf:RDF>
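An enhancement graph like the one above can also be inspected programmatically to see which engines actually produced annotations. A minimal sketch using only the Python standard library, with a trimmed copy of the output embedded as sample data (the rdf:datatype attributes are omitted for brevity):

```python
# Sketch: reading the enhancer's RDF/XML output with ElementTree to list
# each enhancement's creator engine, confidence, and detected language.
# The namespace URIs match the enhancer output shown above.
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
FISE = "{http://fise.iks-project.eu/ontology/}"
DCT = "{http://purl.org/dc/terms/}"

def summarize(rdf_xml):
    """Collect (creator, confidence, language) per rdf:Description."""
    root = ET.fromstring(rdf_xml)
    return [(d.findtext(DCT + "creator"),
             d.findtext(FISE + "confidence"),
             d.findtext(DCT + "language"))
            for d in root.findall(RDF + "Description")]

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://purl.org/dc/terms/"
    xmlns:j.1="http://fise.iks-project.eu/ontology/">
  <rdf:Description rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
    <j.0:language>en</j.0:language>
    <j.1:confidence>0.9999964817340454</j.1:confidence>
    <j.0:creator>org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
  </rdf:Description>
</rdf:RDF>"""

print(summarize(sample))
```

Applied to the output above, this makes the symptom easy to spot: the only enhancement present comes from the langdetect engine, i.e. the linking engine contributed nothing.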
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>
>>>>>>> As I said: have you checked the proper noun detection and POS
>>>>>>> tagging in your chain?
>>>>>>>
>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>> Salzburg,
>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>
>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>
>>>>>>> Here the details:
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>
>>>>>>>   Just to add to my previous email
>>>>>>>
>>>>>>>> If I add another individual, "MyUniversity", under the class
>>>>>>>> University in my ontology:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>        <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>
>>>>>>>>        <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>            <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>            <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>        </owl:NamedIndividual>
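The rdfs:label values in such a snippet are what the EntityHub indexes and the linking engine matches against, so a quick sanity check before uploading an ontology is to list them. A minimal sketch using only the Python standard library; the example.org URIs in the embedded snippet are placeholders, not the thread's actual ontology:

```python
# Sketch: listing the rdfs:label values of owl:NamedIndividual elements in
# an OWL/RDF-XML document -- the labels a linking engine would match on.
# The example.org URIs below are placeholders for illustration.
import xml.etree.ElementTree as ET

OWL_NS = "{http://www.w3.org/2002/07/owl#}"
RDFS_NS = "{http://www.w3.org/2000/01/rdf-schema#}"

def individual_labels(owl_xml):
    root = ET.fromstring(owl_xml)
    labels = []
    for ind in root.iter(OWL_NS + "NamedIndividual"):
        labels.extend(l.text for l in ind.findall(RDFS_NS + "label"))
    return labels

snippet = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:owl="http://www.w3.org/2002/07/owl#">
  <owl:NamedIndividual rdf:about="http://example.org/onto#MyUniversity">
    <rdfs:label>MyUniversity</rdfs:label>
  </owl:NamedIndividual>
  <owl:NamedIndividual rdf:about="http://example.org/onto#UoS">
    <rdfs:label>University of Salzburg</rdfs:label>
  </owl:NamedIndividual>
</rdf:RDF>"""

print(individual_labels(snippet))
# -> ['MyUniversity', 'University of Salzburg']
```

If a multi-word label such as "University of Salzburg" shows up here but is never linked, the problem lies in the engine or yard configuration rather than in the ontology itself.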
>>>>>>>>
>>>>>>>>
>>>>>>>> So with all the configurations I have mentioned in the Word document
>>>>>>>> (in the Google Drive folder), when I pass text with "MyUniversity"
>>>>>>>> in it, my enhancement chain is able to extract "MyUniversity" and
>>>>>>>> link it with the "University" type.
>>>>>>>>
>>>>>>>> But the same set of configurations does not work with the individual
>>>>>>>> "University of Salzburg".
>>>>>>>>
>>>>>>>> If any of you could provide help on what we are missing to be able
>>>>>>>> to extract custom entities which have a space in between, it would
>>>>>>>> be a great help to proceed further on our journey of using and
>>>>>>>> contributing to Stanbol.
>>>>>>>>
>>>>>>>> with best regards,
>>>>>>>> tarandeep
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>
>>>>>>>>    Thanks Sergio and Dileepa for your responses
>>>>>>>>
>>>>>>>>  We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>> keep
>>>>>>>>> just one class and one instance value "University of Salzburg" in
>>>>>>>>> our
>>>>>>>>> custom ontology and try to extract this entity and also link it
>>>>>>>>> but we
>>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>>> configurations.
>>>>>>>>>
>>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>>
>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This folder has 3 files:
>>>>>>>>>
>>>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>>>> configurations
>>>>>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>>>>>> weighted
>>>>>>>>> chain
>>>>>>>>> 2) our custom ontology
>>>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>>>
>>>>>>>>> May I request you all to please look at these files and let us know
>>>>>>>>> if we are missing something in the configuration.
>>>>>>>>>
>>>>>>>>> We have referred to the web links below in order to configure
>>>>>>>>> Stanbol to use our custom ontology for entity extraction and linking:
>>>>>>>>>
>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>
>>>>>>>>>    Hi,
>>>>>>>>>
>>>>>>>>>  I'm not an expert on entity linking, but from my experience such
>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>> details
>>>>>>>>>> at:
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>
>>>>>>>>>> In addition, I'd like to suggest you take a look at the netiquette
>>>>>>>>>> of mailing lists. This is an open source community; therefore,
>>>>>>>>>> messages starting with "URGENT" are not very polite, especially
>>>>>>>>>> when sent on a Friday afternoon, when people could already be out
>>>>>>>>>> for the weekend, or even on vacation.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Sergio
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>
>>>>>>>>>>    Hi,
>>>>>>>>>>
>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>> vocabulary as per
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>>
>>>>>>>>>>>      Configured Clerezza Yard.
>>>>>>>>>>>      Configured Managed Yard site.
>>>>>>>>>>>      Updated the site by plugging ontology(containing custom
>>>>>>>>>>> entities) .
>>>>>>>>>>>      Configured Entity hub linking Engine(*customLinkingEngine*)
>>>>>>>>>>> with
>>>>>>>>>>> managed
>>>>>>>>>>> site.
>>>>>>>>>>>      Configured a customChain which uses following engine
>>>>>>>>>>>
>>>>>>>>>>>        -  *langdetect*
>>>>>>>>>>>        - *opennlp-sentence*
>>>>>>>>>>>        - *opennlp-token*
>>>>>>>>>>>        - *opennlp-pos*
>>>>>>>>>>>        - *opennlp-chunker*
>>>>>>>>>>>        - *customLinkingEngine*
>>>>>>>>>>>
>>>>>>>>>>> Now, I am able to extract entities like Adidas using
>>>>>>>>>>> *customChain*.
>>>>>>>>>>>
>>>>>>>>>>> However, I am facing an issue extracting entities that have a
>>>>>>>>>>> space in between, for example "Tommy Hilfiger".
>>>>>>>>>>>
>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>> the Stanbol instance) correctly extracts entities like
>>>>>>>>>>> "Tommy Hilfiger".
>>>>>>>>>>>
>>>>>>>>>>> I tried configuring *customLinkingEngine* the same as
>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>> but it did not extract the above entity.
>>>>>>>>>>>
>>>>>>>>>>> I have invested more than a week and am running out of options.
>>>>>>>>>>>
>>>>>>>>>>> I request you to please provide help in resolving this issue.
>>>>>>>>>>>
>>>>>>>>>>>
>>
>
>
>

-- 

"This e-mail and any attachments transmitted with it are for the sole use 
of the intended recipient(s) and may contain confidential , proprietary or 
privileged information. If you are not the intended recipient, please 
contact the sender by reply e-mail and destroy all copies of the original 
message. Any unauthorized review, use, disclosure, dissemination, 
forwarding, printing or copying of this e-mail or any action taken in 
reliance on this e-mail is strictly prohibited and may be unlawful."

Re: Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Rafa for your response

I will try resolving this issue based on the pointers you have provided and
will post an update accordingly.

Best regards
tarandeep


On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:

> Hi Tarandeep,
>
> As Sergio already pointed, you can check some different Entity Linking
> engines configurations at IKS development server:
> http://dev.iks-project.eu:**8081/enhancer/chain<http://dev.iks-project.eu:8081/enhancer/chain>.
> You can try to use the same configuration of some of the chains registered
> in this Stanbol instance. For that, just go through the Felix Console (
> http://dev.iks-project.eu:**8081/system/console/configMgr/<http://dev.iks-project.eu:8081/system/console/configMgr/>
> **) and take a look to the different EntityHubLinkingEngine
> configurations. You can also try to use a Keyword Linking engine instead of
> an EntityHub Linking engine.
>
> Anyway, all the sites configured in this server are SolrYard based, so
> perhaps there is a bug in the ClerezzaYard entity search process for
> multi-words entities' labels. We might would need debug logs messages in
> order to find out the problem.
>
> Regards
>
> El 15/07/13 18:28, Sawhney, Tarandeep Singh escribió:
>
>> Hi Sergio
>>
>> This is exactly i did and i mentioned in my last email
>>
>> *"What i understand is to enable option "Link ProperNouns only" in
>>
>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>> chain"
>> *
>>
>>
>> I have already checked this option in my own entity hub linking engine
>>
>> By the way, did you get a chance to look at files i have shared in google
>> drive folder. Did you notice any problems there ?
>>
>> I think using custom ontology with stanbol should be a very common use
>> case
>> and if there are issues getting it working, either i am doing something
>> terribly wrong or there are some other reasons which i dont know.
>>
>> But anyways, i am persisting to solve this issue and any help on this from
>> this dev community will be much appreciated
>>
>> best regards
>> tarandeep
>>
>>
>>
>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>> wrote:
>>
>>  http://{stanbol}/system/****console/configMgr sorry
>>>
>>>
>>>
>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>
>>>  Have you check the
>>>>
>>>> 1) go to http://{stanbol}/config/****system/console/configMgr
>>>>
>>>>
>>>> 2) find your EntityHub Linking engine
>>>>
>>>> 3) and then "Link ProperNouns only"
>>>>
>>>> The documentation in that configuration is quite useful I think:
>>>>
>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>> requires a tag of the POS TagSet to be mapped against 'olia:PorperNoun'.
>>>> Otherwise mapping will not work as expected.
>>>> (enhancer.engines.linking.****properNounsState)"
>>>>
>>>>
>>>> Hope this help. You have to take into account such kind of issues are
>>>> not easy to solve by email.
>>>>
>>>> Cheers,
>>>>
>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>
>>>>  Thanks Sergio for your response
>>>>>
>>>>> What i understand is to enable option *"Link ProperNouns only"* in
>>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>>> chain
>>>>>
>>>>> I did these changes but unable to extract "University of Salzberg"
>>>>>
>>>>> Please find below the output RDF/XML from enhancer
>>>>>
>>>>> Request you to please let me know if i did not understand your inputs
>>>>> correctly
>>>>>
>>>>> One more thing, in our ontology (yet to be built) we will have entities
>>>>> which are other than people, places and organisations. For example,
>>>>> belts,
>>>>> bags etc
>>>>>
>>>>> best regards
>>>>> tarandeep
>>>>>
>>>>> <rdf:RDF
>>>>>       xmlns:rdf="http://www.w3.org/****1999/02/22-rdf-syntax-ns#<http://www.w3.org/**1999/02/22-rdf-syntax-ns#>
>>>>> <htt**p://www.w3.org/1999/02/22-rdf-**syntax-ns#<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>> >
>>>>> "
>>>>>       xmlns:j.0="http://purl.org/dc/****terms/<http://purl.org/dc/**terms/><
>>>>> http://purl.org/dc/terms/>"
>>>>>       xmlns:j.1="http://fise.iks-**p**roject.eu/ontology/<http://project.eu/ontology/>
>>>>> xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>     <rdf:Description
>>>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>       <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>       <j.1:extracted-from
>>>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>
>>>>>       <j.1:confidence
>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>
>>>>>       <rdf:type
>>>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>       <rdf:type
>>>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>       <j.0:language>en</j.0:language>
>>>>>       <j.0:created
>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>
>>>>>       <j.0:creator
>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>
>>>>>     </rdf:Description>
>>>>> </rdf:RDF>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>
>>>>>>   As I said: have you check the proper noun detection and POS tagging in
>>>>>> your chain?
>>>>>>
>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>> Salzburg,
>>>>>> which is based in Austria" works at the demo server:
>>>>>>
>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>
>>>>>> Here the details:
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>
>>>>>>   Just to add to my previous email
>>>>>>
>>>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>>>> University
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>        <!--
>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>>>>
>>>>>>>        <owl:NamedIndividual rdf:about="
>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>            <rdf:type rdf:resource="
>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>            <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>
>>>>>>>        </owl:NamedIndividual>
>>>>>>>
>>>>>>>
>>>>>>> So with all configurations i have mentioned in the word document (in
>>>>>>> google
>>>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>>>> enhancement
>>>>>>> chain is able to extract "MyUniversity" and link it with
>>>>>>> "University" type
>>>>>>>
>>>>>>> But same set of configurations doesn't work with individual
>>>>>>> "University of
>>>>>>> Salzburg"
>>>>>>>
>>>>>>> If anyone of you please provide help on what are we missing to be
>>>>>>> able to
>>>>>>> extract custom entities which has space in between, will be a great
>>>>>>> help
>>>>>>> to
>>>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>>>
>>>>>>> with best regards,
>>>>>>> tarandeep
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>
>>>>>>>    Thanks Sergio and Dileepa for your responses
>>>>>>>
>>>>>>>  We haven't been able to resolve the issue. We therefore decided to
>>>>>>>> keep
>>>>>>>> just one class and one instance value "University of Salzburg" in
>>>>>>>> our
>>>>>>>> custom ontology and try to extract this entity and also link it but
>>>>>>>> we
>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>> configurations.
>>>>>>>>
>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>
>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>
>>>>>>>>
>>>>>>>> This folder has 3 files:
>>>>>>>>
>>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>>> configurations
>>>>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>>>>> weighted
>>>>>>>> chain
>>>>>>>> 2) our custom ontology
>>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>>
>>>>>>>> May i request you all to please look at these files and let us know
>>>>>>>> if we
>>>>>>>> are missing something in configurations.
>>>>>>>>
>>>>>>>> We have referred to below web links in order to configure stanbol
>>>>>>>> for
>>>>>>>> using our custom ontology for entity extraction and linking
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> tarandeep
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>
>>>>>>>>    Hi,
>>>>>>>>
>>>>>>>>  I'm not an expert on entity linking, but from my experience such
>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>> details
>>>>>>>>> at:
>>>>>>>>>
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>
>>>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>>>> netiquette in
>>>>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>>>>> Friday
>>>>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>>>>> vacations.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Sergio
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>
>>>>>>>>>    Hi,
>>>>>>>>>
>>>>>>>>>  I am using stanbol to extract entitiies by plugging custom
>>>>>>>>>> vocabulary
>>>>>>>>>> as
>>>>>>>>>> per
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>
>>>>>>>>>>      Configured Clerezza Yard.
>>>>>>>>>>      Configured Managed Yard site.
>>>>>>>>>>      Updated the site by plugging ontology(containing custom
>>>>>>>>>> entities) .
>>>>>>>>>>      Configured Entity hub linking Engine(*customLinkingEngine*)
>>>>>>>>>> with
>>>>>>>>>> managed
>>>>>>>>>> site.
>>>>>>>>>>      Configured a customChain which uses following engine
>>>>>>>>>>
>>>>>>>>>>        -  *langdetect*
>>>>>>>>>>        - *opennlp-sentence*
>>>>>>>>>>        - *opennlp-token*
>>>>>>>>>>        - *opennlp-pos*
>>>>>>>>>>        - *opennlp-chunker*
>>>>>>>>>>        - *customLinkingEngine*
>>>>>>>>>>
>>>>>>>>>> Now, i am able to extract entities like Adidas using
>>>>>>>>>> *customChain*.
>>>>>>>>>>
>>>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>>>> space in
>>>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>>>
>>>>>>>>>> Chain like *dbpedia-disambiguation *(which comes bundeled with
>>>>>>>>>> stanbol
>>>>>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>>>>>
>>>>>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>>>> but
>>>>>>>>>> it
>>>>>>>>>> didn't work to extract above entity.
>>>>>>>>>>
>>>>>>>>>> I have invested more than a week now and running out of options
>>>>>>>>>> now
>>>>>>>>>>
>>>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>    --
>>>>>>>>>>
>>>>>>>>>>  Sergio Fernández
>>>>>>>>> Salzburg Research
>>>>>>>>> +43 662 2288 318
>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>    --
>>>>>>>
>>>>>> Sergio Fernández
>>>>>> Salzburg Research
>>>>>> +43 662 2288 318
>>>>>> Jakob-Haringer Strasse 5/II
>>>>>> A-5020 Salzburg (Austria)
>>>>>> http://www.salzburgresearch.at
>>>>>>
>>>>>>
>>>>>>  --
>>> Sergio Fernández
>>> Salzburg Research
>>> +43 662 2288 318
>>> Jakob-Haringer Strasse 5/II
>>> A-5020 Salzburg (Austria)
>>> http://www.salzburgresearch.at
>>>
>>>
>
> --
>
> ------------------------------
> This message should be regarded as confidential. If you have received this
> email in error please notify the sender and destroy it immediately.
> Statements of intent shall only become binding when confirmed in hard copy
> by an authorised signatory.
>
> Zaizi Ltd is registered in England and Wales with the registration number
> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> London W6 7AN.


Re: Working with custom vocabulary

Posted by Rafa Haro <rh...@zaizi.com>.
Hi Tarandeep,

As Sergio already pointed out, you can check several different Entity Linking 
engine configurations at the IKS development server: 
http://dev.iks-project.eu:8081/enhancer/chain. You can try to use the 
same configuration as one of the chains registered in this Stanbol 
instance. For that, just go through the Felix Console 
(http://dev.iks-project.eu:8081/system/console/configMgr/) and take a 
look at the different EntityHubLinkingEngine configurations. You can 
also try to use a Keyword Linking engine instead of an EntityHub Linking 
engine.
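To compare a chain's output quickly, you can post plain text to the enhancer endpoint from a small script. The following is a minimal sketch using only the Python standard library; the demo server URL and the `dbpedia-proper-noun` chain name are taken from this thread, and you would substitute your own Stanbol host and custom chain:

```python
import urllib.request


def build_enhance_request(base_url: str, chain: str, text: str) -> urllib.request.Request:
    """Build a POST request against the Stanbol enhancer for a given chain.

    The enhancer accepts plain text and returns the enhancement graph in
    the format named by the Accept header (RDF/XML here, as in this thread).
    """
    url = "%s/enhancer/chain/%s" % (base_url.rstrip("/"), chain)
    return urllib.request.Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/rdf+xml",
        },
        method="POST",
    )


if __name__ == "__main__":
    req = build_enhance_request(
        "http://dev.iks-project.eu:8081",  # demo server mentioned in the thread
        "dbpedia-proper-noun",
        "I studied at the University of Salzburg, which is based in Austria",
    )
    # Uncomment to actually send the request (requires network access):
    # print(urllib.request.urlopen(req).read().decode("utf-8"))
    print(req.full_url)
```

Running the same text through both the working demo chain and your customChain makes it easy to diff the resulting enhancement graphs.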

Anyway, all the sites configured in this server are SolrYard based, so 
perhaps there is a bug in the ClerezzaYard entity search process for 
multi-word entity labels. We would need debug log messages in 
order to track down the problem.
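The multi-word symptom can be illustrated outside Stanbol with a toy matcher. This is only a sketch of the general idea, not Stanbol's linking code: a lookup that considers one token at a time finds "Adidas" but can never find "Tommy Hilfiger", whereas trying the longest token window first does:

```python
def link_entities(tokens, vocabulary):
    """Greedy longest-match lookup of vocabulary labels in a token stream.

    vocabulary maps a label (possibly multi-word) to an entity URI.
    Multi-word labels are only found because the longest token window
    is tried first instead of matching token-by-token.
    """
    max_len = max(len(label.split()) for label in vocabulary)
    matches = []
    i = 0
    while i < len(tokens):
        # Try windows from the longest down to a single token.
        for window in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + window])
            if candidate in vocabulary:
                matches.append((candidate, vocabulary[candidate]))
                i += window
                break
        else:
            i += 1
    return matches


vocab = {
    "Adidas": "urn:ex:Adidas",
    "Tommy Hilfiger": "urn:ex:TommyHilfiger",
    "University of Salzburg": "urn:ex:UniSalzburg",
}
tokens = "I studied at the University of Salzburg wearing Adidas".split()
print(link_entities(tokens, vocab))
```

If the Yard's label search only ever receives single tokens (or the POS/chunker data never marks the phrase as a proper noun), the multi-word entry is unreachable no matter how the vocabulary is loaded, which is why the debug logs would be useful.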

Regards

El 15/07/13 18:28, Sawhney, Tarandeep Singh escribió:
> Hi Sergio
>
> This is exactly i did and i mentioned in my last email
>
> *"What i understand is to enable option "Link ProperNouns only" in
> entityhub linking and also to use "opennlp-pos" engine in my weighted chain"
> *
>
> I have already checked this option in my own entity hub linking engine
>
> By the way, did you get a chance to look at files i have shared in google
> drive folder. Did you notice any problems there ?
>
> I think using custom ontology with stanbol should be a very common use case
> and if there are issues getting it working, either i am doing something
> terribly wrong or there are some other reasons which i dont know.
>
> But anyways, i am persisting to solve this issue and any help on this from
> this dev community will be much appreciated
>
> best regards
> tarandeep
>
>
>
> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
> sergio.fernandez@salzburgresearch.at> wrote:
>
>> http://{stanbol}/system/console/configMgr sorry
>>
>>
>> On 15/07/13 18:15, Sergio Fernández wrote:
>>
>>> Have you check the
>>>
>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>
>>> 2) find your EntityHub Linking engine
>>>
>>> 3) and then "Link ProperNouns only"
>>>
>>> The documentation in that configuration is quite useful I think:
>>>
>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>> If deactivated any Noun will be matched. NOTE that this parameter
>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>> Otherwise mapping will not work as expected.
>>> (enhancer.engines.linking.properNounsState)"
>>>
>>> Hope this help. You have to take into account such kind of issues are
>>> not easy to solve by email.
>>>
>>> Cheers,
>>>
>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Thanks Sergio for your response
>>>>
>>>> What i understand is to enable option *"Link ProperNouns only"* in
>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>> chain
>>>>
>>>> I did these changes but unable to extract "University of Salzburg"
>>>>
>>>> Please find below the output RDF/XML from enhancer
>>>>
>>>> Request you to please let me know if i did not understand your inputs
>>>> correctly
>>>>
>>>> One more thing, in our ontology (yet to be built) we will have entities
>>>> which are other than people, places and organisations. For example,
>>>> belts,
>>>> bags etc
>>>>
>>>> best regards
>>>> tarandeep
>>>>
>>>> <rdf:RDF
>>>>       xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>       xmlns:j.0="http://purl.org/dc/terms/"
>>>>       xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>     <rdf:Description
>>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>       <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>       <j.1:extracted-from
>>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>
>>>>       <j.1:confidence
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>
>>>>       <rdf:type
>>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>       <rdf:type
>>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>       <j.0:language>en</j.0:language>
>>>>       <j.0:created
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>
>>>>       <j.0:creator
>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>
>>>>     </rdf:Description>
>>>> </rdf:RDF>
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>
>>>>   As I said: have you check the proper noun detection and POS tagging in
>>>>> your chain?
>>>>>
>>>>> For instance, enhancing the text "I studied at the University of
>>>>> Salzburg,
>>>>> which is based in Austria" works at the demo server:
>>>>>
>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>
>>>>> Here the details:
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>>   Just to add to my previous email
>>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>>> University
>>>>>>
>>>>>>
>>>>>>
>>>>>>        <!--
>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>>>
>>>>>>        <owl:NamedIndividual rdf:about="
>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>            <rdf:type rdf:resource="
>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>            <rdfs:label>MyUniversity</rdfs:label>
>>>>>>        </owl:NamedIndividual>
>>>>>>
>>>>>>
>>>>>> So with all configurations i have mentioned in the word document (in
>>>>>> google
>>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>>> enhancement
>>>>>> chain is able to extract "MyUniversity" and link it with
>>>>>> "University" type
>>>>>>
>>>>>> But same set of configurations doesn't work with individual
>>>>>> "University of
>>>>>> Salzburg"
>>>>>>
>>>>>> If anyone of you please provide help on what are we missing to be
>>>>>> able to
>>>>>> extract custom entities which has space in between, will be a great
>>>>>> help
>>>>>> to
>>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>>
>>>>>> with best regards,
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>> tsawhney@innodata.com> wrote:
>>>>>>
>>>>>>    Thanks Sergio and Dileepa for your responses
>>>>>>
>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>> keep
>>>>>>> just one class and one instance value "University of Salzburg" in our
>>>>>>> custom ontology and try to extract this entity and also link it but we
>>>>>>> could not get this running. I am sure we are missing some
>>>>>>> configurations.
>>>>>>>
>>>>>>> I am sharing a google drive folder at below link
>>>>>>>
>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>
>>>>>>> This folder has 3 files:
>>>>>>>
>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>> configurations
>>>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>>>> weighted
>>>>>>> chain
>>>>>>> 2) our custom ontology
>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>
>>>>>>> May i request you all to please look at these files and let us know
>>>>>>> if we
>>>>>>> are missing something in configurations.
>>>>>>>
>>>>>>> We have referred to below web links in order to configure stanbol for
>>>>>>> using our custom ontology for entity extraction and linking
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>
>>>>>>> Thanks in advance for your valuable help.
>>>>>>>
>>>>>>> Best regards
>>>>>>> tarandeep
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>
>>>>>>>    Hi,
>>>>>>>
>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>> details
>>>>>>>> at:
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>
>>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>>> netiquette in
>>>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>>>> Friday
>>>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>>>> vacations.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Sergio
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>
>>>>>>>>    Hi,
>>>>>>>>
>>>>>>>>> I am using stanbol to extract entitiies by plugging custom
>>>>>>>>> vocabulary
>>>>>>>>> as
>>>>>>>>> per
>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>
>>>>>>>>> Following are the steps followed -
>>>>>>>>>
>>>>>>>>>      Configured Clerezza Yard.
>>>>>>>>>      Configured Managed Yard site.
>>>>>>>>>      Updated the site by plugging ontology(containing custom
>>>>>>>>> entities) .
>>>>>>>>>      Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>>>> managed
>>>>>>>>> site.
>>>>>>>>>      Configured a customChain which uses following engine
>>>>>>>>>
>>>>>>>>>        -  *langdetect*
>>>>>>>>>        - *opennlp-sentence*
>>>>>>>>>        - *opennlp-token*
>>>>>>>>>        - *opennlp-pos*
>>>>>>>>>        - *opennlp-chunker*
>>>>>>>>>        - *customLinkingEngine*
>>>>>>>>>
>>>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>>>
>>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>>> space in
>>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>>
>>>>>>>>> Chain like *dbpedia-disambiguation *(which comes bundeled with
>>>>>>>>> stanbol
>>>>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>>>>
>>>>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>>> but
>>>>>>>>> it
>>>>>>>>> didn't work to extract above entity.
>>>>>>>>>
>>>>>>>>> I have invested more than a week now and running out of options now
>>>>>>>>>
>>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>    --
>>>>>>>>>
>>>>>>>> Sergio Fernández
>>>>>>>> Salzburg Research
>>>>>>>> +43 662 2288 318
>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>   --
>>>>> Sergio Fernández
>>>>> Salzburg Research
>>>>> +43 662 2288 318
>>>>> Jakob-Haringer Strasse 5/II
>>>>> A-5020 Salzburg (Austria)
>>>>> http://www.salzburgresearch.at
>>>>>
>>>>>
>> --
>> Sergio Fernández
>> Salzburg Research
>> +43 662 2288 318
>> Jakob-Haringer Strasse 5/II
>> A-5020 Salzburg (Austria)
>> http://www.salzburgresearch.at
>>


-- 

------------------------------
This message should be regarded as confidential. If you have received this 
email in error please notify the sender and destroy it immediately. 
Statements of intent shall only become binding when confirmed in hard copy 
by an authorised signatory.

Zaizi Ltd is registered in England and Wales with the registration number 
6440931. The Registered Office is Brook House, 229 Shepherds Bush Road, 
London W6 7AN. 

Re: Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Hi Sergio

This is exactly what I did, as I mentioned in my last email:

*"What i understand is to enable option "Link ProperNouns only" in
entityhub linking and also to use "opennlp-pos" engine in my weighted chain"
*

I have already checked this option in my own EntityHub linking engine.

By the way, did you get a chance to look at the files I shared in the Google
Drive folder? Did you notice any problems there?

I think using a custom ontology with Stanbol should be a very common use case,
and if there are issues getting it working, either I am doing something
terribly wrong or there are other reasons I don't know about.

Anyway, I am persisting with this issue, and any help from the dev
community will be much appreciated.

best regards
tarandeep
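[Editor's note] To illustrate the multi-word matching problem discussed in this thread: an entity-linking engine has to compare *sequences* of consecutive tokens against vocabulary labels, not just single tokens. The sketch below is hypothetical and is not Stanbol's actual implementation; the vocabulary and sentence are made up for illustration.

```python
# Minimal sketch of multi-token label lookup (not Stanbol's code).
# Labels are indexed by their first token; at each position we try the
# longest matching label first, so "Tommy Hilfiger" beats a bare "Tommy".

def build_index(labels):
    index = {}
    for label in labels:
        tokens = label.split()
        index.setdefault(tokens[0].lower(), []).append(tokens)
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)  # longest match first
    return index

def link(tokens, index):
    matches, i = [], 0
    while i < len(tokens):
        for cand in index.get(tokens[i].lower(), []):
            n = len(cand)
            if [t.lower() for t in tokens[i:i + n]] == [c.lower() for c in cand]:
                matches.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            i += 1  # no label starts at this token
    return matches

vocabulary = ["Adidas", "Tommy Hilfiger", "University of Salzburg"]
index = build_index(vocabulary)
tokens = "I studied at the University of Salzburg and wear Adidas".split()
print(link(tokens, index))  # ['University of Salzburg', 'Adidas']
```

A lookup that only ever considers one token at a time would find "Adidas" but never "Tommy Hilfiger" or "University of Salzburg", which matches the symptom reported in this thread.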



On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
sergio.fernandez@salzburgresearch.at> wrote:

> http://{stanbol}/system/console/configMgr sorry
>
>
> On 15/07/13 18:15, Sergio Fernández wrote:
>
>> Have you checked the following?
>>
>> 1) go to http://{stanbol}/config/system/console/configMgr
>>
>> 2) find your EntityHub Linking engine
>>
>> 3) and then "Link ProperNouns only"
>>
>> The documentation in that configuration is quite useful I think:
>>
>> "If activated only ProperNouns will be matched against the Vocabulary.
>> If deactivated any Noun will be matched. NOTE that this parameter
>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>> Otherwise mapping will not work as expected.
>> (enhancer.engines.linking.properNounsState)"
>>
>> Hope this help. You have to take into account such kind of issues are
>> not easy to solve by email.
>>
>> Cheers,
>>
>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>
>>> Thanks Sergio for your response
>>>
>>> What i understand is to enable option *"Link ProperNouns only"* in
>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>> chain
>>>
>>> I made these changes but am still unable to extract "University of Salzburg"
>>>
>>> Please find below the output RDF/XML from enhancer
>>>
>>> Request you to please let me know if i did not understand your inputs
>>> correctly
>>>
>>> One more thing, in our ontology (yet to be built) we will have entities
>>> which are other than people, places and organisations. For example,
>>> belts,
>>> bags etc
>>>
>>> best regards
>>> tarandeep
>>>
>>> <rdf:RDF
>>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>      xmlns:j.0="http://purl.org/dc/terms/"
>>>      xmlns:j.1="http://fise.iks-project.eu/ontology/" >
>>>    <rdf:Description
>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>      <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>      <j.1:extracted-from
>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>      <j.1:confidence
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>      <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>      <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>      <j.0:language>en</j.0:language>
>>>      <j.0:created
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>      <j.0:creator
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>    </rdf:Description>
>>> </rdf:RDF>
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>>> wrote:
>>>
>>>  As I said: have you check the proper noun detection and POS tagging in
>>>> your chain?
>>>>
>>>> For instance, enhancing the text "I studied at the University of
>>>> Salzburg,
>>>> which is based in Austria" works at the demo server:
>>>>
>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>
>>>>
>>>> Here the details:
>>>>
>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>
>>>>
>>>> Cheers,
>>>>
>>>>
>>>>
>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>
>>>>  Just to add to my previous email
>>>>>
>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>> University
>>>>>
>>>>>
>>>>>
>>>>>       <!--
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>>
>>>>>
>>>>>>
>>>>>       <owl:NamedIndividual rdf:about="
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>>>> ">
>>>>>           <rdf:type rdf:resource="
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
>>>>> "/>
>>>>>           <rdfs:label>MyUniversity</rdfs:label>
>>>>>       </owl:NamedIndividual>
>>>>>
>>>>>
>>>>> So with all configurations i have mentioned in the word document (in
>>>>> google
>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>> enhancement
>>>>> chain is able to extract "MyUniversity" and link it with
>>>>> "University" type
>>>>>
>>>>> But same set of configurations doesn't work with individual
>>>>> "University of
>>>>> Salzburg"
>>>>>
>>>>> If anyone of you please provide help on what are we missing to be
>>>>> able to
>>>>> extract custom entities which has space in between, will be a great
>>>>> help
>>>>> to
>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>
>>>>> with best regards,
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>> tsawhney@innodata.com> wrote:
>>>>>
>>>>>   Thanks Sergio and Dileepa for your responses
>>>>>
>>>>>>
>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>> keep
>>>>>> just one class and one instance value "University of Salzburg" in our
>>>>>> custom ontology and try to extract this entity and also link it but we
>>>>>> could not get this running. I am sure we are missing some
>>>>>> configurations.
>>>>>>
>>>>>> I am sharing a google drive folder at below link
>>>>>>
>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>
>>>>>>
>>>>>> This folder has 3 files:
>>>>>>
>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>> configurations
>>>>>> we did while configuring Yard, yardsite, entity linking engine and
>>>>>> weighted
>>>>>> chain
>>>>>> 2) our custom ontology
>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>
>>>>>> May i request you all to please look at these files and let us know
>>>>>> if we
>>>>>> are missing something in configurations.
>>>>>>
>>>>>> We have referred to below web links in order to configure stanbol for
>>>>>> using our custom ontology for entity extraction and linking
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>
>>>>>>
>>>>>> Thanks in advance for your valuable help.
>>>>>>
>>>>>> Best regards
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>> sergio.fernandez@**salzburgres**earch.at <http://salzburgresearch.at>
>>>>>> <se...@salzburgresearch.at>
>>>>>> >>
>>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>   Hi,
>>>>>>
>>>>>>>
>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>> details
>>>>>>> at:
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> In addition, I'd like to suggest that you take a look at the
>>>>>>> netiquette of mailing lists. This is an open source community;
>>>>>>> therefore messages starting with "URGENT" are not very polite,
>>>>>>> especially when sent on a Friday afternoon, when people may
>>>>>>> already be out for the weekend, or even on vacation.
>>>>>>>
>>>>>>> Best,
>>>>>>> Sergio
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>
>>>>>>>   Hi,
>>>>>>>
>>>>>>>>
>>>>>>>> I am using Stanbol to extract entities by plugging custom
>>>>>>>> vocabulary
>>>>>>>> as
>>>>>>>> per
>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Following are the steps followed -
>>>>>>>>
>>>>>>>>     Configured Clerezza Yard.
>>>>>>>>     Configured Managed Yard site.
>>>>>>>>     Updated the site by plugging ontology(containing custom
>>>>>>>> entities) .
>>>>>>>>     Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>>> managed
>>>>>>>> site.
>>>>>>>>     Configured a customChain which uses following engine
>>>>>>>>
>>>>>>>>       -  *langdetect*
>>>>>>>>       - *opennlp-sentence*
>>>>>>>>       - *opennlp-token*
>>>>>>>>       - *opennlp-pos*
>>>>>>>>       - *opennlp-chunker*
>>>>>>>>       - *customLinkingEngine*
>>>>>>>>
>>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>>
>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>> space in
>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>
>>>>>>>> Chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>> stanbol
>>>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>>>
>>>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>> but
>>>>>>>> it
>>>>>>>> didn't work to extract above entity.
>>>>>>>>
>>>>>>>> I have invested more than a week in this and am running out of options.
>>>>>>>>
>>>>>>>> I request your help in resolving this issue.
>>>>>>>>
>>>>>>>>
>>>>>>>>   --
>>>>>>>>
>>>>>>> Sergio Fernández
>>>>>>> Salzburg Research
>>>>>>> +43 662 2288 318
>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>> A-5020 Salzburg (Austria)
>>>>>>> http://www.salzburgresearch.at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>  --
>>>> Sergio Fernández
>>>> Salzburg Research
>>>> +43 662 2288 318
>>>> Jakob-Haringer Strasse 5/II
>>>> A-5020 Salzburg (Austria)
>>>> http://www.salzburgresearch.at
>>>>
>>>>
>>>
>>
> --
> Sergio Fernández
> Salzburg Research
> +43 662 2288 318
> Jakob-Haringer Strasse 5/II
> A-5020 Salzburg (Austria)
> http://www.salzburgresearch.at
>

-- 

"This e-mail and any attachments transmitted with it are for the sole use 
of the intended recipient(s) and may contain confidential , proprietary or 
privileged information. If you are not the intended recipient, please 
contact the sender by reply e-mail and destroy all copies of the original 
message. Any unauthorized review, use, disclosure, dissemination, 
forwarding, printing or copying of this e-mail or any action taken in 
reliance on this e-mail is strictly prohibited and may be unlawful."

Re: Working with custom vocabulary

Posted by Sergio Fernández <se...@salzburgresearch.at>.
http://{stanbol}/system/console/configMgr sorry

On 15/07/13 18:15, Sergio Fernández wrote:
> Have you checked the following?
>
> 1) go to http://{stanbol}/config/system/console/configMgr
>
> 2) find your EntityHub Linking engine
>
> 3) and then "Link ProperNouns only"
>
> The documentation in that configuration is quite useful I think:
>
> "If activated only ProperNouns will be matched against the Vocabulary.
> If deactivated any Noun will be matched. NOTE that this parameter
> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
> Otherwise mapping will not work as expected.
> (enhancer.engines.linking.properNounsState)"
>
> Hope this help. You have to take into account such kind of issues are
> not easy to solve by email.
>
> Cheers,
>
> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>> Thanks Sergio for your response
>>
>> What i understand is to enable option *"Link ProperNouns only"* in
>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>> chain
>>
>> I made these changes but am still unable to extract "University of Salzburg"
>>
>> Please find below the output RDF/XML from enhancer
>>
>> Request you to please let me know if i did not understand your inputs
>> correctly
>>
>> One more thing, in our ontology (yet to be built) we will have entities
>> which are other than people, places and organisations. For example,
>> belts,
>> bags etc
>>
>> best regards
>> tarandeep
>>
>> <rdf:RDF
>>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>      xmlns:j.0="http://purl.org/dc/terms/"
>>      xmlns:j.1="http://fise.iks-project.eu/ontology/" >
>>    <rdf:Description
>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>      <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>      <j.1:extracted-from
>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>
>>      <j.1:confidence
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>
>>      <rdf:type
>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>      <rdf:type
>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>      <j.0:language>en</j.0:language>
>>      <j.0:created
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>
>>      <j.0:creator
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>
>>    </rdf:Description>
>> </rdf:RDF>
>>
>>
>>
>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>> sergio.fernandez@salzburgresearch.at> wrote:
>>
>>> As I said: have you check the proper noun detection and POS tagging in
>>> your chain?
>>>
>>> For instance, enhancing the text "I studied at the University of
>>> Salzburg,
>>> which is based in Austria" works at the demo server:
>>>
>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>
>>>
>>> Here the details:
>>>
>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>
>>>
>>> Cheers,
>>>
>>>
>>>
>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Just to add to my previous email
>>>>
>>>> If i add another individual in my ontology "MyUniversity" under class
>>>> University
>>>>
>>>>
>>>>
>>>>       <!--
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>
>>>>>
>>>>
>>>>       <owl:NamedIndividual rdf:about="
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>>>
>>>> ">
>>>>           <rdf:type rdf:resource="
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
>>>>
>>>> "/>
>>>>           <rdfs:label>MyUniversity</rdfs:label>
>>>>       </owl:NamedIndividual>
>>>>
>>>>
>>>> So with all configurations i have mentioned in the word document (in
>>>> google
>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>> enhancement
>>>> chain is able to extract "MyUniversity" and link it with
>>>> "University" type
>>>>
>>>> But same set of configurations doesn't work with individual
>>>> "University of
>>>> Salzburg"
>>>>
>>>> If anyone of you please provide help on what are we missing to be
>>>> able to
>>>> extract custom entities which has space in between, will be a great
>>>> help
>>>> to
>>>> proceed further on our journey with using and contributing to stanbol
>>>>
>>>> with best regards,
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>> tsawhney@innodata.com> wrote:
>>>>
>>>>   Thanks Sergio and Dileepa for your responses
>>>>>
>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>> keep
>>>>> just one class and one instance value "University of Salzburg" in our
>>>>> custom ontology and try to extract this entity and also link it but we
>>>>> could not get this running. I am sure we are missing some
>>>>> configurations.
>>>>>
>>>>> I am sharing a google drive folder at below link
>>>>>
>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>
>>>>>
>>>>> This folder has 3 files:
>>>>>
>>>>> 1) A word document which shows felix snapshots of what all
>>>>> configurations
>>>>> we did while configuring Yard, yardsite, entity linking engine and
>>>>> weighted
>>>>> chain
>>>>> 2) our custom ontology
>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>
>>>>> May i request you all to please look at these files and let us know
>>>>> if we
>>>>> are missing something in configurations.
>>>>>
>>>>> We have referred to below web links in order to configure stanbol for
>>>>> using our custom ontology for entity extraction and linking
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>
>>>>>
>>>>> Thanks in advance for your valuable help.
>>>>>
>>>>> Best regards
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>>>>>
>>>>> wrote:
>>>>>
>>>>>   Hi,
>>>>>>
>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>> details
>>>>>> at:
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> In addition, I'd like to suggest that you take a look at the
>>>>>> netiquette of mailing lists. This is an open source community;
>>>>>> therefore messages starting with "URGENT" are not very polite,
>>>>>> especially when sent on a Friday afternoon, when people may
>>>>>> already be out for the weekend, or even on vacation.
>>>>>>
>>>>>> Best,
>>>>>> Sergio
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>
>>>>>>   Hi,
>>>>>>>
>>>>>>> I am using Stanbol to extract entities by plugging custom
>>>>>>> vocabulary
>>>>>>> as
>>>>>>> per
>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Following are the steps followed -
>>>>>>>
>>>>>>>     Configured Clerezza Yard.
>>>>>>>     Configured Managed Yard site.
>>>>>>>     Updated the site by plugging ontology(containing custom
>>>>>>> entities) .
>>>>>>>     Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>> managed
>>>>>>> site.
>>>>>>>     Configured a customChain which uses following engine
>>>>>>>
>>>>>>>       -  *langdetect*
>>>>>>>       - *opennlp-sentence*
>>>>>>>       - *opennlp-token*
>>>>>>>       - *opennlp-pos*
>>>>>>>       - *opennlp-chunker*
>>>>>>>       - *customLinkingEngine*
>>>>>>>
>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>
>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>> space in
>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>
>>>>>>> Chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>> stanbol
>>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>>
>>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>> but
>>>>>>> it
>>>>>>> didn't work to extract above entity.
>>>>>>>
>>>>>>> I have invested more than a week in this and am running out of options.
>>>>>>>
>>>>>>> I request your help in resolving this issue.
>>>>>>>
>>>>>>>
>>>>>>>   --
>>>>>> Sergio Fernández
>>>>>> Salzburg Research
>>>>>> +43 662 2288 318
>>>>>> Jakob-Haringer Strasse 5/II
>>>>>> A-5020 Salzburg (Austria)
>>>>>> http://www.salzburgresearch.at
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>> Sergio Fernández
>>> Salzburg Research
>>> +43 662 2288 318
>>> Jakob-Haringer Strasse 5/II
>>> A-5020 Salzburg (Austria)
>>> http://www.salzburgresearch.at
>>>
>>
>

-- 
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at

Re: Working with custom vocabulary

Posted by Sergio Fernández <se...@salzburgresearch.at>.
Have you checked the following?

1) go to http://{stanbol}/config/system/console/configMgr

2) find your EntityHub Linking engine

3) and then "Link ProperNouns only"

The documentation in that configuration is quite useful I think:

"If activated only ProperNouns will be matched against the Vocabulary.
If deactivated any Noun will be matched. NOTE that this parameter
requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
Otherwise mapping will not work as expected.
(enhancer.engines.linking.properNounsState)"

Hope this helps. Bear in mind that this kind of issue is not easy to
solve by email.

Cheers,
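[Editor's note] The effect of the properNounsState switch described above can be sketched as follows. The POS tags and helper function here are hypothetical, not Stanbol code; the point is that when "Link ProperNouns only" is active, only tokens the POS engine tags as proper nouns ever reach the vocabulary lookup, which is also why common-noun entities such as "belts" and "bags" need the option to be off (or the linking engine's matched noun types widened).

```python
# Illustrative sketch (not Stanbol code): which tokens survive the
# proper-noun filter before vocabulary lookup. Penn Treebank-style tags
# are assumed: NNP = proper noun, NN/NNS = common noun (sing./plural).

def linkable_tokens(tagged_tokens, proper_nouns_only=True):
    """Return tokens eligible for entity lookup.

    tagged_tokens: list of (token, pos) pairs.
    """
    if proper_nouns_only:
        return [t for t, pos in tagged_tokens if pos == "NNP"]
    # with the option off, any noun (NN, NNS, NNP, ...) is considered
    return [t for t, pos in tagged_tokens if pos.startswith("NN")]

tagged = [("University", "NNP"), ("of", "IN"), ("Salzburg", "NNP"),
          ("belts", "NNS"), ("bags", "NNS")]

print(linkable_tokens(tagged))                           # proper nouns only
print(linkable_tokens(tagged, proper_nouns_only=False))  # also belts, bags
```

If the POS model does not tag "University" and "Salzburg" as proper nouns (or the tag set is not mapped to olia:ProperNoun), those tokens never reach the lookup and the multi-word entity cannot be found.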

On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
> Thanks Sergio for your response
>
> What i understand is to enable option *"Link ProperNouns only"* in
> entityhub linking and also to use "opennlp-pos" engine in my weighted chain
>
> I made these changes but am still unable to extract "University of Salzburg"
>
> Please find below the output RDF/XML from enhancer
>
> Request you to please let me know if i did not understand your inputs
> correctly
>
> One more thing, in our ontology (yet to be built) we will have entities
> which are other than people, places and organisations. For example, belts,
> bags etc
>
> best regards
> tarandeep
>
> <rdf:RDF
>      xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>      xmlns:j.0="http://purl.org/dc/terms/"
>      xmlns:j.1="http://fise.iks-project.eu/ontology/" >
>    <rdf:Description
> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>      <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>      <j.1:extracted-from
> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>      <j.1:confidence
> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>      <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>      <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>      <j.0:language>en</j.0:language>
>      <j.0:created
> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>      <j.0:creator
> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>    </rdf:Description>
> </rdf:RDF>
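[Editor's note] The quoted output above is itself diagnostic: it contains only the TextAnnotation produced by the langdetect engine and nothing from the linking engine. A quick, standard-library-only way to list which engines contributed annotations to an enhancer RDF/XML response (namespaces as in the output above; the sample here is abbreviated):

```python
# Sketch: list the dct:creator and rdf:type of each enhancement in an
# enhancer RDF/XML response. Standard library only.
import xml.etree.ElementTree as ET

RDF = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}"
DCT = "{http://purl.org/dc/terms/}"

def summarize(rdf_xml):
    root = ET.fromstring(rdf_xml)
    out = []
    for desc in root.findall(RDF + "Description"):
        creator = desc.findtext(DCT + "creator")
        types = [t.get(RDF + "resource") for t in desc.findall(RDF + "type")]
        out.append((creator, types))
    return out

# Abbreviated sample modeled on the output quoted in this thread.
sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                     xmlns:dct="http://purl.org/dc/terms/">
  <rdf:Description rdf:about="urn:enhancement-1">
    <dct:creator>org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</dct:creator>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
  </rdf:Description>
</rdf:RDF>"""

for creator, types in summarize(sample):
    print(creator, types)
```

If only the language-detection engine appears as a creator, the linking engine either never ran in the chain or produced no suggestions, which narrows the debugging down considerably.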
>
>
>
> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
> sergio.fernandez@salzburgresearch.at> wrote:
>
>> As I said: have you check the proper noun detection and POS tagging in
>> your chain?
>>
>> For instance, enhancing the text "I studied at the University of Salzburg,
>> which is based in Austria" works at the demo server:
>>
>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>
>> Here the details:
>>
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>
>> Cheers,
>>
>>
>>
>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>
>>> Just to add to my previous email
>>>
>>> If i add another individual in my ontology "MyUniversity" under class
>>> University
>>>
>>>
>>>
>>>       <!--
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>
>>>
>>>       <owl:NamedIndividual rdf:about="
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>> ">
>>>           <rdf:type rdf:resource="
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
>>> "/>
>>>           <rdfs:label>MyUniversity</rdfs:label>
>>>       </owl:NamedIndividual>
>>>
>>>
>>> So with all configurations i have mentioned in the word document (in
>>> google
>>> drive folder), when i pass text with "MyUniversity" in it, my enhancement
>>> chain is able to extract "MyUniversity" and link it with "University" type
>>>
>>> But same set of configurations doesn't work with individual "University of
>>> Salzburg"
>>>
>>> If anyone of you please provide help on what are we missing to be able to
>>> extract custom entities which has space in between, will be a great help
>>> to
>>> proceed further on our journey with using and contributing to stanbol
>>>
>>> with best regards,
>>> tarandeep
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>> tsawhney@innodata.com> wrote:
>>>
>>>   Thanks Sergio and Dileepa for your responses
>>>>
>>>> We haven't been able to resolve the issue. We therefore decided to keep
>>>> just one class and one instance value "University of Salzburg" in our
>>>> custom ontology and try to extract this entity and also link it but we
>>>> could not get this running. I am sure we are missing some configurations.
>>>>
>>>> I am sharing a google drive folder at below link
>>>>
>>>> https://drive.google.com/**folderview?id=0B-**
>>>> vX9idwHlRtRFFOR000ZnBBOWM&usp=**sharing<https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing>
>>>>
>>>> This folder has 3 files:
>>>>
>>>> 1) A word document which shows felix snapshots of what all configurations
>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>> weighted
>>>> chain
>>>> 2) our custom ontology
>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>
>>>> May i request you all to please look at these files and let us know if we
>>>> are missing something in configurations.
>>>>
>>>> We have referred to below web links in order to configure stanbol for
>>>> using our custom ontology for entity extraction and linking
>>>>
>>>> http://stanbol.apache.org/**docs/trunk/customvocabulary.**html<http://stanbol.apache.org/docs/trunk/customvocabulary.html>
>>>> http://stanbol.apache.org/**docs/trunk/components/**
>>>> entityhub/managedsite<http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite>
>>>>
>>>> http://stanbol.apache.org/**docs/trunk/components/**enhancer/engines/**
>>>> entityhublinking<http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking>
>>>>
>>>> http://stanbol.apache.org/**docs/trunk/components/**
>>>> enhancer/chains/weightedchain.**html<http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html>
>>>>
>>>> Thanks in advance for your valuable help.
>>>>
>>>> Best regards
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>>>> wrote:
>>>>
>>>>   Hi,
>>>>>
>>>>> I'm not an expert on entity linking, but from my experience such
>>>>> behaviour could be caused by the proper noun detection. Further details
>>>>> at:
>>>>>
>>>>> http://stanbol.apache.org/****docs/trunk/components/**<http://stanbol.apache.org/**docs/trunk/components/**>
>>>>> enhancer/engines/**entitylinking<http://stanbol.**
>>>>> apache.org/docs/trunk/**components/enhancer/engines/**entitylinking<http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking>
>>>>>>
>>>>>
>>>>>
>>>>> In addition, I'd like to suggest you to take a look to the netiquette in
>>>>> mailing lists. This is an open source community; therefore messages
>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>> Friday
>>>>> afternoon, when people could be already out for weekend, or even on
>>>>> vacations.
>>>>>
>>>>> Best,
>>>>> Sergio
>>>>>
>>>>>
>>>>>
>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>
>>>>>   Hi,
>>>>>>
>>>>>> I am using stanbol to extract entitiies by plugging custom vocabulary
>>>>>> as
>>>>>> per http://stanbol.apache.org/****docs/trunk/customvocabulary.****html<http://stanbol.apache.org/**docs/trunk/customvocabulary.**html>
>>>>>> <http://stanbol.apache.**org/docs/trunk/**customvocabulary.html<http://stanbol.apache.org/docs/trunk/customvocabulary.html>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Following are the steps followed -
>>>>>>
>>>>>>     Configured Clerezza Yard.
>>>>>>     Configured Managed Yard site.
>>>>>>     Updated the site by plugging ontology(containing custom entities) .
>>>>>>     Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>> managed
>>>>>> site.
>>>>>>     Configured a customChain which uses following engine
>>>>>>
>>>>>>       -  *langdetect*
>>>>>>       - *opennlp-sentence*
>>>>>>       - *opennlp-token*
>>>>>>       - *opennlp-pos*
>>>>>>       - *opennlp-chunker*
>>>>>>       - *customLinkingEngine*
>>>>>>
>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>
>>>>>> However i am facing an issue in extracting entities which has space in
>>>>>> between. For example "Tommy Hilfiger".
>>>>>>
>>>>>> Chain like *dbpedia-disambiguation *(which comes bundeled with stanbol
>>>>>> instance) is rightly extracting entities like  "Tommy Hilfiger".
>>>>>>
>>>>>> I had tried configuring  *customLinkingEngine* same as *
>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* ) but
>>>>>> it
>>>>>> didn't work to extract above entity.
>>>>>>
>>>>>> I have invested more than a week now and running out of options now
>>>>>>
>>>>>> i request you to please provide help in resolving this issue
>>>>>>
>>>>>>
>>>>>>   --
>>>>> Sergio Fernández
>>>>> Salzburg Research
>>>>> +43 662 2288 318
>>>>> Jakob-Haringer Strasse 5/II
>>>>> A-5020 Salzburg (Austria)
>>>>> http://www.salzburgresearch.at
>>>>>
>>>>>
>>>>
>>>>
>>>
>> --
>> Sergio Fernández
>> Salzburg Research
>> +43 662 2288 318
>> Jakob-Haringer Strasse 5/II
>> A-5020 Salzburg (Austria)
>> http://www.salzburgresearch.at
>>
>

-- 
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at

Re: [URGENT] Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Sergio for your response.

What I understand is that I should enable the *"Link ProperNouns only"*
option in the entityhub linking engine and also use the "opennlp-pos" engine
in my weighted chain.

I made these changes but am still unable to extract "University of Salzburg".

Please find below the RDF/XML output from the enhancer.

Please let me know if I did not understand your inputs correctly.

One more thing: in our ontology (yet to be built) we will have entities
other than people, places and organisations. For example, belts, bags etc.

best regards
tarandeep

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:j.0="http://purl.org/dc/terms/"
    xmlns:j.1="http://fise.iks-project.eu/ontology/" >
  <rdf:Description
rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
    <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
    <j.1:extracted-from
rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
    <j.1:confidence
rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
    <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
    <j.0:language>en</j.0:language>
    <j.0:created
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
    <j.0:creator
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
  </rdf:Description>
</rdf:RDF>
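Note that the output above contains only the langdetect TextAnnotation and no fise:EntityAnnotation, i.e. the linking engine produced no suggestions at all. A minimal sketch (Python standard library only; the RDF string stands for the enhancer response) for checking that programmatically:

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FISE = "http://fise.iks-project.eu/ontology/"

def annotation_types(rdf_xml):
    """Collect every rdf:type URI asserted in an enhancer RDF/XML response."""
    types = set()
    for desc in ET.fromstring(rdf_xml).findall(f"{{{RDF}}}Description"):
        for t in desc.findall(f"{{{RDF}}}type"):
            uri = t.get(f"{{{RDF}}}resource")
            if uri:
                types.add(uri)
    return types

def has_entity_annotations(rdf_xml):
    """True when at least one entity was linked (fise:EntityAnnotation present)."""
    return FISE + "EntityAnnotation" in annotation_types(rdf_xml)
```

Run against the response above this returns False; a working chain should produce at least one EntityAnnotation per linked entity.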




-- 

"This e-mail and any attachments transmitted with it are for the sole use 
of the intended recipient(s) and may contain confidential , proprietary or 
privileged information. If you are not the intended recipient, please 
contact the sender by reply e-mail and destroy all copies of the original 
message. Any unauthorized review, use, disclosure, dissemination, 
forwarding, printing or copying of this e-mail or any action taken in 
reliance on this e-mail is strictly prohibited and may be unlawful."

Re: [URGENT] Working with custom vocabulary

Posted by Sergio Fernández <se...@salzburgresearch.at>.
As I said: have you checked the proper noun detection and POS tagging in 
your chain?

For instance, enhancing the text "I studied at the University of 
Salzburg, which is based in Austria" works at the demo server:

http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun

Here the details:

http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
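For reference, the option discussed there maps to an engine property that can also be set directly in the engine's OSGi configuration (a sketch; the property name is taken from the documentation anchor above, and the exact value syntax may vary by Stanbol version):

```
# EntityhubLinkingEngine configuration (Apache Felix console)
# corresponds to the "Link ProperNouns only" checkbox
enhancer.engines.linking.properNounsState=true
```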

Cheers,



-- 
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at

Re: [URGENT] Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Just to add to my previous email:

If I add another individual, "MyUniversity", under the class University in
my ontology



    <!--
http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->

    <owl:NamedIndividual rdf:about="
http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
">
        <rdf:type rdf:resource="
http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
"/>
        <rdfs:label>MyUniversity</rdfs:label>
    </owl:NamedIndividual>
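For comparison, a multi-word individual would carry the full name in its rdfs:label (a hypothetical fragment in the same ontology namespace; the linking engine matches such labels token by token, so every token of the mention must be accepted by the POS/proper-noun filter):

```xml
<owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University_of_Salzburg">
    <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
    <rdfs:label>University of Salzburg</rdfs:label>
</owl:NamedIndividual>
```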


So, with all the configurations I mentioned in the word document (in the
Google Drive folder), when I pass text containing "MyUniversity", my
enhancement chain is able to extract "MyUniversity" and link it to the
"University" type.

But the same configuration doesn't work for the individual "University of
Salzburg".

If anyone could tell us what we are missing to extract custom entities whose
labels contain spaces, it would be a great help and let us proceed further
on our journey of using and contributing to Stanbol.

with best regards,
tarandeep





Re: [URGENT] Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Sergio and Dileepa for your responses.

We haven't been able to resolve the issue. We therefore decided to keep just
one class with one individual, "University of Salzburg", in our custom
ontology and to try to extract and link this entity, but we could not get it
running. I am sure we are missing some configuration.

I am sharing a Google Drive folder at the link below:
https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing

This folder has 3 files:

1) A Word document with Felix screenshots of all the configuration we did
while setting up the Yard, the Yard site, the entity linking engine and the
weighted chain
2) our custom ontology
3) the result of a SPARQL query against our graph URI using the SPARQL
endpoint

May I request you all to please look at these files and let us know if we
are missing something in the configuration.

We have referred to the links below in order to configure Stanbol to use
our custom ontology for entity extraction and linking:

http://stanbol.apache.org/docs/trunk/customvocabulary.html
http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
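For what it's worth, the ontology-upload step described in the first two links reduces to an HTTP PUT of the RDF/XML file against the managed site's /entity endpoint. A minimal sketch with the Python standard library (the site name customSite and the localhost base URL are assumptions; the request is only built here, not sent):

```python
import urllib.request

def build_upload_request(site, ontology_path, base="http://localhost:8080"):
    """Build a PUT request that imports an RDF/XML file into a managed site."""
    with open(ontology_path, "rb") as f:
        data = f.read()
    url = f"{base}/entityhub/site/{site}/entity?update=true"
    req = urllib.request.Request(url, data=data, method="PUT")
    req.add_header("Content-Type", "application/rdf+xml")
    return req

# To actually send it:
# urllib.request.urlopen(build_upload_request("customSite", "ontology.rdf"))
```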

Thanks in advance for your valuable help.

Best regards
tarandeep





Re: [URGENT] Working with custom vocabulary

Posted by Sergio Fernández <se...@salzburgresearch.at>.
Hi,

I'm not an expert on entity linking, but from my experience such
behaviour could be caused by the proper noun detection. Further details at:

http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking

In addition, I'd like to suggest you take a look at mailing list netiquette.
This is an open source community; therefore messages starting with "URGENT"
are not very polite, especially when sent on a Friday afternoon, when people
may already be out for the weekend, or even on vacation.

Best,
Sergio


On 12/07/13 15:54, Sethi, Keval Krishna wrote:
> Hi,
>
> I am using Stanbol to extract entities by plugging in a custom vocabulary
> as per http://stanbol.apache.org/docs/trunk/customvocabulary.html
>
> Following are the steps I followed:
>
>   Configured a Clerezza Yard.
>   Configured a Managed Yard site.
>   Updated the site by plugging in an ontology (containing custom entities).
>   Configured an Entityhub Linking Engine (*customLinkingEngine*) with the
> managed site.
>   Configured a customChain which uses the following engines:
>
>     - *langdetect*
>     - *opennlp-sentence*
>     - *opennlp-token*
>     - *opennlp-pos*
>     - *opennlp-chunker*
>     - *customLinkingEngine*
>
> Now, I am able to extract entities like Adidas using *customChain*.
>
> However, I am facing an issue extracting entities that contain a space,
> for example "Tommy Hilfiger".
>
> Chains like *dbpedia-disambiguation* (which comes bundled with the Stanbol
> instance) correctly extract entities like "Tommy Hilfiger".
>
> I tried configuring *customLinkingEngine* the same as
> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but it
> did not extract the above entity.
>
> I have invested more than a week and am running out of options.
>
> I request your help in resolving this issue.
>

-- 
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at

Re: [URGENT] Working with custom vocabulary

Posted by "Sethi, Keval Krishna" <ks...@innodata.com>.
Hi Dileepa,

Thanks for replying.

Yes, I have "Tommy Hilfiger" as an entity in my configured managed site.
To confirm this, I queried it through the SPARQL endpoint. Following are
the query and its result:

Query
select * {<http://demo.com#Tommy_Hilfiger> ?p ?o}

Result
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
      </binding>
      <binding name="o">
        <uri>http://www.w3.org/2002/07/owl#NamedIndividual</uri>
      </binding>
    </result>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
      </binding>
      <binding name="o">
        <uri>http://demo.com#Designer_Brands</uri>
      </binding>
    </result>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#label</uri>
      </binding>
      <binding name="o">
        <literal>Tommy Hilfiger</literal>
      </binding>
    </result>
  </results>
</sparql>
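As a quick cross-check, the result above can be parsed with the Python standard library to confirm that the rdfs:label binding really carries "Tommy Hilfiger". The XML is abbreviated here to the label result for brevity:

```python
# Quick sanity check of the SPARQL result above: parse it with the standard
# library and confirm the rdfs:label binding (XML abbreviated to one result).
import xml.etree.ElementTree as ET

NS = "{http://www.w3.org/2005/sparql-results#}"

xml_text = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <results>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#label</uri>
      </binding>
      <binding name="o">
        <literal>Tommy Hilfiger</literal>
      </binding>
    </result>
  </results>
</sparql>"""

def extract_labels(xml_result):
    """Return literal values bound to ?o wherever ?p is rdfs:label."""
    labels = []
    for result in ET.fromstring(xml_result).iter(NS + "result"):
        bindings = {b.get("name"): b for b in result.iter(NS + "binding")}
        p_uri = bindings["p"].find(NS + "uri")
        if p_uri is not None and p_uri.text.endswith("rdf-schema#label"):
            labels.append(bindings["o"].find(NS + "literal").text)
    return labels

print(extract_labels(xml_text))
# → ['Tommy Hilfiger']
```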

I had not done any custom configuration while creating the managed site; I
just followed the referenced documentation
(http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite)
and configured a Clerezza Yard site.

Please suggest if I am missing something.

On Sat, Jul 13, 2013 at 11:57 AM, Dileepa Jayakody <
dileepajayakody@gmail.com> wrote:

> Hi Sethi,
>
> I'm also quite a newbie to Stanbol. I configured a new site (for a FOAF
> dataset), an Entityhub linking engine, and a custom chain to use the
> engine, and was able to extract entities from my Entityhub.
> These are the engines I chained in my weighted enhancement chain:
> langdetect, opennlp-sentence, opennlp-token, opennlp-pos,
> foaf-site-linking (my custom engine).
>
> If you are only using your new site (referenced site) in your engine, you
> will detect entities from that site only during entity linking. Are you
> sure the entity identified by "Tommy Hilfiger" is available in your site?
> Did you do any custom configuration when you created your referenced site?
>
> Thanks,
> Dileepa
>
>
> On Sat, Jul 13, 2013 at 7:28 AM, Sawhney, Tarandeep Singh <
> tsawhney@innodata.com> wrote:
>
> > A polite reminder to the Stanbol dev community:
> >
> > Can anyone please provide some pointers to resolve the issue below with
> > entity extraction using a custom ontology with Stanbol?
> >
> > Please let us know if more information is required to understand what
> > we are doing so you can suggest a solution.
> >
> > Best regards
> > tarandeep
> >
> >
> > On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
> > <ks...@innodata.com> wrote:
> >
> > > Hi,
> > >
> > > I am using Stanbol to extract entities by plugging in a custom
> > > vocabulary as per
> > > http://stanbol.apache.org/docs/trunk/customvocabulary.html
> > >
> > > Following are the steps I followed:
> > >
> > >  Configured a Clerezza Yard.
> > >  Configured a Managed Yard site.
> > >  Updated the site by plugging in an ontology (containing custom
> > >  entities).
> > >  Configured an Entityhub Linking Engine (*customLinkingEngine*) with
> > >  the managed site.
> > >  Configured a customChain which uses the following engines:
> > >
> > >    - *langdetect*
> > >    - *opennlp-sentence*
> > >    - *opennlp-token*
> > >    - *opennlp-pos*
> > >    - *opennlp-chunker*
> > >    - *customLinkingEngine*
> > >
> > > Now, I am able to extract entities like Adidas using *customChain*.
> > >
> > > However, I am facing an issue extracting entities that contain a
> > > space, for example "Tommy Hilfiger".
> > >
> > > Chains like *dbpedia-disambiguation* (which comes bundled with the
> > > Stanbol instance) correctly extract entities like "Tommy Hilfiger".
> > >
> > > I tried configuring *customLinkingEngine* the same as
> > > *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
> > > but it did not extract the above entity.
> > >
> > > I have invested more than a week and am running out of options.
> > >
> > > I request your help in resolving this issue.
> > > --
> > > Regards,
> > > Keval Sethi
> > >
>



-- 
Regards,
Keval Sethi


Re: [URGENT] Working with custom vocabulary

Posted by Dileepa Jayakody <di...@gmail.com>.
Hi Sethi,

I'm also quite a newbie to Stanbol. I configured a new site (for a FOAF
dataset), an Entityhub linking engine, and a custom chain to use the
engine, and was able to extract entities from my Entityhub.
These are the engines I chained in my weighted enhancement chain:
langdetect, opennlp-sentence, opennlp-token, opennlp-pos,
foaf-site-linking (my custom engine).

If you are only using your new site (referenced site) in your engine, you
will detect entities from that site only during entity linking. Are you
sure the entity identified by "Tommy Hilfiger" is available in your site?
Did you do any custom configuration when you created your referenced site?
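One way to double-check that is to query the site's Entityhub "find" endpoint, which searches entities by label. In the hedged sketch below, the host and the site id "customSite" are placeholders to adjust for your instance, and the endpoint path follows the Entityhub RESTful API documentation; the code only builds the request so it can be inspected before sending:

```python
# Hedged sketch: check that the label is findable through the site's
# Entityhub "find" endpoint. The host and the site id "customSite" are
# placeholders; adjust both before actually sending the request.
from urllib.parse import urlencode
from urllib.request import Request

def build_find_request(base_url, site_id, name, limit=5):
    """Build a POST request for /entityhub/site/<siteId>/find (label search)."""
    body = urlencode({"name": name, "limit": limit}).encode("ascii")
    url = "%s/entityhub/site/%s/find" % (base_url.rstrip("/"), site_id)
    # Passing data makes urllib issue a POST with the form-encoded body.
    return Request(url, data=body, headers={"Accept": "application/json"})

req = build_find_request("http://localhost:8080", "customSite", "Tommy Hilfiger")
print(req.get_method(), req.full_url)
# → POST http://localhost:8080/entityhub/site/customSite/find
```

If the find request returns no result for the exact label, the linking engine cannot match it either, which would narrow the problem down to the site rather than the chain.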

Thanks,
Dileepa


On Sat, Jul 13, 2013 at 7:28 AM, Sawhney, Tarandeep Singh <
tsawhney@innodata.com> wrote:

> A polite reminder to the Stanbol dev community:
>
> Can anyone please provide some pointers to resolve the issue below with
> entity extraction using a custom ontology with Stanbol?
>
> Please let us know if more information is required to understand what we
> are doing so you can suggest a solution.
>
> Best regards
> tarandeep
>
>
> On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
> <ks...@innodata.com> wrote:
>
> > Hi,
> >
> > I am using Stanbol to extract entities by plugging in a custom
> > vocabulary as per
> > http://stanbol.apache.org/docs/trunk/customvocabulary.html
> >
> > Following are the steps I followed:
> >
> >  Configured a Clerezza Yard.
> >  Configured a Managed Yard site.
> >  Updated the site by plugging in an ontology (containing custom
> >  entities).
> >  Configured an Entityhub Linking Engine (*customLinkingEngine*) with
> >  the managed site.
> >  Configured a customChain which uses the following engines:
> >
> >    - *langdetect*
> >    - *opennlp-sentence*
> >    - *opennlp-token*
> >    - *opennlp-pos*
> >    - *opennlp-chunker*
> >    - *customLinkingEngine*
> >
> > Now, I am able to extract entities like Adidas using *customChain*.
> >
> > However, I am facing an issue extracting entities that contain a space,
> > for example "Tommy Hilfiger".
> >
> > Chains like *dbpedia-disambiguation* (which comes bundled with the
> > Stanbol instance) correctly extract entities like "Tommy Hilfiger".
> >
> > I tried configuring *customLinkingEngine* the same as
> > *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but
> > it did not extract the above entity.
> >
> > I have invested more than a week and am running out of options.
> >
> > I request your help in resolving this issue.
> >
> > --
> > Regards,
> > Keval Sethi
> >
>

Re: [URGENT] Working with custom vocabulary

Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
A polite reminder to the Stanbol dev community:

Can anyone please provide some pointers to resolve the issue below with
entity extraction using a custom ontology with Stanbol?

Please let us know if more information is required to understand what we
are doing so you can suggest a solution.

Best regards
tarandeep


On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
<ks...@innodata.com> wrote:

> Hi,
>
> I am using Stanbol to extract entities by plugging in a custom
> vocabulary as per
> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>
> Following are the steps I followed:
>
>  Configured a Clerezza Yard.
>  Configured a Managed Yard site.
>  Updated the site by plugging in an ontology (containing custom
>  entities).
>  Configured an Entityhub Linking Engine (*customLinkingEngine*) with
>  the managed site.
>  Configured a customChain which uses the following engines:
>
>    - *langdetect*
>    - *opennlp-sentence*
>    - *opennlp-token*
>    - *opennlp-pos*
>    - *opennlp-chunker*
>    - *customLinkingEngine*
>
> Now, I am able to extract entities like Adidas using *customChain*.
>
> However, I am facing an issue extracting entities that contain a space,
> for example "Tommy Hilfiger".
>
> Chains like *dbpedia-disambiguation* (which comes bundled with the
> Stanbol instance) correctly extract entities like "Tommy Hilfiger".
>
> I tried configuring *customLinkingEngine* the same as
> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but
> it did not extract the above entity.
>
> I have invested more than a week and am running out of options.
>
> I request your help in resolving this issue.
>
> --
> Regards,
> Keval Sethi
>