Posted to dev@stanbol.apache.org by "Sethi, Keval Krishna" <ks...@innodata.com> on 2013/07/12 15:54:13 UTC
[URGENT] Working with custom vocabulary
Hi,
I am using Stanbol to extract entities by plugging in a custom vocabulary as
per http://stanbol.apache.org/docs/trunk/customvocabulary.html
These are the steps I followed:
Configured a Clerezza Yard.
Configured a Managed Yard site.
Updated the site by uploading the ontology (containing the custom entities).
Configured an Entityhub Linking Engine (*customLinkingEngine*) with the
managed site.
Configured a *customChain* which uses the following engines:
- *langdetect*
- *opennlp-sentence*
- *opennlp-token*
- *opennlp-pos*
- *opennlp-chunker*
- *customLinkingEngine*
Now, I am able to extract entities like Adidas using *customChain*.
However, I am facing an issue extracting entities that have a space in
between, for example "Tommy Hilfiger".
A chain like *dbpedia-disambiguation* (which comes bundled with the Stanbol
instance) correctly extracts entities like "Tommy Hilfiger".
I tried configuring *customLinkingEngine* the same as
*dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but it
did not extract the above entity.
I have invested more than a week now and am running out of options, so
I request your help in resolving this issue.
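For reference, a chain like the one above is invoked through the Enhancer's
RESTful interface by POSTing plain text to /enhancer/chain/{name}. The snippet
below is a minimal sketch only; the host, port, and sample text are
assumptions, and `build_enhance_request` is a made-up helper name:

```python
# Minimal sketch of building a request against the Stanbol Enhancer
# REST interface. Host/port and the sample text are assumptions; the
# /enhancer/chain/{name} path follows the Enhancer documentation.
import urllib.request

def build_enhance_request(text, chain="customChain",
                          base="http://localhost:8080"):
    """Build (but do not send) a POST request for an enhancement chain."""
    return urllib.request.Request(
        f"{base}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain",
                 "Accept": "application/rdf+xml"},
        method="POST",
    )

req = build_enhance_request("Tommy Hilfiger is a fashion brand.")
# Send with urllib.request.urlopen(req) against a running Stanbol
# instance to get the enhancement results back as RDF/XML.
```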
--
Regards,
Keval Sethi
Re: Working with custom vocabulary
Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi all,
Sorry I was offline the whole last week, otherwise I would have
answered earlier.
As Rafa already pointed out, the issue was caused by the ClerezzaYard
not returning multi-word literals for queries on a single word. IMO
this is not a bug, but rather an issue caused by SPARQL not supporting
proper full-text search features.
EntityLinking works on the "word" level. Therefore, if a text mentions
"University of Salzburg", it will create a query such as "university"
OR "salzburg". This is translated to SPARQL as a UNION over
rdfs:label values. However, as you might know, a SPARQL endpoint will
not answer such a query with an entity that has the rdfs:label
"University of Salzburg".
For LARQ and Virtuoso, Stanbol is able to use their specific full-text
extensions. In such cases queries like the above may provide the
expected results, but for the ClerezzaYard this is not possible (this
might change with the introduction of FastLane).
In any case: users that plan to use a ManagedSite for
EntityLinking are strongly advised to use the SolrYard
implementation!
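The decomposition described above can be illustrated with a small sketch.
This is not Stanbol's actual code; the function name `to_sparql_union` and
the stopword heuristic are made up for illustration:

```python
# Illustrative sketch (not Stanbol's actual implementation) of how
# word-level EntityLinking turns a multi-word mention into per-word
# exact-label queries, and why a plain SPARQL translation misses
# multi-word labels such as "University of Salzburg".

def to_sparql_union(mention: str) -> str:
    """Build a UNION query with one exact rdfs:label pattern per word."""
    # Short tokens like "of" are skipped, as stopwords would be.
    words = [w.lower() for w in mention.split() if len(w) > 2]
    patterns = ['{ ?entity rdfs:label "%s"@en }' % w for w in words]
    return "SELECT ?entity WHERE { " + " UNION ".join(patterns) + " }"

# The generated query asks for entities labelled exactly "university"
# or "salzburg"; without full-text support the store cannot match an
# entity whose only label is "University of Salzburg".
print(to_sparql_union("University of Salzburg"))
```

A yard with full-text support (e.g. the SolrYard) can instead match each
word anywhere inside a label, which is why it handles such entities.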
best
Rupert
On Tue, Jul 16, 2013 at 2:31 PM, Sawhney, Tarandeep Singh
<ts...@innodata.com> wrote:
> Sure, Rafa. I will add the new issue in Jira with the details.
>
> best regards
> tarandeep
> On Jul 16, 2013 5:59 PM, "Rafa Haro" <rh...@zaizi.com> wrote:
>
>> Hi Tarandeep,
>>
>> Happy to hear you finally solved your problem. Could you please add a new
>> issue in the Stanbol Jira explaining the error with your ClerezzaYard site?
>>
>> Thanks
>>
>> On 16/07/13 13:38, Sawhney, Tarandeep Singh wrote:
>>
>>> Hi Rafa
>>>
>>> I tried using SolrYard and it worked :-) So there seems to be a defect in
>>> ClerezzaYard
>>>
>>> thanks so much for pointing that out
>>>
>>> Do you have any information on when the next version of Stanbol is
>>> planned to
>>> be released and what will be covered in that release (feature/bug list,
>>> etc.)?
>>>
>>> Also, can I get some information on the Stanbol roadmap ahead?
>>>
>>> thanks again for your help
>>>
>>> best regards
>>> tarandeep
>>>
>>>
>>> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
>>> tsawhney@innodata.com> wrote:
>>>
>>>> Thanks Rafa for your response.
>>>>
>>>> I will try resolving this issue based on the pointers you have provided
>>>> and will post an update accordingly.
>>>>
>>>> Best regards
>>>> tarandeep
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>>>
>>>> Hi Tarandeep,
>>>>>
>>>>> As Sergio already pointed out, you can check some different Entity
>>>>> Linking
>>>>> engine configurations at the IKS development server:
>>>>> http://dev.iks-project.eu:8081/enhancer/chain .
>>>>> You can try to use the same configuration as some of the chains
>>>>> registered
>>>>> in this Stanbol instance. For that, just go through the Felix Console
>>>>> ( http://dev.iks-project.eu:8081/system/console/configMgr/ )
>>>>> and take a look at the different EntityHubLinkingEngine
>>>>> configurations. You can also try to use a Keyword Linking engine
>>>>> instead of
>>>>> an EntityHub Linking engine.
>>>>>
>>>>> Anyway, all the sites configured on this server are SolrYard based, so
>>>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>>>> multi-word entity labels. We might need debug log messages in
>>>>> order to find out the problem.
>>>>>
>>>>> Regards
>>>>>
>>>>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>> Hi Sergio
>>>>>>
>>>>>> This is exactly what I did, as I mentioned in my last email:
>>>>>>
>>>>>> *"What i understand is to enable option "Link ProperNouns only" in
>>>>>>
>>>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>>>> chain"
>>>>>> *
>>>>>>
>>>>>>
>>>>>> I have already checked this option in my own Entityhub linking engine.
>>>>>>
>>>>>> By the way, did you get a chance to look at the files I have shared in
>>>>>> the Google
>>>>>> Drive folder? Did you notice any problems there?
>>>>>>
>>>>>> I think using a custom ontology with Stanbol should be a very common
>>>>>> use case,
>>>>>> and if there are issues getting it working, either I am doing something
>>>>>> terribly wrong or there are other reasons which I don't know.
>>>>>>
>>>>>> Anyway, I am persisting in solving this issue, and any help on this
>>>>>> from
>>>>>> this dev community will be much appreciated.
>>>>>>
>>>>>> best regards
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>
>>>>>> http://{stanbol}/system/console/configMgr sorry
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>>>
>>>>>>> Have you checked the following?
>>>>>>>
>>>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>>>
>>>>>>>>
>>>>>>>> 2) find your EntityHub Linking engine
>>>>>>>>
>>>>>>>> 3) and then "Link ProperNouns only"
>>>>>>>>
>>>>>>>> The documentation in that configuration is quite useful I think:
>>>>>>>>
>>>>>>>> "If activated only ProperNouns will be matched against the
>>>>>>>> Vocabulary.
>>>>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>>>>> requires a tag of the POS TagSet to be mapped against
>>>>>>>> 'olia:ProperNoun'.
>>>>>>>> Otherwise mapping will not work as expected.
>>>>>>>> (enhancer.engines.linking.properNounsState)"
>>>>>>>>
>>>>>>>>
>>>>>>>> Hope this helps. You have to take into account that such issues are
>>>>>>>> not easy to solve by email.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>>>
>>>>>>>> Thanks Sergio for your response
>>>>>>>>
>>>>>>>>> What I understand is to enable the option *"Link ProperNouns only"*
>>>>>>>>> in the
>>>>>>>>> Entityhub linking engine and also to use the "opennlp-pos" engine in
>>>>>>>>> my weighted
>>>>>>>>> chain.
>>>>>>>>>
>>>>>>>>> I did these changes but am unable to extract "University of Salzburg".
>>>>>>>>>
>>>>>>>>> Please find below the output RDF/XML from enhancer
>>>>>>>>>
>>>>>>>>> Please let me know if I did not understand your inputs
>>>>>>>>> correctly.
>>>>>>>>>
>>>>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>>>>> entities
>>>>>>>>> other than people, places and organisations; for example,
>>>>>>>>> belts,
>>>>>>>>> bags, etc.
>>>>>>>>>
>>>>>>>>> best regards
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>> <rdf:RDF
>>>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>>>>   <rdf:Description
>>>>>>>>>       rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>>>>     <j.1:extracted-from
>>>>>>>>>         rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>>>>     <j.1:confidence
>>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>>>>     <rdf:type
>>>>>>>>>         rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>>>>     <rdf:type
>>>>>>>>>         rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>>>>     <j.0:language>en</j.0:language>
>>>>>>>>>     <j.0:created
>>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>>>>     <j.0:creator
>>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>>>>   </rdf:Description>
>>>>>>>>> </rdf:RDF>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>
>>>>>>>>>> As I said: have you checked the proper noun detection and POS
>>>>>>>>>> tagging in
>>>>>>>>>> your chain?
>>>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>>>> Salzburg,
>>>>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>>>>
>>>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>>> Here are the details:
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>>>> Cheers,
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>>>
>>>>>>>>>>> Just to add to my previous email:
>>>>>>>>>>>
>>>>>>>>>>> If I add another individual in my ontology, "MyUniversity", under
>>>>>>>>>>> class
>>>>>>>>>>> University
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>>>>
>>>>>>>>>>> <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>>>>     <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>>>     <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>>>> </owl:NamedIndividual>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> So, with all the configurations I have mentioned in the word
>>>>>>>>>>> document (in the Google
>>>>>>>>>>> Drive folder), when I pass text with "MyUniversity" in it, my
>>>>>>>>>>> enhancement
>>>>>>>>>>> chain is able to extract "MyUniversity" and link it with the
>>>>>>>>>>> "University" type.
>>>>>>>>>>>
>>>>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>>>>> "University of
>>>>>>>>>>> Salzburg".
>>>>>>>>>>>
>>>>>>>>>>> If any of you could help with what we are missing to be able to
>>>>>>>>>>> extract custom entities that have a space in between, it would be a
>>>>>>>>>>> great help
>>>>>>>>>>> in proceeding further on our journey of using and contributing to
>>>>>>>>>>> Stanbol.
>>>>>>>>>>> with best regards,
>>>>>>>>>>> tarandeep
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks Sergio and Dileepa for your responses.
>>>>>>>>>>>>
>>>>>>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>>>>> keep
>>>>>>>>>>>> just one class and one instance value, "University of Salzburg",
>>>>>>>>>>>> in our
>>>>>>>>>>>> custom ontology and try to extract this entity and also link it,
>>>>>>>>>>>> but we
>>>>>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>>>>>> configurations.
>>>>>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>>>>>
>>>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>>>
>>>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>>>
>>>>>>>>>>>> 1) A word document with Felix snapshots of all the configurations
>>>>>>>>>>>> we made while configuring the Yard, yard site, entity linking
>>>>>>>>>>>> engine and
>>>>>>>>>>>> weighted chain
>>>>>>>>>>>> 2) our custom ontology
>>>>>>>>>>>> 3) the result of a SPARQL query against our graph URI using the
>>>>>>>>>>>> SPARQL endpoint
>>>>>>>>>>>>
>>>>>>>>>>>> May I request you all to please look at these files and let us
>>>>>>>>>>>> know
>>>>>>>>>>>> if we
>>>>>>>>>>>> are missing something in the configuration.
>>>>>>>>>>>>
>>>>>>>>>>>> We referred to the web links below in order to configure Stanbol
>>>>>>>>>>>> to use our custom ontology for entity extraction and linking:
>>>>>>>>>>>>
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>>>
>>>>>>>>>>>> Best regards
>>>>>>>>>>>> tarandeep
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>>>> details at:
>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>>>> In addition, I'd like to suggest you take a look at the
>>>>>>>>>>>>> netiquette of
>>>>>>>>>>>>> mailing lists. This is an open source community; therefore,
>>>>>>>>>>>>> messages
>>>>>>>>>>>>> starting with "URGENT" are not very polite, especially when sent
>>>>>>>>>>>>> on a
>>>>>>>>>>>>> Friday
>>>>>>>>>>>>> afternoon, when people could already be out for the weekend, or
>>>>>>>>>>>>> even on
>>>>>>>>>>>>> vacation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Sergio
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>>>>> vocabulary as per
>>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Following are the steps I followed:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Configured a Clerezza Yard.
>>>>>>>>>>>>>> Configured a Managed Yard site.
>>>>>>>>>>>>>> Updated the site by plugging in an ontology (containing custom
>>>>>>>>>>>>>> entities).
>>>>>>>>>>>>>> Configured an Entityhub Linking Engine (*customLinkingEngine*)
>>>>>>>>>>>>>> with the managed site.
>>>>>>>>>>>>>> Configured a customChain which uses the following engines:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - *langdetect*
>>>>>>>>>>>>>> - *opennlp-sentence*
>>>>>>>>>>>>>> - *opennlp-token*
>>>>>>>>>>>>>> - *opennlp-pos*
>>>>>>>>>>>>>> - *opennlp-chunker*
>>>>>>>>>>>>>> - *customLinkingEngine*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now, I am able to extract entities like Adidas using
>>>>>>>>>>>>>> *customChain*.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> However, I am facing an issue extracting entities which have a
>>>>>>>>>>>>>> space in between, for example "Tommy Hilfiger".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>>>>> the Stanbol instance) correctly extracts entities like "Tommy
>>>>>>>>>>>>>> Hilfiger".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I had tried configuring *customLinkingEngine* the same as
>>>>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>>>>> but it didn't extract the above entity.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have invested more than a week and am running out of options
>>>>>>>>>>>>>> now. I request you to please provide help in resolving this
>>>>>>>>>>>>>> issue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> Sergio Fernández
>>>>>>>>>>>>> Salzburg Research
>>>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>> --
>>>>>
>>>>> ------------------------------
>>>>> This message should be regarded as confidential. If you have received
>>>>> this email in error please notify the sender and destroy it immediately.
>>>>> Statements of intent shall only become binding when confirmed in hard
>>>>> copy
>>>>> by an authorised signatory.
>>>>>
>>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>>> number
>>>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>>>> London W6 7AN.
>>>>>
>>>>
>>>>
>>>>
>>
>
> --
>
> "This e-mail and any attachments transmitted with it are for the sole use
> of the intended recipient(s) and may contain confidential , proprietary or
> privileged information. If you are not the intended recipient, please
> contact the sender by reply e-mail and destroy all copies of the original
> message. Any unauthorized review, use, disclosure, dissemination,
> forwarding, printing or copying of this e-mail or any action taken in
> reliance on this e-mail is strictly prohibited and may be unlawful."
--
| Rupert Westenthaler rupert.westenthaler@gmail.com
| Bodenlehenstraße 11 ++43-699-11108907
| A-5500 Bischofshofen
Re: Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Sure Rafa. I will add the new issue in Jira with the details.
best regards
tarandeep
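
[Editor's sketch for readers of this archive: the enhancement requests discussed
throughout this thread are plain HTTP POSTs against Stanbol's RESTful enhancer.
A minimal illustration in Python follows; the base URL and the chain name
"customChain" are assumptions taken from this thread, not verified values for
any particular installation.]

```python
# Sketch only: build a POST request for a Stanbol enhancer chain.
# STANBOL_BASE and CHAIN are assumptions from this thread, not verified
# values for a specific installation.
from urllib.request import Request

STANBOL_BASE = "http://localhost:8080"  # assumed default Stanbol port
CHAIN = "customChain"                   # chain name used in this thread

def build_enhance_request(text: str) -> Request:
    """Build (but do not send) an enhancement request for the given text."""
    url = f"{STANBOL_BASE}/enhancer/chain/{CHAIN}"
    return Request(
        url,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/rdf+xml",  # RDF/XML enhancement results
        },
        method="POST",
    )

req = build_enhance_request("I bought a Tommy Hilfiger belt.")
print(req.full_url)
```

Sending the request with urllib.request.urlopen(req) would return the
enhancement graph, similar to the RDF/XML output quoted later in this thread.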
On Jul 16, 2013 5:59 PM, "Rafa Haro" <rh...@zaizi.com> wrote:
> Hi Tarandeep,
>
> Happy to hear you finally solved your problem. Could you please add a new
> issue in the Stanbol Jira explaining the error with your ClerezzaYard site?
>
> Thanks
>
> On 16/07/13 13:38, Sawhney, Tarandeep Singh wrote:
>
>> Hi Rafa
>>
>> I tried using the SolrYard and it worked :-) So there seems to be a defect
>> in the ClerezzaYard.
>>
>> Thanks so much for pointing that out.
>>
>> Do you have any information on when the new version of Stanbol is planned
>> to be released and what will be covered in that release (feature/bug list,
>> etc.)?
>>
>> Also, can I get some information on the Stanbol roadmap ahead?
>>
>> Thanks again for your help.
>>
>> best regards
>> tarandeep
>>
>>
>> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
>> tsawhney@innodata.com> wrote:
>>
>> Thanks Rafa for your response
>>>
>>> I will try resolving this issue based on the pointers you have provided
>>> and will post an update accordingly.
>>>
>>> Best regards
>>> tarandeep
>>>
>>>
>>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>>
>>> Hi Tarandeep,
>>>>
>>>> As Sergio already pointed out, you can check some different Entity
>>>> Linking engine configurations at the IKS development server:
>>>> http://dev.iks-project.eu:8081/enhancer/chain
>>>> You can try to use the same configuration as some of the chains
>>>> registered in this Stanbol instance. For that, just go through the
>>>> Felix Console (http://dev.iks-project.eu:8081/system/console/configMgr/)
>>>> and take a look at the different EntityhubLinkingEngine configurations.
>>>> You can also try to use a Keyword Linking engine instead of an
>>>> EntityHub Linking engine.
>>>>
>>>> Anyway, all the sites configured in this server are SolrYard based, so
>>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>>> multi-word entities' labels. We might need debug log messages in order
>>>> to find out the problem.
>>>>
>>>> Regards
>>>>
>>>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>>>
>>>> Hi Sergio
>>>>>
>>>>> This is exactly what I did, as I mentioned in my last email:
>>>>>
>>>>> *"What i understand is to enable option "Link ProperNouns only" in
>>>>>
>>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>>> chain"
>>>>> *
>>>>>
>>>>>
>>>>> I have already checked this option in my own Entityhub linking engine.
>>>>>
>>>>> By the way, did you get a chance to look at the files I have shared in
>>>>> the Google Drive folder? Did you notice any problems there?
>>>>>
>>>>> I think using a custom ontology with Stanbol should be a very common
>>>>> use case, and if there are issues getting it working, either I am doing
>>>>> something terribly wrong or there are other reasons which I don't know.
>>>>>
>>>>> But anyway, I am persisting in trying to solve this issue, and any help
>>>>> on this from the dev community will be much appreciated.
>>>>>
>>>>> best regards
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>
>>>>> http://{stanbol}/system/console/configMgr sorry
>>>>>
>>>>>>
>>>>>>
>>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>>
>>>>>> Have you checked the following?
>>>>>>
>>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>>
>>>>>>>
>>>>>>> 2) find your EntityHub Linking engine
>>>>>>>
>>>>>>> 3) and then "Link ProperNouns only"
>>>>>>>
>>>>>>> The documentation in that configuration is quite useful I think:
>>>>>>>
>>>>>>> "If activated, only ProperNouns will be matched against the
>>>>>>> Vocabulary. If deactivated, any Noun will be matched. NOTE that this
>>>>>>> parameter requires a tag of the POS TagSet to be mapped against
>>>>>>> 'olia:ProperNoun'. Otherwise mapping will not work as expected.
>>>>>>> (enhancer.engines.linking.properNounsState)"
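
[Editor's sketch: the parameter quoted above would appear in the linking
engine's OSGi configuration roughly as below. This is illustrative only;
apart from enhancer.engines.linking.properNounsState, which is quoted above,
the property name and values are assumptions, not a verified configuration.]

```properties
# Illustrative OSGi configuration sketch for an entity linking engine
# with proper-noun-only linking enabled (values are examples only)
enhancer.engine.name=customLinkingEngine
enhancer.engines.linking.properNounsState=true
```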
>>>>>>>
>>>>>>>
>>>>>>> Hope this helps. You have to take into account that such issues are
>>>>>>> not easy to solve by email.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>>
>>>>>>> Thanks Sergio for your response
>>>>>>>
>>>>>>>> What I understand is to enable the option *"Link ProperNouns only"*
>>>>>>>> in entityhub linking and also to use the "opennlp-pos" engine in my
>>>>>>>> weighted chain.
>>>>>>>>
>>>>>>>> I made these changes but am unable to extract "University of Salzburg".
>>>>>>>>
>>>>>>>> Please find below the output RDF/XML from the enhancer:
>>>>>>>>
>>>>>>>> Please let me know if I did not understand your inputs correctly.
>>>>>>>>
>>>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>>>> entities other than people, places and organisations, for example
>>>>>>>> belts, bags, etc.
>>>>>>>>
>>>>>>>> best regards
>>>>>>>> tarandeep
>>>>>>>>
>>>>>>>> <rdf:RDF
>>>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>>>   <rdf:Description
>>>>>>>>       rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>>>     <j.1:extracted-from
>>>>>>>>         rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>>>     <j.1:confidence
>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>>>     <j.0:language>en</j.0:language>
>>>>>>>>     <j.0:created
>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>>>     <j.0:creator
>>>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>>>   </rdf:Description>
>>>>>>>> </rdf:RDF>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>
>>>>>>>>> As I said: have you checked the proper noun detection and POS
>>>>>>>>> tagging in your chain?
>>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>>> Salzburg,
>>>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>>>
>>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>> Here are the details:
>>>>>>>>>
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>>
>>>>>>>>> Just to add to my previous email
>>>>>>>>>
>>>>>>>>>> If I add another individual in my ontology, "MyUniversity", under
>>>>>>>>>> class University:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>>>>> <owl:NamedIndividual
>>>>>>>>>>     rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>>>   <rdf:type
>>>>>>>>>>       rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>>>   <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>>> </owl:NamedIndividual>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> So, with all the configurations I have mentioned in the Word
>>>>>>>>>> document (in the Google Drive folder), when I pass text with
>>>>>>>>>> "MyUniversity" in it, my enhancement chain is able to extract
>>>>>>>>>> "MyUniversity" and link it with the "University" type.
>>>>>>>>>>
>>>>>>>>>> But the same set of configurations doesn't work with the
>>>>>>>>>> individual "University of Salzburg".
>>>>>>>>>>
>>>>>>>>>> If any of you could help with what we are missing to be able to
>>>>>>>>>> extract custom entities which have a space in between, it would be
>>>>>>>>>> a great help in proceeding further on our journey with using and
>>>>>>>>>> contributing to Stanbol.
>>>>>>>>>>
>>>>>>>>>> with best regards,
>>>>>>>>>> tarandeep
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Thanks Sergio and Dileepa for your responses
>>>>>>>>>>
>>>>>>>>>>> We haven't been able to resolve the issue. We therefore decided
>>>>>>>>>>> to keep just one class and one instance value, "University of
>>>>>>>>>>> Salzburg", in our custom ontology, and to try to extract and link
>>>>>>>>>>> this entity, but we could not get it running. I am sure we are
>>>>>>>>>>> missing some configurations.
>>>>>>>>>>>
>>>>>>>>>>> I am sharing a Google Drive folder at the link below:
>>>>>>>>>>>
>>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>>
>>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>>
>>>>>>>>>>> 1) A Word document which shows Felix snapshots of all the
>>>>>>>>>>> configurations we did while configuring the Yard, Yard site,
>>>>>>>>>>> entity linking engine and weighted chain
>>>>>>>>>>> 2) Our custom ontology
>>>>>>>>>>> 3) The result of a SPARQL query against our graph URI using the
>>>>>>>>>>> SPARQL endpoint
>>>>>>>>>>>
>>>>>>>>>>> May I request you all to please look at these files and let us
>>>>>>>>>>> know if we are missing something in the configurations.
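
[Editor's sketch, related to the SPARQL result mentioned in item 3: whether a
multi-word label such as "University of Salzburg" actually reached the store
can be probed with a SPARQL query against Stanbol's endpoint. The query below
assumes rdfs:label is the indexed label property; the endpoint URL and graph
selection depend on the actual Yard configuration and are not shown.]

```python
# Sketch: compose a SPARQL query that looks up an entity by its exact
# multi-word rdfs:label. The label value is taken from this thread; the
# endpoint and graph handling are installation-specific assumptions.
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
LABEL = "University of Salzburg"

def label_lookup_query(label: str) -> str:
    """Return a SELECT query matching entities carrying the given label."""
    return (
        "SELECT ?entity WHERE {\n"
        f"  ?entity <{RDFS_LABEL}> \"{label}\" .\n"
        "}"
    )

query = label_lookup_query(LABEL)
print(query)
```

An empty result for such a query would point at an indexing problem in the
Yard rather than at the linking engine configuration.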
>>>>>>>>>>>
>>>>>>>>>>> We have referred to the web links below in order to configure
>>>>>>>>>>> Stanbol to use our custom ontology for entity extraction and
>>>>>>>>>>> linking:
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>>
>>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>>
>>>>>>>>>>> Best regards
>>>>>>>>>>> tarandeep
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>>> details at:
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>>> In addition, I'd like to suggest you take a look at the
>>>>>>>>>>>> netiquette in
>>>>>>>>>>>> mailing lists. This is an open source community; therefore
>>>>>>>>>>>> messages
>>>>>>>>>>>> starting with "URGENT" are not very polite. Especially sending it
>>>>>>>>>>>> on a
>>>>>>>>>>>> Friday
>>>>>>>>>>>> afternoon, when people could already be out for the weekend, or
>>>>>>>>>>>> even on
>>>>>>>>>>>> vacation.
>>>>>>>>>>>>
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Sergio
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>>>> vocabulary
>>>>>>>>>>>>> as
>>>>>>>>>>>>> per
>>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>>>>
>>>>>>>>>>>>> Configured Clerezza Yard.
>>>>>>>>>>>>> Configured Managed Yard site.
>>>>>>>>>>>>> Updated the site by plugging in the ontology (containing custom
>>>>>>>>>>>>> entities).
>>>>>>>>>>>>> Configured Entity hub linking
>>>>>>>>>>>>> Engine(*customLinkingEngine*)
>>>>>>>>>>>>> with
>>>>>>>>>>>>> managed
>>>>>>>>>>>>> site.
>>>>>>>>>>>>> Configured a customChain which uses the following engines:
>>>>>>>>>>>>>
>>>>>>>>>>>>> - *langdetect*
>>>>>>>>>>>>> - *opennlp-sentence*
>>>>>>>>>>>>> - *opennlp-token*
>>>>>>>>>>>>> - *opennlp-pos*
>>>>>>>>>>>>> - *opennlp-chunker*
>>>>>>>>>>>>> - *customLinkingEngine*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Now, I am able to extract entities like Adidas using
>>>>>>>>>>>>> *customChain*.
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, I am facing an issue extracting entities which have a
>>>>>>>>>>>>> space in between, for example "Tommy Hilfiger".
>>>>>>>>>>>>>
>>>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>>>> the Stanbol instance) correctly extracts entities like "Tommy
>>>>>>>>>>>>> Hilfiger".
>>>>>>>>>>>>>
>>>>>>>>>>>>> I tried configuring *customLinkingEngine* the same as
>>>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>>>> but it didn't work to extract the above entity.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have invested more than a week and am running out of options;
>>>>>>>>>>>>> I request your help in resolving this issue.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sergio Fernández
>>>>>>>>>>>>>
>>>>>>>>>>>> Salzburg Research
>>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>> Sergio Fernández
>>>>>>>>> Salzburg Research
>>>>>>>>> +43 662 2288 318
>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>> Sergio Fernández
>>>>>> Salzburg Research
>>>>>> +43 662 2288 318
>>>>>> Jakob-Haringer Strasse 5/II
>>>>>> A-5020 Salzburg (Austria)
>>>>>> http://www.salzburgresearch.at
>>>>>>
>>>>>>
>>>>>> --
>>>>
>>>> ------------------------------
>>>> This message should be regarded as confidential. If you have received
>>>> this email in error please notify the sender and destroy it immediately.
>>>> Statements of intent shall only become binding when confirmed in hard
>>>> copy
>>>> by an authorised signatory.
>>>>
>>>> Zaizi Ltd is registered in England and Wales with the registration
>>>> number
>>>> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
>>>> London W6 7AN.
>>>>
>>>
>>>
>>>
>
Re: Working with custom vocabulary
Posted by Rafa Haro <rh...@zaizi.com>.
Hi Tarandeep,
Happy to hear you finally solved your problem. Could you please file a new
issue in the Stanbol JIRA describing the error with your ClerezzaYard site?
Thanks
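For the JIRA report, it may help to show the lookup failing against the site directly, bypassing the enhancement chain. A minimal sketch that only builds the request, without sending it (the base URL and the site id `customSite` are assumptions; the `/entityhub/site/{id}/find` endpoint and its `name`/`field`/`limit` form parameters are described in the Entityhub docs):

```python
# Build the form request one would POST to the Entityhub '/find' endpoint
# of a managed site, to check whether a multi-word label such as
# "Tommy Hilfiger" resolves at all, independent of the enhancement chain.
from urllib.parse import urlencode

def build_find_request(base_url: str, site_id: str, label: str, limit: int = 5):
    """Return (url, form-encoded body) for an Entityhub find query."""
    url = f"{base_url.rstrip('/')}/entityhub/site/{site_id}/find"
    body = urlencode({
        "name": label,  # the full label, spaces included
        "field": "http://www.w3.org/2000/01/rdf-schema#label",
        "limit": str(limit),
    })
    return url, body

url, body = build_find_request("http://localhost:8080", "customSite",
                               "Tommy Hilfiger")
print(url)
print(body)
```

If the find request returns the entity but the chain does not annotate it, the problem lies in the engine or its NLP preprocessing rather than in the yard.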
On 16/07/13 13:38, Sawhney, Tarandeep Singh wrote:
> Hi Rafa
>
> I tried using SolrYard and it worked :-) So there seems to be a defect in
> ClerezzaYard.
>
> Thanks so much for pointing that out.
>
> Do you have any information on when a new version of Stanbol is planned to
> be released and what will be covered in that release (feature/bug list etc.)?
>
> Also, can I get some information on the Stanbol roadmap ahead?
>
> thanks again for your help
>
> best regards
> tarandeep
>
>
> On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
> tsawhney@innodata.com> wrote:
>
>> Thanks Rafa for your response
>>
>> I will try resolving this issue based on pointers you have provided and
>> will post the update accordingly.
>>
>> Best regards
>> tarandeep
>>
>>
>> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>>
>>> Hi Tarandeep,
>>>
>>> As Sergio already pointed out, you can check some different Entity Linking
>>> engine configurations at the IKS development server:
>>> http://dev.iks-project.eu:8081/enhancer/chain.
>>> You can try to use the same configuration as some of the chains registered
>>> in this Stanbol instance. For that, just go through the Felix Console (
>>> http://dev.iks-project.eu:8081/system/console/configMgr/
>>> ) and take a look at the different EntityHubLinkingEngine
>>> configurations. You can also try to use a Keyword Linking engine instead of
>>> an EntityHub Linking engine.
>>>
>>> Anyway, all the sites configured in this server are SolrYard based, so
>>> perhaps there is a bug in the ClerezzaYard entity search process for
>>> multi-word entities' labels. We might need debug log messages in
>>> order to find out the problem.
>>>
>>> Regards
>>>
>>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Hi Sergio
>>>>
>>>> This is exactly what I did, as I mentioned in my last email:
>>>>
>>>> *"What i understand is to enable option "Link ProperNouns only" in
>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>> chain"*
>>>>
>>>> I have already checked this option in my own entityhub linking engine.
>>>>
>>>> By the way, did you get a chance to look at the files I have shared in
>>>> the Google Drive folder? Did you notice any problems there?
>>>>
>>>> I think using a custom ontology with Stanbol should be a very common use
>>>> case, and if there are issues getting it working, either I am doing
>>>> something terribly wrong or there are other reasons which I don't know.
>>>>
>>>> But anyway, I am persisting in solving this issue, and any help on this
>>>> from the dev community will be much appreciated.
>>>>
>>>> best regards
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>>> sergio.fernandez@salzburgresearch.at>
>>>> wrote:
>>>>
>>>> http://{stanbol}/system/console/configMgr sorry
>>>>>
>>>>>
>>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>>
>>>>>> Have you checked the following?
>>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>>
>>>>>> 2) find your EntityHub Linking engine
>>>>>>
>>>>>> 3) and then "Link ProperNouns only"
>>>>>>
>>>>>> The documentation in that configuration is quite useful I think:
>>>>>>
>>>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>>>>> Otherwise mapping will not work as expected.
>>>>>> (enhancer.engines.linking.properNounsState)"
>>>>>>
>>>>>>
>>>>>> Hope this help. You have to take into account such kind of issues are
>>>>>> not easy to solve by email.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>>
>>>>>> Thanks Sergio for your response.
>>>>>>> What I understand is to enable the option *"Link ProperNouns only"* in
>>>>>>> entityhub linking and also to use the "opennlp-pos" engine in my
>>>>>>> weighted chain.
>>>>>>>
>>>>>>> I did these changes but am unable to extract "University of Salzburg".
>>>>>>>
>>>>>>> Please find below the output RDF/XML from the enhancer.
>>>>>>>
>>>>>>> Request you to please let me know if I did not understand your inputs
>>>>>>> correctly.
>>>>>>>
>>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>>> entities other than people, places and organisations; for example,
>>>>>>> belts, bags etc.
>>>>>>>
>>>>>>> best regards
>>>>>>> tarandeep
>>>>>>>
>>>>>>> <rdf:RDF
>>>>>>>   xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>>   xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>>   xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>> <rdf:Description
>>>>>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>> <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>> <j.1:extracted-from
>>>>>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>> <j.1:confidence
>>>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>> <rdf:type
>>>>>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>> <rdf:type
>>>>>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>> <j.0:language>en</j.0:language>
>>>>>>> <j.0:created
>>>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>> <j.0:creator
>>>>>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>> </rdf:Description>
>>>>>>> </rdf:RDF>
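A quick way to read a response like the one above is to tally the fise enhancement types it contains: if nothing beyond the langdetect TextAnnotation shows up, the linking engine never produced an EntityAnnotation. A rough standard-library sketch (the embedded sample is a trimmed, illustrative version of the response):

```python
# Summarize an enhancer RDF/XML response: map each enhancement URI to its
# fise types and confidence. Namespaces are the standard ones used by the
# Stanbol enhancer output.
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FISE = "http://fise.iks-project.eu/ontology/"

sample = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:j.1="http://fise.iks-project.eu/ontology/">
 <rdf:Description rdf:about="urn:enhancement-197792bf">
  <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
  <j.1:confidence
   rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
 </rdf:Description>
</rdf:RDF>"""

def summarize(rdf_xml: str):
    """Map enhancement URI -> (list of fise types, confidence or None)."""
    root = ET.fromstring(rdf_xml)
    out = {}
    for desc in root.findall(f"{{{RDF}}}Description"):
        uri = desc.get(f"{{{RDF}}}about")
        types = [t.get(f"{{{RDF}}}resource").rsplit("/", 1)[-1]
                 for t in desc.findall(f"{{{RDF}}}type")
                 if t.get(f"{{{RDF}}}resource", "").startswith(FISE)]
        conf = desc.findtext(f"{{{FISE}}}confidence")
        out[uri] = (types, float(conf) if conf else None)
    return out

print(summarize(sample))
```

Running this over the full response above would show only the language annotation, confirming that the entity linking step produced no output at all.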
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>>> sergio.fernandez@salzburgresearch.at>
>>>>>>> wrote:
>>>>>>>
>>>>>>> As I said: have you checked the proper noun detection and POS tagging
>>>>>>> in
>>>>>>>
>>>>>>>> your chain?
>>>>>>>>
>>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>>> Salzburg,
>>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>>
>>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>>
>>>>>>>> Here the details:
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>>
>>>>>>>> Just to add to my previous email
>>>>>>>>
>>>>>>>>> If I add another individual in my ontology, "MyUniversity", under
>>>>>>>>> class
>>>>>>>>> University:
>>>>>>>>>
>>>>>>>>> <!--
>>>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>>>>>>>> -->
>>>>>>>>> <owl:NamedIndividual rdf:about="
>>>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>>>> <rdf:type rdf:resource="
>>>>>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>>>> <rdfs:label>MyUniversity</rdfs:label>
>>>>>>>>> </owl:NamedIndividual>
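Before blaming the engine, it can also be worth double-checking that the multi-word label really survives as a single rdfs:label literal in the ontology file. A small standard-library sketch (the sample snippet and URIs are illustrative, not taken from the shared ontology):

```python
# List every rdfs:label in an RDF/XML ontology, to verify that a multi-word
# label such as "University of Salzburg" reaches the yard as one literal.
import xml.etree.ElementTree as ET

RDFS = "http://www.w3.org/2000/01/rdf-schema#"

sample = """<rdf:RDF
  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
  xmlns:owl="http://www.w3.org/2002/07/owl#">
 <owl:NamedIndividual rdf:about="urn:ex#UniversityOfSalzburg">
  <rdfs:label>University of Salzburg</rdfs:label>
 </owl:NamedIndividual>
</rdf:RDF>"""

def all_labels(rdf_xml: str):
    """Return the text of every rdfs:label element, in document order."""
    root = ET.fromstring(rdf_xml)
    return [e.text for e in root.iter(f"{{{RDFS}}}label")]

print(all_labels(sample))  # ['University of Salzburg']
```

If the label prints as one string with the space intact, the data is fine and the fault lies in the lookup, which matches the ClerezzaYard suspicion raised later in the thread.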
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> So with all the configurations I have mentioned in the Word document
>>>>>>>>> (in the
>>>>>>>>> Google Drive folder), when I pass text with "MyUniversity" in it, my
>>>>>>>>> enhancement
>>>>>>>>> chain is able to extract "MyUniversity" and link it with the
>>>>>>>>> "University" type.
>>>>>>>>>
>>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>>> "University of
>>>>>>>>> Salzburg".
>>>>>>>>>
>>>>>>>>> If any of you could provide help on what we are missing to be able
>>>>>>>>> to extract custom entities which have a space in between, it would
>>>>>>>>> be a great help
>>>>>>>>> to
>>>>>>>>> proceed further on our journey with using and contributing to
>>>>>>>>> Stanbol.
>>>>>>>>>
>>>>>>>>> with best regards,
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>>
>>>>>>>>> Thanks Sergio and Dileepa for your responses
>>>>>>>>>
>>>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>>> keep
>>>>>>>>>> just one class and one instance value "University of Salzburg" in
>>>>>>>>>> our
>>>>>>>>>> custom ontology and try to extract this entity and also link it
>>>>>>>>>> but we
>>>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>>>> configurations.
>>>>>>>>>>
>>>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>>>
>>>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>>
>>>>>>>>>> This folder has 3 files:
>>>>>>>>>>
>>>>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>>>>> configurations
>>>>>>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>>>>>>> weighted
>>>>>>>>>> chain
>>>>>>>>>> 2) our custom ontology
>>>>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>>>>
>>>>>>>>>> May i request you all to please look at these files and let us know
>>>>>>>>>> if we
>>>>>>>>>> are missing something in configurations.
>>>>>>>>>>
>>>>>>>>>> We have referred to below web links in order to configure stanbol
>>>>>>>>>> for
>>>>>>>>>> using our custom ontology for entity extraction and linking
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>>
>>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>>
>>>>>>>>>> Best regards
>>>>>>>>>> tarandeep
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>>> details
>>>>>>>>>>> at:
>>>>>>>>>>>
>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>> In addition, I'd like to suggest you take a look at the
>>>>>>>>>>> netiquette in
>>>>>>>>>>> mailing lists. This is an open source community; therefore
>>>>>>>>>>> messages
>>>>>>>>>>> starting with "URGENT" are not very polite. Especially sending it
>>>>>>>>>>> on a
>>>>>>>>>>> Friday
>>>>>>>>>>> afternoon, when people could already be out for the weekend, or
>>>>>>>>>>> even on
>>>>>>>>>>> vacation.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Sergio
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> I am using Stanbol to extract entities by plugging in a custom
>>>>>>>>>>>> vocabulary
>>>>>>>>>>>> as
>>>>>>>>>>>> per
>>>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>>>
>>>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>>>
>>>>>>>>>>>> Configured Clerezza Yard.
>>>>>>>>>>>> Configured Managed Yard site.
>>>>>>>>>>>> Updated the site by plugging in the ontology (containing custom
>>>>>>>>>>>> entities).
>>>>>>>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*)
>>>>>>>>>>>> with
>>>>>>>>>>>> managed
>>>>>>>>>>>> site.
>>>>>>>>>>>> Configured a customChain which uses the following engines:
>>>>>>>>>>>>
>>>>>>>>>>>> - *langdetect*
>>>>>>>>>>>> - *opennlp-sentence*
>>>>>>>>>>>> - *opennlp-token*
>>>>>>>>>>>> - *opennlp-pos*
>>>>>>>>>>>> - *opennlp-chunker*
>>>>>>>>>>>> - *customLinkingEngine*
>>>>>>>>>>>>
>>>>>>>>>>>> Now, I am able to extract entities like Adidas using
>>>>>>>>>>>> *customChain*.
>>>>>>>>>>>>
>>>>>>>>>>>> However, I am facing an issue extracting entities which have a
>>>>>>>>>>>> space in between, for example "Tommy Hilfiger".
>>>>>>>>>>>>
>>>>>>>>>>>> A chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>>>>>> the Stanbol instance) correctly extracts entities like "Tommy
>>>>>>>>>>>> Hilfiger".
>>>>>>>>>>>>
>>>>>>>>>>>> I tried configuring *customLinkingEngine* the same as
>>>>>>>>>>>> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
>>>>>>>>>>>> but it didn't work to extract the above entity.
>>>>>>>>>>>>
>>>>>>>>>>>> I have invested more than a week and am running out of options;
>>>>>>>>>>>> I request your help in resolving this issue.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> Sergio Fernández
>>>>>>>>>>> Salzburg Research
>>>>>>>>>>> +43 662 2288 318
>>>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>
Re: Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Hi Rafa
I tried using the SolrYard and it worked :-) So there seems to be a defect in
the ClerezzaYard.
Thanks so much for pointing that out.
Do you have any information on when the next version of Stanbol is planned to
be released and what will be covered in that release (feature/bug list etc.)?
Also, can I get some information on the Stanbol roadmap ahead?
Thanks again for your help.
best regards
tarandeep
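For readers hitting the same issue, the behavioural difference between the two Yards can be sketched with a toy Python example (all URIs and helper names here are invented for illustration; this is not actual Stanbol Yard code): a Solr-style index tokenizes and lower-cases labels, so a multi-word query still matches, while a naive exact-literal comparison does not.

```python
# Toy contrast, not actual Stanbol Yard code: URIs and helper names are
# invented for illustration only.
labels = {"http://example.org/TommyHilfiger": "Tommy Hilfiger"}

def exact_lookup(query, labels=labels):
    # Naive exact string comparison against the stored rdfs:label.
    return [uri for uri, label in labels.items() if label == query]

def tokenized_lookup(query, labels=labels):
    # Solr-style matching: lower-cased, token-based containment.
    wanted = set(query.lower().split())
    return [uri for uri, label in labels.items()
            if wanted <= set(label.lower().split())]

print(exact_lookup("tommy hilfiger"))      # []
print(tokenized_lookup("tommy hilfiger"))  # ['http://example.org/TommyHilfiger']
```

Any real fix of course lives in the ClerezzaYard query implementation; the sketch only shows why multi-word labels are the case that exposes it.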
On Mon, Jul 15, 2013 at 10:47 PM, Sawhney, Tarandeep Singh <
tsawhney@innodata.com> wrote:
> Thanks Rafa for your response
>
> I will try resolving this issue based on pointers you have provided and
> will post the update accordingly.
>
> Best regards
> tarandeep
>
>
> On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
>
>> Hi Tarandeep,
>>
>> As Sergio already pointed out, you can check several different Entity
>> Linking engine configurations at the IKS development server:
>> http://dev.iks-project.eu:8081/enhancer/chain.
>> You can try to use the same configuration as some of the chains registered
>> in that Stanbol instance. For that, just go through the Felix Console (
>> http://dev.iks-project.eu:8081/system/console/configMgr/
>> ) and take a look at the different EntityhubLinkingEngine
>> configurations. You can also try to use a Keyword Linking engine instead of
>> an EntityHub Linking engine.
>>
>> Anyway, all the sites configured on this server are SolrYard based, so
>> perhaps there is a bug in the ClerezzaYard entity search process for
>> multi-word entity labels. We might need debug log messages in
>> order to find out the problem.
>>
>> Regards
>>
>> On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
>>
>>> Hi Sergio
>>>
>>> This is exactly what I did, and I mentioned it in my last email:
>>>
>>> *"What I understand is to enable the option "Link ProperNouns only" in
>>> entityhub linking and also to use the "opennlp-pos" engine in my weighted
>>> chain"*
>>>
>>> I have already checked this option in my own EntityHub Linking engine.
>>>
>>> By the way, did you get a chance to look at the files I have shared in the
>>> Google Drive folder? Did you notice any problems there?
>>>
>>> I think using a custom ontology with Stanbol should be a very common use
>>> case, and if there are issues getting it working, either I am doing
>>> something terribly wrong or there are some other reasons which I don't
>>> know.
>>>
>>> But anyway, I am persisting in solving this issue, and any help on this
>>> from this dev community will be much appreciated.
>>>
>>> best regards
>>> tarandeep
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>
>>> http://{stanbol}/system/console/configMgr sorry
>>>>
>>>>
>>>>
>>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>>
Have you checked the following?
>>>>>
>>>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>>>
>>>>> 2) find your EntityHub Linking engine
>>>>>
>>>>> 3) and then "Link ProperNouns only"
>>>>>
>>>>> The documentation for that configuration is quite useful, I think:
>>>>>
>>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>>>> Otherwise mapping will not work as expected.
>>>>> (enhancer.engines.linking.properNounsState)"
>>>>>
>>>>>
>>>>> Hope this helps. You have to take into account that such kinds of
>>>>> issues are not easy to solve by email.
>>>>>
>>>>> Cheers,
>>>>>
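To make the effect of that setting concrete, here is a toy sketch (the function and Penn-Treebank-style tags are invented for illustration; this is not Stanbol code) of how consecutive proper-noun tokens are grouped into one multi-word mention, which is what allows "Tommy Hilfiger" to become a single lookup candidate:

```python
# Toy illustration, not Stanbol code: when "Link ProperNouns only" is
# active, only tokens whose POS tag maps to olia:ProperNoun are considered,
# and consecutive proper nouns form one multi-word candidate mention.
def proper_noun_mentions(tagged_tokens, proper_tags=("NNP", "NNPS")):
    """Group consecutive proper-noun tokens into candidate mentions."""
    mentions, current = [], []
    for token, tag in tagged_tokens:
        if tag in proper_tags:
            current.append(token)
        elif current:
            mentions.append(" ".join(current))
            current = []
    if current:
        mentions.append(" ".join(current))
    return mentions

tagged = [("I", "PRP"), ("like", "VBP"), ("Tommy", "NNP"),
          ("Hilfiger", "NNP"), ("and", "CC"), ("Adidas", "NNP")]
print(proper_noun_mentions(tagged))  # ['Tommy Hilfiger', 'Adidas']
```

If the POS tag set in use is not mapped to olia:ProperNoun, the grouping step above never sees the tokens, which matches the behaviour described in the quoted documentation.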
>>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>>> Thanks Sergio for your response.
>>>>>>
>>>>>> What I understand is to enable the option *"Link ProperNouns only"* in
>>>>>> EntityHub linking and also to use the "opennlp-pos" engine in my
>>>>>> weighted chain.
>>>>>>
>>>>>> I did these changes but am still unable to extract "University of
>>>>>> Salzburg".
>>>>>>
>>>>>> Please find below the output RDF/XML from the enhancer.
>>>>>>
>>>>>> Request you to please let me know if I did not understand your inputs
>>>>>> correctly.
>>>>>>
>>>>>> One more thing: in our ontology (yet to be built) we will have
>>>>>> entities which are other than people, places and organisations. For
>>>>>> example, belts, bags etc.
>>>>>>
>>>>>> best regards
>>>>>> tarandeep
>>>>>>
>>>>>> <rdf:RDF
>>>>>>     xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>>>>>     xmlns:j.0="http://purl.org/dc/terms/"
>>>>>>     xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>>>>>   <rdf:Description
>>>>>>       rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>>>>>     <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>>>>>     <j.1:extracted-from
>>>>>>         rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>>>>     <j.1:confidence
>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>>>>>     <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>>>>>     <j.0:language>en</j.0:language>
>>>>>>     <j.0:created
>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>>>>     <j.0:creator
>>>>>>         rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>>>>   </rdf:Description>
>>>>>> </rdf:RDF>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>
>>>>>>> As I said: have you checked the proper noun detection and POS tagging
>>>>>>> in your chain?
>>>>>>>
>>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>>> Salzburg,
>>>>>>> which is based in Austria" works at the demo server:
>>>>>>>
>>>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>>>
>>>>>>> Here the details:
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>>
Just to add to my previous email:
>>>>>>>>
>>>>>>>> If I add another individual, "MyUniversity", in my ontology under the
>>>>>>>> class University:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> <!--
>>>>>>>> http://www.semanticweb.org/******vi5/ontologies/2013/6/**<http://www.semanticweb.org/****vi5/ontologies/2013/6/**>
>>>>>>>> <http**://www.semanticweb.org/**vi5/**ontologies/2013/6/**<http://www.semanticweb.org/**vi5/ontologies/2013/6/**>
>>>>>>>> >
>>>>>>>> untitled-ontology-13#******MyUniversity--<http://www.**
>>>>>>>> semanticweb.org/vi5/****ontologies/2013/6/untitled-**<http://semanticweb.org/vi5/**ontologies/2013/6/untitled-**>
>>>>>>>> ontology-13#MyUniversity--<htt**p://www.semanticweb.org/vi5/**
>>>>>>>> ontologies/2013/6/untitled-**ontology-13#MyUniversity--<http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>>>>> >
>>>>>>>>
>>>>>>>> <owl:NamedIndividual rdf:about="
>>>>>>>> http://www.semanticweb.org/******vi5/ontologies/2013/6/**<http://www.semanticweb.org/****vi5/ontologies/2013/6/**>
>>>>>>>> <http**://www.semanticweb.org/**vi5/**ontologies/2013/6/**<http://www.semanticweb.org/**vi5/ontologies/2013/6/**>
>>>>>>>> >
>>>>>>>> untitled-ontology-13#******MyUniversity<http://www.**
>>>>>>>> semanticweb.org/vi5/****ontologies/2013/6/untitled-**<http://semanticweb.org/vi5/**ontologies/2013/6/untitled-**>
>>>>>>>> ontology-13#MyUniversity<http:**//www.semanticweb.org/vi5/**
>>>>>>>> ontologies/2013/6/untitled-**ontology-13#MyUniversity<http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity>
>>>>>>>> >
>>>>>>>> ">
>>>>>>>> <rdf:type rdf:resource="
>>>>>>>> http://www.semanticweb.org/******vi5/ontologies/2013/6/**<http://www.semanticweb.org/****vi5/ontologies/2013/6/**>
>>>>>>>> <http**://www.semanticweb.org/**vi5/**ontologies/2013/6/**<http://www.semanticweb.org/**vi5/ontologies/2013/6/**>
>>>>>>>> >
>>>>>>>> untitled-ontology-13#******University<http://www.**semant**
>>>>>>>> icweb.org/vi5/* <http://semanticweb.org/vi5/*>
>>>>>>>> *ontologies/2013/6/untitled-****ontology-13#University<http://**
>>>>>>>> www.semanticweb.org/vi5/**ontologies/2013/6/untitled-**
>>>>>>>> ontology-13#University<http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University>
>>>>>>>> >
>>>>>>>> "/>
>>>>>>>> <rdfs:label>MyUniversity</******rdfs:label>
>>>>>>>>
>>>>>>>> </owl:NamedIndividual>
>>>>>>>>
>>>>>>>>
So with all the configurations I have mentioned in the Word document (in
>>>>>>>> the Google Drive folder), when I pass text with "MyUniversity" in it,
>>>>>>>> my enhancement chain is able to extract "MyUniversity" and link it
>>>>>>>> with the "University" type.
>>>>>>>>
>>>>>>>> But the same set of configurations doesn't work with the individual
>>>>>>>> "University of Salzburg".
>>>>>>>>
>>>>>>>> If any one of you could please provide help on what we are missing to
>>>>>>>> be able to extract custom entities which have spaces in between, it
>>>>>>>> will be a great help to proceed further on our journey of using and
>>>>>>>> contributing to Stanbol.
>>>>>>>>
>>>>>>>> with best regards,
>>>>>>>> tarandeep
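To illustrate what the linking engine has to do for such labels, a toy greedy longest-match lookup over multi-word labels might look like this (the vocabulary, URIs and function are invented; this is not the EntityhubLinkingEngine implementation): try the longest token window first, then shrink.

```python
# Toy greedy longest-match linking sketch; the vocabulary and URIs are
# invented placeholders, not real Stanbol identifiers.
VOCAB = {
    "university of salzburg": "ex:UniversityOfSalzburg",
    "myuniversity": "ex:MyUniversity",
}

def link_entities(tokens, vocab=VOCAB, max_len=4):
    """Match token windows against vocabulary labels, longest first."""
    i, matches = 0, []
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            label = " ".join(tokens[i:i + n]).lower()
            if label in vocab:
                matches.append((label, vocab[label]))
                i += n
                break
        else:
            i += 1  # no label starts at this token
    return matches

tokens = "I studied at the University of Salzburg".split()
print(link_entities(tokens))  # [('university of salzburg', 'ex:UniversityOfSalzburg')]
```

A single-token label like "MyUniversity" works even with a broken multi-token path, which is consistent with the symptom described above: one-word individuals link fine while "University of Salzburg" is missed.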
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>>
Thanks Sergio and Dileepa for your responses.
>>>>>>>>
>>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>>> keep just one class and one instance value, "University of Salzburg",
>>>>>>>>> in our custom ontology and tried to extract this entity and also
>>>>>>>>> link it, but we could not get this running. I am sure we are missing
>>>>>>>>> some configurations.
>>>>>>>>>
>>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>>
https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> This folder has 3 files:
>>>>>>>>>
1) A Word document which shows Felix snapshots of all the
>>>>>>>>> configurations we did while configuring the Yard, Yard site, entity
>>>>>>>>> linking engine and weighted chain
>>>>>>>>> 2) our custom ontology
>>>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>>>
May I request you all to please look at these files and let us know
>>>>>>>>> if we are missing something in the configurations.
>>>>>>>>>
We have referred to the web links below in order to configure
>>>>>>>>> Stanbol to use our custom ontology for entity extraction and
>>>>>>>>> linking:
>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>>
>>>>>>>>> Best regards
>>>>>>>>> tarandeep
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>>> details
>>>>>>>>>> at:
>>>>>>>>>>
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>>
>>>>>>>>>> In addition, I'd like to suggest that you take a look at the
>>>>>>>>>> netiquette of mailing lists. This is an open source community;
>>>>>>>>>> therefore messages starting with "URGENT" are not very polite,
>>>>>>>>>> especially when sent on a Friday afternoon, when people could
>>>>>>>>>> already be out for the weekend, or even on vacation.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Sergio
>>>>>>>>>>
Re: Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Rafa for your response.
I will try resolving this issue based on the pointers you have provided and
will post the update accordingly.
Best regards
tarandeep
On Mon, Jul 15, 2013 at 10:37 PM, Rafa Haro <rh...@zaizi.com> wrote:
> Hi Tarandeep,
>
> As Sergio already pointed, you can check some different Entity Linking
> engines configurations at IKS development server:
> http://dev.iks-project.eu:**8081/enhancer/chain<http://dev.iks-project.eu:8081/enhancer/chain>.
> You can try to use the same configuration of some of the chains registered
> in this Stanbol instance. For that, just go through the Felix Console (
> http://dev.iks-project.eu:**8081/system/console/configMgr/<http://dev.iks-project.eu:8081/system/console/configMgr/>
> **) and take a look to the different EntityHubLinkingEngine
> configurations. You can also try to use a Keyword Linking engine instead of
> an EntityHub Linking engine.
>
> Anyway, all the sites configured in this server are SolrYard based, so
> perhaps there is a bug in the ClerezzaYard entity search process for
> multi-words entities' labels. We might would need debug logs messages in
> order to find out the problem.
>
> Regards
>
> El 15/07/13 18:28, Sawhney, Tarandeep Singh escribió:
>
>> Hi Sergio
>>
>> This is exactly i did and i mentioned in my last email
>>
>> *"What i understand is to enable option "Link ProperNouns only" in
>>
>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>> chain"
>> *
>>
>>
>> I have already checked this option in my own entity hub linking engine
>>
>> By the way, did you get a chance to look at files i have shared in google
>> drive folder. Did you notice any problems there ?
>>
>> I think using custom ontology with stanbol should be a very common use
>> case
>> and if there are issues getting it working, either i am doing something
>> terribly wrong or there are some other reasons which i dont know.
>>
>> But anyways, i am persisting to solve this issue and any help on this from
>> this dev community will be much appreciated
>>
>> best regards
>> tarandeep
>>
>>
>>
>> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>> wrote:
>>
>> http://{stanbol}/system/****console/configMgr sorry
>>>
>>>
>>>
>>> On 15/07/13 18:15, Sergio Fernández wrote:
>>>
>>> Have you check the
>>>>
>>>> 1) go to http://{stanbol}/config/****system/console/configMgr
>>>>
>>>>
>>>> 2) find your EntityHub Linking engine
>>>>
>>>> 3) and then "Link ProperNouns only"
>>>>
>>>> The documentation in that configuration is quite useful I think:
>>>>
>>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>>> If deactivated any Noun will be matched. NOTE that this parameter
>>>> requires a tag of the POS TagSet to be mapped against 'olia:PorperNoun'.
>>>> Otherwise mapping will not work as expected.
>>>> (enhancer.engines.linking.****properNounsState)"
>>>>
>>>>
>>>> Hope this help. You have to take into account such kind of issues are
>>>> not easy to solve by email.
>>>>
>>>> Cheers,
>>>>
>>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>>
>>>> Thanks Sergio for your response
>>>>>
>>>>> What i understand is to enable option *"Link ProperNouns only"* in
>>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>>> chain
>>>>>
>>>>> I did these changes but unable to extract "University of Salzberg"
>>>>>
>>>>> Please find below the output RDF/XML from enhancer
>>>>>
>>>>> Request you to please let me know if i did not understand your inputs
>>>>> correctly
>>>>>
>>>>> One more thing, in our ontology (yet to be built) we will have entities
>>>>> which are other than people, places and organisations. For example,
>>>>> belts,
>>>>> bags etc
>>>>>
>>>>> best regards
>>>>> tarandeep
>>>>>
>>>>> <rdf:RDF
>>>>> xmlns:rdf="http://www.w3.org/****1999/02/22-rdf-syntax-ns#<http://www.w3.org/**1999/02/22-rdf-syntax-ns#>
>>>>> <htt**p://www.w3.org/1999/02/22-rdf-**syntax-ns#<http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>>>> >
>>>>> "
>>>>> xmlns:j.0="http://purl.org/dc/****terms/<http://purl.org/dc/**terms/><
>>>>> http://purl.org/dc/terms/>"
>>>>> xmlns:j.1="http://fise.iks-**p**roject.eu/ontology/<http://project.eu/ontology/>
>>>>> <http://**fise.iks-project.eu/ontology/<http://fise.iks-project.eu/ontology/>
>>>>> >**"
>>>>> <rdf:Description
>>>>> rdf:about="urn:enhancement-****197792bf-f1e8-47bf-626a-****
>>>>> 3cdfbdb863b3">
>>>>> <j.0:type rdf:resource="http://purl.org/**
>>>>> **dc/terms/LinguisticSystem<http://purl.org/**dc/terms/LinguisticSystem>
>>>>> <ht**tp://purl.org/dc/terms/**LinguisticSystem<http://purl.org/dc/terms/LinguisticSystem>
>>>>> >
>>>>> "/>
>>>>> <j.1:extracted-from
>>>>> rdf:resource="urn:content-****item-sha1-****
>>>>> 3b2998e66582544035454850d2dd81****
>>>>> 755b747849"/>
>>>>>
>>>>> <j.1:confidence
>>>>> rdf:datatype="http://www.w3.****org/2001/XMLSchema#double<http**
>>>>> ://www.w3.org/2001/XMLSchema#**double<http://www.w3.org/2001/XMLSchema#double>
>>>>> >
>>>>> ">0.**9999964817340454</j.1:****confidence>
>>>>>
>>>>> <rdf:type
>>>>> rdf:resource="http://fise.iks-****project.eu/ontology/****Enhancement<http://project.eu/ontology/**Enhancement>
>>>>> <http://fise.iks-**project.eu/ontology/**Enhancement<http://fise.iks-project.eu/ontology/Enhancement>
>>>>> >
>>>>> "/>
>>>>> <rdf:type
>>>>> rdf:resource="http://fise.iks-****project.eu/ontology/****
>>>>> TextAnnotation <http://project.eu/ontology/**TextAnnotation><
>>>>> http://fise.**iks-project.eu/ontology/**TextAnnotation<http://fise.iks-project.eu/ontology/TextAnnotation>
>>>>> >
>>>>> "/>
>>>>> <j.0:language>en</j.0:****language>
>>>>> <j.0:created
>>>>> rdf:datatype="http://www.w3.****org/2001/XMLSchema#dateTime<ht**
>>>>> tp://www.w3.org/2001/**XMLSchema#dateTime<http://www.w3.org/2001/XMLSchema#dateTime>
>>>>> >
>>>>> ">**2013-07-15T14:25:43.829Z</**j.0:**created>
>>>>>
>>>>> <j.0:creator
>>>>> rdf:datatype="http://www.w3.****org/2001/XMLSchema#string<http**
>>>>> ://www.w3.org/2001/XMLSchema#**string<http://www.w3.org/2001/XMLSchema#string>
>>>>> >
>>>>> ">**org.apache.stanbol.**enhancer.**engines.langdetect.****
>>>>> LanguageDetectionEnhancementEn****gine</j.0:creator>
>>>>>
>>>>>
>>>>> </rdf:Description>
>>>>> </rdf:RDF>
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>>> sergio.fernandez@**salzburgres**earch.at <http://salzburgresearch.at><
>>>>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>
>>>>> >>
>>>>>
>>>>> wrote:
>>>>>
>>>>> As I said: have you check the proper noun detection and POS tagging
>>>>> in
>>>>>
>>>>>> your chain?
>>>>>>
>>>>>> For instance, enhancing the text "I studied at the University of
>>>>>> Salzburg,
>>>>>> which is based in Austria" works at the demo server:
>>>>>>
>>>>>> http://dev.iks-project.eu:******8081/enhancer/chain/dbpedia-******
>>>>>> proper-noun<http://dev.iks-**p**roject.eu:8081/enhancer/**<http://project.eu:8081/enhancer/**>
>>>>>> chain/dbpedia-proper-noun<http**://dev.iks-project.eu:8081/**
>>>>>> enhancer/chain/dbpedia-proper-**noun<http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun>
>>>>>> >
>>>>>>
>>>>>> Here the details:
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>>
>>>>>> Just to add to my previous email
>>>>>>
>>>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>>>> University
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>>> <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>>> <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>>> <rdfs:label>MyUniversity</rdfs:label>
>>>>>>> </owl:NamedIndividual>
>>>>>>>
>>>>>>>
>>>>>>> So with all configurations i have mentioned in the word document (in
>>>>>>> google
>>>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>>>> enhancement
>>>>>>> chain is able to extract "MyUniversity" and link it with
>>>>>>> "University" type
>>>>>>>
>>>>>>> But same set of configurations doesn't work with individual
>>>>>>> "University of
>>>>>>> Salzburg"
>>>>>>>
>>>>>>> If anyone of you please provide help on what are we missing to be
>>>>>>> able to
>>>>>>> extract custom entities which has space in between, will be a great
>>>>>>> help
>>>>>>> to
>>>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>>>
>>>>>>> with best regards,
>>>>>>> tarandeep
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>>> tsawhney@innodata.com> wrote:
>>>>>>>
>>>>>>> Thanks Sergio and Dileepa for your responses
>>>>>>>
>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>>> keep
>>>>>>>> just one class and one instance value "University of Salzburg" in
>>>>>>>> our
>>>>>>>> custom ontology and try to extract this entity and also link it but
>>>>>>>> we
>>>>>>>> could not get this running. I am sure we are missing some
>>>>>>>> configurations.
>>>>>>>>
>>>>>>>> I am sharing a google drive folder at below link
>>>>>>>>
>>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>>
>>>>>>>>
>>>>>>>> This folder has 3 files:
>>>>>>>>
>>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>>> configurations
>>>>>>>> we did while configuring Yard, yardsite, entity linking engine and
>>>>>>>> weighted
>>>>>>>> chain
>>>>>>>> 2) our custom ontology
>>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>>
>>>>>>>> May i request you all to please look at these files and let us know
>>>>>>>> if we
>>>>>>>> are missing something in configurations.
>>>>>>>>
>>>>>>>> We have referred to below web links in order to configure stanbol
>>>>>>>> for
>>>>>>>> using our custom ontology for entity extraction and linking
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>>
>>>>>>>> Thanks in advance for your valuable help.
>>>>>>>>
>>>>>>>> Best regards
>>>>>>>> tarandeep
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>>> details
>>>>>>>>> at:
>>>>>>>>>
>>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>>
>>>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>>>> netiquette in
>>>>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>>>>> Friday
>>>>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>>>>> vacations.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Sergio
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>> I am using stanbol to extract entities by plugging custom
>>>>>>>>>> vocabulary
>>>>>>>>>> as
>>>>>>>>>> per
>>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Following are the steps followed -
>>>>>>>>>>
>>>>>>>>>> Configured Clerezza Yard.
>>>>>>>>>> Configured Managed Yard site.
>>>>>>>>>> Updated the site by plugging ontology(containing custom
>>>>>>>>>> entities) .
>>>>>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*)
>>>>>>>>>> with
>>>>>>>>>> managed
>>>>>>>>>> site.
>>>>>>>>>> Configured a customChain which uses following engine
>>>>>>>>>>
>>>>>>>>>> - *langdetect*
>>>>>>>>>> - *opennlp-sentence*
>>>>>>>>>> - *opennlp-token*
>>>>>>>>>> - *opennlp-pos*
>>>>>>>>>> - *opennlp-chunker*
>>>>>>>>>> - *customLinkingEngine*
>>>>>>>>>>
>>>>>>>>>> Now, i am able to extract entities like Adidas using
>>>>>>>>>> *customChain*.
>>>>>>>>>>
>>>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>>>> space in
>>>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>>>
>>>>>>>>>> Chain like *dbpedia-disambiguation *(which comes bundled with
>>>>>>>>>> stanbol
>>>>>>>>>> instance) is rightly extracting entities like "Tommy Hilfiger".
>>>>>>>>>>
>>>>>>>>>> I had tried configuring *customLinkingEngine* same as *
>>>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>>>> but
>>>>>>>>>> it
>>>>>>>>>> didn't work to extract above entity.
>>>>>>>>>>
>>>>>>>>>> I have invested more than a week now and running out of options
>>>>>>>>>> now
>>>>>>>>>>
>>>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> Sergio Fernández
>>>>>>>>> Salzburg Research
>>>>>>>>> +43 662 2288 318
>>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>> --
>>>>>>>
>>>>>> Sergio Fernández
>>>>>> Salzburg Research
>>>>>> +43 662 2288 318
>>>>>> Jakob-Haringer Strasse 5/II
>>>>>> A-5020 Salzburg (Austria)
>>>>>> http://www.salzburgresearch.at
>>>>>>
>>>>>>
>>>>>> --
>>> Sergio Fernández
>>> Salzburg Research
>>> +43 662 2288 318
>>> Jakob-Haringer Strasse 5/II
>>> A-5020 Salzburg (Austria)
>>> http://www.salzburgresearch.at
>>>
>>>
>
> --
>
> ------------------------------
> This message should be regarded as confidential. If you have received this
> email in error please notify the sender and destroy it immediately.
> Statements of intent shall only become binding when confirmed in hard copy
> by an authorised signatory.
>
> Zaizi Ltd is registered in England and Wales with the registration number
> 6440931. The Registered Office is Brook House, 229 Shepherds Bush Road,
> London W6 7AN.
--
"This e-mail and any attachments transmitted with it are for the sole use
of the intended recipient(s) and may contain confidential , proprietary or
privileged information. If you are not the intended recipient, please
contact the sender by reply e-mail and destroy all copies of the original
message. Any unauthorized review, use, disclosure, dissemination,
forwarding, printing or copying of this e-mail or any action taken in
reliance on this e-mail is strictly prohibited and may be unlawful."
Re: Working with custom vocabulary
Posted by Rafa Haro <rh...@zaizi.com>.
Hi Tarandeep,
As Sergio already pointed out, you can check several different Entity
Linking engine configurations at the IKS development server:
http://dev.iks-project.eu:8081/enhancer/chain. You can try to use the
same configuration as some of the chains registered in this Stanbol
instance. For that, just go through the Felix Console
(http://dev.iks-project.eu:8081/system/console/configMgr/) and take a
look at the different EntityHubLinkingEngine configurations. You can
also try to use a Keyword Linking engine instead of an EntityHub Linking
engine.
Anyway, all the sites configured in this server are SolrYard based, so
perhaps there is a bug in the ClerezzaYard entity search process for
multi-word entity labels. We might need debug log messages in order to
find out the problem.
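[Editor's note: to illustrate the multi-word case that keeps failing in this thread, here is a toy sketch. This is not Stanbol's implementation, and the vocabulary entries below are made up; it only shows the greedy longest-match-first lookup an entity-linking engine needs so that a two-token label like "Tommy Hilfiger" can match at all.]

```python
# Toy illustration (NOT Stanbol's code) of greedy longest-match entity
# lookup against a label -> URI index. Trying the longest span first is
# what lets multi-word labels such as "Tommy Hilfiger" match.
def link_entities(tokens, vocabulary, max_label_tokens=4):
    matches = []
    i = 0
    while i < len(tokens):
        found = None
        # Try the longest candidate span starting at token i first.
        for span in range(min(max_label_tokens, len(tokens) - i), 0, -1):
            label = " ".join(tokens[i:i + span])
            if label in vocabulary:
                found = (label, vocabulary[label])
                i += span
                break
        if found:
            matches.append(found)
        else:
            i += 1
    return matches

# Made-up vocabulary standing in for a managed-site index.
vocab = {"Adidas": "ex:Adidas", "Tommy Hilfiger": "ex:TommyHilfiger"}
text = "He wore Adidas shoes and a Tommy Hilfiger belt".split()
print(link_entities(text, vocab))
# → [('Adidas', 'ex:Adidas'), ('Tommy Hilfiger', 'ex:TommyHilfiger')]
```

If the yard's query only ever searches single tokens (or stores the label tokenized), the second entry can never be found even though it is in the site, which would match the symptom reported here.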
Regards
On 15/07/13 18:28, Sawhney, Tarandeep Singh wrote:
> Hi Sergio
>
> This is exactly what I did, and I mentioned it in my last email:
>
> *"What i understand is to enable option "Link ProperNouns only" in
> entityhub linking and also to use "opennlp-pos" engine in my weighted chain"
> *
>
> I have already checked this option in my own entity hub linking engine
>
> By the way, did you get a chance to look at the files I shared in the
> Google Drive folder? Did you notice any problems there?
>
> I think using a custom ontology with Stanbol should be a very common use
> case, and if there are issues getting it working, either I am doing
> something terribly wrong or there are other reasons that I don't know.
>
> But anyway, I am persisting in solving this issue, and any help on this
> from this dev community will be much appreciated.
>
> best regards
> tarandeep
>
>
>
> On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
> sergio.fernandez@salzburgresearch.at> wrote:
>
> http://{stanbol}/system/console/configMgr sorry
>>
>>
>> On 15/07/13 18:15, Sergio Fernández wrote:
>>
>>> Have you check the
>>>
>>> 1) go to http://{stanbol}/config/system/console/configMgr
>>>
>>> 2) find your EntityHub Linking engine
>>>
>>> 3) and then "Link ProperNouns only"
>>>
>>> The documentation in that configuration is quite useful I think:
>>>
>>> "If activated only ProperNouns will be matched against the Vocabulary.
>>> If deactivated any Noun will be matched. NOTE that this parameter
>>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>>> Otherwise mapping will not work as expected.
>>> (enhancer.engines.linking.properNounsState)"
>>>
>>> Hope this help. You have to take into account such kind of issues are
>>> not easy to solve by email.
>>>
>>> Cheers,
>>>
>>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Thanks Sergio for your response
>>>>
>>>> What i understand is to enable option *"Link ProperNouns only"* in
>>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>>> chain
>>>>
>>>> I did these changes but unable to extract "University of Salzberg"
>>>>
>>>> Please find below the output RDF/XML from enhancer
>>>>
>>>> Request you to please let me know if i did not understand your inputs
>>>> correctly
>>>>
>>>> One more thing, in our ontology (yet to be built) we will have entities
>>>> which are other than people, places and organisations. For example,
>>>> belts,
>>>> bags etc
>>>>
>>>> best regards
>>>> tarandeep
>>>>
>>> <rdf:RDF
>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>> xmlns:j.0="http://purl.org/dc/terms/"
>>> xmlns:j.1="http://fise.iks-project.eu/ontology/">
>>> <rdf:Description
>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>> <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>> <j.1:extracted-from
>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>
>>> <j.1:confidence
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>
>>> <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>> <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>> <j.0:language>en</j.0:language>
>>> <j.0:created
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>
>>> <j.0:creator
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>
>>> </rdf:Description>
>>> </rdf:RDF>
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>
>>>> As I said: have you check the proper noun detection and POS tagging in
>>>>> your chain?
>>>>>
>>>>> For instance, enhancing the text "I studied at the University of
>>>>> Salzburg,
>>>>> which is based in Austria" works at the demo server:
>>>>>
>>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>>
>>>>> Here the details:
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>>
>>>>> Cheers,
>>>>>
>>>>>
>>>>>
>>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>>
>>>>> Just to add to my previous email
>>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>>> University
>>>>>>
>>>>>>
>>>>>>
>>>>>> <!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
>>>>>> <owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>>> <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>>> <rdfs:label>MyUniversity</rdfs:label>
>>>>>> </owl:NamedIndividual>
>>>>>>
>>>>>>
>>>>>> So with all configurations i have mentioned in the word document (in
>>>>>> google
>>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>>> enhancement
>>>>>> chain is able to extract "MyUniversity" and link it with
>>>>>> "University" type
>>>>>>
>>>>>> But same set of configurations doesn't work with individual
>>>>>> "University of
>>>>>> Salzburg"
>>>>>>
>>>>>> If anyone of you please provide help on what are we missing to be
>>>>>> able to
>>>>>> extract custom entities which has space in between, will be a great
>>>>>> help
>>>>>> to
>>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>>
>>>>>> with best regards,
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>>> tsawhney@innodata.com> wrote:
>>>>>>
>>>>>> Thanks Sergio and Dileepa for your responses
>>>>>>
>>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>>> keep
>>>>>>> just one class and one instance value "University of Salzburg" in our
>>>>>>> custom ontology and try to extract this entity and also link it but we
>>>>>>> could not get this running. I am sure we are missing some
>>>>>>> configurations.
>>>>>>>
>>>>>>> I am sharing a google drive folder at below link
>>>>>>>
>>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>>
>>>>>>> This folder has 3 files:
>>>>>>>
>>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>>> configurations
>>>>>>> we did while configuring Yard, yardsite, entity linking engine and
>>>>>>> weighted
>>>>>>> chain
>>>>>>> 2) our custom ontology
>>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>>
>>>>>>> May i request you all to please look at these files and let us know
>>>>>>> if we
>>>>>>> are missing something in configurations.
>>>>>>>
>>>>>>> We have referred to below web links in order to configure stanbol for
>>>>>>> using our custom ontology for entity extraction and linking
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>>
>>>>>>> Thanks in advance for your valuable help.
>>>>>>>
>>>>>>> Best regards
>>>>>>> tarandeep
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>>> details
>>>>>>>> at:
>>>>>>>>
>>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>>
>>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>>> netiquette in
>>>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>>>> Friday
>>>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>>>> vacations.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Sergio
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>>> I am using stanbol to extract entities by plugging custom
>>>>>>>>> vocabulary
>>>>>>>>> as
>>>>>>>>> per
>>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>>
>>>>>>>>> Following are the steps followed -
>>>>>>>>>
>>>>>>>>> Configured Clerezza Yard.
>>>>>>>>> Configured Managed Yard site.
>>>>>>>>> Updated the site by plugging ontology(containing custom
>>>>>>>>> entities) .
>>>>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>>>> managed
>>>>>>>>> site.
>>>>>>>>> Configured a customChain which uses following engine
>>>>>>>>>
>>>>>>>>> - *langdetect*
>>>>>>>>> - *opennlp-sentence*
>>>>>>>>> - *opennlp-token*
>>>>>>>>> - *opennlp-pos*
>>>>>>>>> - *opennlp-chunker*
>>>>>>>>> - *customLinkingEngine*
>>>>>>>>>
>>>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>>>
>>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>>> space in
>>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>>
>>>>>>>>> Chain like *dbpedia-disambiguation *(which comes bundled with
>>>>>>>>> stanbol
>>>>>>>>> instance) is rightly extracting entities like "Tommy Hilfiger".
>>>>>>>>>
>>>>>>>>> I had tried configuring *customLinkingEngine* same as *
>>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>>> but
>>>>>>>>> it
>>>>>>>>> didn't work to extract above entity.
>>>>>>>>>
>>>>>>>>> I have invested more than a week now and running out of options now
>>>>>>>>>
>>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>>
>>>>>>>> Sergio Fernández
>>>>>>>> Salzburg Research
>>>>>>>> +43 662 2288 318
>>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>>> A-5020 Salzburg (Austria)
>>>>>>>> http://www.salzburgresearch.at
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>> --
>>>>> Sergio Fernández
>>>>> Salzburg Research
>>>>> +43 662 2288 318
>>>>> Jakob-Haringer Strasse 5/II
>>>>> A-5020 Salzburg (Austria)
>>>>> http://www.salzburgresearch.at
>>>>>
>>>>>
>> --
>> Sergio Fernández
>> Salzburg Research
>> +43 662 2288 318
>> Jakob-Haringer Strasse 5/II
>> A-5020 Salzburg (Austria)
>> http://www.salzburgresearch.at
>>
--
Re: Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Hi Sergio
This is exactly what I did, and I mentioned it in my last email:
*"What i understand is to enable option "Link ProperNouns only" in
entityhub linking and also to use "opennlp-pos" engine in my weighted chain"
*
I have already checked this option in my own entity hub linking engine
By the way, did you get a chance to look at the files I shared in the
Google Drive folder? Did you notice any problems there?
I think using a custom ontology with Stanbol should be a very common use
case, and if there are issues getting it working, either I am doing
something terribly wrong or there are other reasons that I don't know.
But anyway, I am persisting in solving this issue, and any help on this
from this dev community will be much appreciated.
best regards
tarandeep
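For readers following this thread: a configured chain can be exercised directly against Stanbol's RESTful enhancer endpoint. The sketch below only illustrates the shape of the call; the host, chain name, and sample text are assumptions, not taken from the poster's actual setup.

```shell
# Sketch: POST plain text to an enhancement chain and get RDF back.
# STANBOL and CHAIN are placeholders -- adjust for your own instance.
STANBOL="${STANBOL:-http://localhost:8080}"
CHAIN="${CHAIN:-customChain}"
TEXT="Tommy Hilfiger and Adidas are well known brands."

# Build the request as an array; against a live instance this returns the
# fise:TextAnnotation / fise:EntityAnnotation enhancements as RDF/XML.
REQUEST=(curl -s -X POST
  -H "Content-Type: text/plain"
  -H "Accept: application/rdf+xml"
  --data "$TEXT"
  "$STANBOL/enhancer/chain/$CHAIN")

# Print the command so the sketch also works without a running server;
# drop this line and run "${REQUEST[@]}" directly against a live instance.
printf '%s ' "${REQUEST[@]}"; echo
```

If the multi-word label is present in the vocabulary and the POS engines run before the linking engine, a label such as "Tommy Hilfiger" should come back as a single fise:selected-text span.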
On Mon, Jul 15, 2013 at 9:49 PM, Sergio Fernández <
sergio.fernandez@salzburgresearch.at> wrote:
> http://{stanbol}/system/console/configMgr sorry
>
>
> On 15/07/13 18:15, Sergio Fernández wrote:
>
>> Have you checked the
>>
>> 1) go to http://{stanbol}/config/system/console/configMgr
>>
>> 2) find your EntityHub Linking engine
>>
>> 3) and then "Link ProperNouns only"
>>
>> The documentation in that configuration is quite useful I think:
>>
>> "If activated only ProperNouns will be matched against the Vocabulary.
>> If deactivated any Noun will be matched. NOTE that this parameter
>> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
>> Otherwise mapping will not work as expected.
>> (enhancer.engines.linking.properNounsState)"
>>
>> Hope this helps. You have to take into account that such issues are
>> not easy to solve by email.
>>
>> Cheers,
>>
>> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>>
>>> Thanks Sergio for your response
>>>
>>> What i understand is to enable option *"Link ProperNouns only"* in
>>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>>> chain
>>>
>>> I did these changes but was unable to extract "University of Salzburg"
>>>
>>> Please find below the output RDF/XML from enhancer
>>>
>>> Request you to please let me know if i did not understand your inputs
>>> correctly
>>>
>>> One more thing, in our ontology (yet to be built) we will have entities
>>> which are other than people, places and organisations. For example,
>>> belts,
>>> bags etc
>>>
>>> best regards
>>> tarandeep
>>>
>>> <rdf:RDF
>>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>>> xmlns:j.0="http://purl.org/dc/terms/"
>>> xmlns:j.1="http://fise.iks-project.eu/ontology/" >
>>> <rdf:Description
>>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>>> <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>>> <j.1:extracted-from
>>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>>
>>> <j.1:confidence
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>>
>>> <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>>> <rdf:type
>>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>>> <j.0:language>en</j.0:language>
>>> <j.0:created
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>>
>>> <j.0:creator
>>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>>
>>> </rdf:Description>
>>> </rdf:RDF>
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>
>>> As I said: have you check the proper noun detection and POS tagging in
>>>> your chain?
>>>>
>>>> For instance, enhancing the text "I studied at the University of
>>>> Salzburg,
>>>> which is based in Austria" works at the demo server:
>>>>
>>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>>
>>>> Here the details:
>>>>
>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>>
>>>>
>>>> Cheers,
>>>>
>>>>
>>>>
>>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>>
>>>> Just to add to my previous email
>>>>>
>>>>> If i add another individual in my ontology "MyUniversity" under class
>>>>> University
>>>>>
>>>>>
>>>>>
>>>>> <!--
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>>
>>>>> <owl:NamedIndividual rdf:about="
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
>>>>> <rdf:type rdf:resource="
>>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
>>>>> <rdfs:label>MyUniversity</rdfs:label>
>>>>> </owl:NamedIndividual>
>>>>>
>>>>>
>>>>> So with all configurations i have mentioned in the word document (in
>>>>> google
>>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>>> enhancement
>>>>> chain is able to extract "MyUniversity" and link it with
>>>>> "University" type
>>>>>
>>>>> But same set of configurations doesn't work with individual
>>>>> "University of
>>>>> Salzburg"
>>>>>
>>>>> If anyone of you please provide help on what are we missing to be
>>>>> able to
>>>>> extract custom entities which has space in between, will be a great
>>>>> help
>>>>> to
>>>>> proceed further on our journey with using and contributing to stanbol
>>>>>
>>>>> with best regards,
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>>> tsawhney@innodata.com> wrote:
>>>>>
>>>>> Thanks Sergio and Dileepa for your responses
>>>>>
>>>>>>
>>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>>> keep
>>>>>> just one class and one instance value "University of Salzburg" in our
>>>>>> custom ontology and try to extract this entity and also link it but we
>>>>>> could not get this running. I am sure we are missing some
>>>>>> configurations.
>>>>>>
>>>>>> I am sharing a google drive folder at below link
>>>>>>
>>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>>
>>>>>>
>>>>>> This folder has 3 files:
>>>>>>
>>>>>> 1) A word document which shows felix snapshots of what all
>>>>>> configurations
>>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>>> weighted
>>>>>> chain
>>>>>> 2) our custom ontology
>>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>>
>>>>>> May i request you all to please look at these files and let us know
>>>>>> if we
>>>>>> are missing something in configurations.
>>>>>>
>>>>>> We have referred to below web links in order to configure stanbol for
>>>>>> using our custom ontology for entity extraction and linking
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>>
>>>>>>
>>>>>> Thanks in advance for your valuable help.
>>>>>>
>>>>>> Best regards
>>>>>> tarandeep
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>>>
>>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>>> details
>>>>>>> at:
>>>>>>>
>>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>>> netiquette in
>>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>>> Friday
>>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>>> vacations.
>>>>>>>
>>>>>>> Best,
>>>>>>> Sergio
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>>>
>>>>>>>> I am using stanbol to extract entities by plugging custom vocabulary
>>>>>>>> vocabulary
>>>>>>>> as
>>>>>>>> per
>>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>> Following are the steps followed -
>>>>>>>>
>>>>>>>> Configured Clerezza Yard.
>>>>>>>> Configured Managed Yard site.
>>>>>>>> Updated the site by plugging ontology(containing custom
>>>>>>>> entities) .
>>>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>>> managed
>>>>>>>> site.
>>>>>>>> Configured a customChain which uses following engine
>>>>>>>>
>>>>>>>> - *langdetect*
>>>>>>>> - *opennlp-sentence*
>>>>>>>> - *opennlp-token*
>>>>>>>> - *opennlp-pos*
>>>>>>>> - *opennlp-chunker*
>>>>>>>> - *customLinkingEngine*
>>>>>>>>
>>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>>
>>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>>> space in
>>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>>
>>>>>>>> Chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>>> stanbol
>>>>>>>> instance) is rightly extracting entities like "Tommy Hilfiger".
>>>>>>>>
>>>>>>>> I had tried configuring *customLinkingEngine* same as *
>>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>>> but
>>>>>>>> it
>>>>>>>> didn't work to extract above entity.
>>>>>>>>
>>>>>>>> I have invested more than a week now and running out of options now
>>>>>>>>
>>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>> Sergio Fernández
>>>>>>> Salzburg Research
>>>>>>> +43 662 2288 318
>>>>>>> Jakob-Haringer Strasse 5/II
>>>>>>> A-5020 Salzburg (Austria)
>>>>>>> http://www.salzburgresearch.at
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>> --
>>>> Sergio Fernández
>>>> Salzburg Research
>>>> +43 662 2288 318
>>>> Jakob-Haringer Strasse 5/II
>>>> A-5020 Salzburg (Austria)
>>>> http://www.salzburgresearch.at
>>>>
>>>>
>>>
>>
> --
> Sergio Fernández
> Salzburg Research
> +43 662 2288 318
> Jakob-Haringer Strasse 5/II
> A-5020 Salzburg (Austria)
> http://www.salzburgresearch.at
>
Re: Working with custom vocabulary
Posted by Sergio Fernández <se...@salzburgresearch.at>.
http://{stanbol}/system/console/configMgr sorry
On 15/07/13 18:15, Sergio Fernández wrote:
> Have you checked the
>
> 1) go to http://{stanbol}/config/system/console/configMgr
>
> 2) find your EntityHub Linking engine
>
> 3) and then "Link ProperNouns only"
>
> The documentation in that configuration is quite useful I think:
>
> "If activated only ProperNouns will be matched against the Vocabulary.
> If deactivated any Noun will be matched. NOTE that this parameter
> requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
> Otherwise mapping will not work as expected.
> (enhancer.engines.linking.properNounsState)"
>
> Hope this helps. You have to take into account that such issues are
> not easy to solve by email.
>
> Cheers,
>
> On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
>> Thanks Sergio for your response
>>
>> What i understand is to enable option *"Link ProperNouns only"* in
>> entityhub linking and also to use "opennlp-pos" engine in my weighted
>> chain
>>
>> I did these changes but was unable to extract "University of Salzburg"
>>
>> Please find below the output RDF/XML from enhancer
>>
>> Request you to please let me know if i did not understand your inputs
>> correctly
>>
>> One more thing, in our ontology (yet to be built) we will have entities
>> which are other than people, places and organisations. For example,
>> belts,
>> bags etc
>>
>> best regards
>> tarandeep
>>
>> <rdf:RDF
>> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>> xmlns:j.0="http://purl.org/dc/terms/"
>> xmlns:j.1="http://fise.iks-project.eu/ontology/" >
>> <rdf:Description
>> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
>> <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
>> <j.1:extracted-from
>> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
>>
>> <j.1:confidence
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
>>
>> <rdf:type
>> rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
>> <rdf:type
>> rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
>> <j.0:language>en</j.0:language>
>> <j.0:created
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
>>
>> <j.0:creator
>> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
>>
>> </rdf:Description>
>> </rdf:RDF>
>>
>>
>>
>> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
>> sergio.fernandez@salzburgresearch.at> wrote:
>>
>>> As I said: have you check the proper noun detection and POS tagging in
>>> your chain?
>>>
>>> For instance, enhancing the text "I studied at the University of
>>> Salzburg,
>>> which is based in Austria" works at the demo server:
>>>
>>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>>
>>>
>>> Here the details:
>>>
>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>>
>>>
>>> Cheers,
>>>
>>>
>>>
>>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>>
>>>> Just to add to my previous email
>>>>
>>>> If i add another individual in my ontology "MyUniversity" under class
>>>> University
>>>>
>>>>
>>>>
>>>> <!--
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>
>>>>>
>>>>
>>>> <owl:NamedIndividual rdf:about="
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>>>
>>>> ">
>>>> <rdf:type rdf:resource="
>>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
>>>>
>>>> "/>
>>>> <rdfs:label>MyUniversity</rdfs:label>
>>>> </owl:NamedIndividual>
>>>>
>>>>
>>>> So with all configurations i have mentioned in the word document (in
>>>> google
>>>> drive folder), when i pass text with "MyUniversity" in it, my
>>>> enhancement
>>>> chain is able to extract "MyUniversity" and link it with
>>>> "University" type
>>>>
>>>> But same set of configurations doesn't work with individual
>>>> "University of
>>>> Salzburg"
>>>>
>>>> If anyone of you please provide help on what are we missing to be
>>>> able to
>>>> extract custom entities which has space in between, will be a great
>>>> help
>>>> to
>>>> proceed further on our journey with using and contributing to stanbol
>>>>
>>>> with best regards,
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>>> tsawhney@innodata.com> wrote:
>>>>
>>>> Thanks Sergio and Dileepa for your responses
>>>>>
>>>>> We haven't been able to resolve the issue. We therefore decided to
>>>>> keep
>>>>> just one class and one instance value "University of Salzburg" in our
>>>>> custom ontology and try to extract this entity and also link it but we
>>>>> could not get this running. I am sure we are missing some
>>>>> configurations.
>>>>>
>>>>> I am sharing a google drive folder at below link
>>>>>
>>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>>
>>>>>
>>>>> This folder has 3 files:
>>>>>
>>>>> 1) A word document which shows felix snapshots of what all
>>>>> configurations
>>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>>> weighted
>>>>> chain
>>>>> 2) our custom ontology
>>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>>
>>>>> May i request you all to please look at these files and let us know
>>>>> if we
>>>>> are missing something in configurations.
>>>>>
>>>>> We have referred to below web links in order to configure stanbol for
>>>>> using our custom ontology for entity extraction and linking
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>>
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>>
>>>>>
>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
>>>>>
>>>>>
>>>>> Thanks in advance for your valuable help.
>>>>>
>>>>> Best regards
>>>>> tarandeep
>>>>>
>>>>>
>>>>>
>>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>>> sergio.fernandez@salzburgresearch.at> wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> I'm not an expert on entity linking, but from my experience such
>>>>>> behaviour could be caused by the proper noun detection. Further
>>>>>> details
>>>>>> at:
>>>>>>
>>>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> In addition, I'd like to suggest you to take a look to the
>>>>>> netiquette in
>>>>>> mailing lists. This is an open source community; therefore messages
>>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>>> Friday
>>>>>> afternoon, when people could be already out for weekend, or even on
>>>>>> vacations.
>>>>>>
>>>>>> Best,
>>>>>> Sergio
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>>
>>>>>> Hi,
>>>>>>>
>>>>>>> I am using stanbol to extract entities by plugging custom vocabulary
>>>>>>> vocabulary
>>>>>>> as
>>>>>>> per
>>>>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Following are the steps followed -
>>>>>>>
>>>>>>> Configured Clerezza Yard.
>>>>>>> Configured Managed Yard site.
>>>>>>> Updated the site by plugging ontology(containing custom
>>>>>>> entities) .
>>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>>> managed
>>>>>>> site.
>>>>>>> Configured a customChain which uses following engine
>>>>>>>
>>>>>>> - *langdetect*
>>>>>>> - *opennlp-sentence*
>>>>>>> - *opennlp-token*
>>>>>>> - *opennlp-pos*
>>>>>>> - *opennlp-chunker*
>>>>>>> - *customLinkingEngine*
>>>>>>>
>>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>>
>>>>>>> However i am facing an issue in extracting entities which has
>>>>>>> space in
>>>>>>> between. For example "Tommy Hilfiger".
>>>>>>>
>>>>>>> Chain like *dbpedia-disambiguation* (which comes bundled with
>>>>>>> stanbol
>>>>>>> instance) is rightly extracting entities like "Tommy Hilfiger".
>>>>>>>
>>>>>>> I had tried configuring *customLinkingEngine* same as *
>>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* )
>>>>>>> but
>>>>>>> it
>>>>>>> didn't work to extract above entity.
>>>>>>>
>>>>>>> I have invested more than a week now and running out of options now
>>>>>>>
>>>>>>> i request you to please provide help in resolving this issue
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>> Sergio Fernández
>>>>>> Salzburg Research
>>>>>> +43 662 2288 318
>>>>>> Jakob-Haringer Strasse 5/II
>>>>>> A-5020 Salzburg (Austria)
>>>>>> http://www.salzburgresearch.at
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>> --
>>> Sergio Fernández
>>> Salzburg Research
>>> +43 662 2288 318
>>> Jakob-Haringer Strasse 5/II
>>> A-5020 Salzburg (Austria)
>>> http://www.salzburgresearch.at
>>>
>>
>
--
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at
Re: Working with custom vocabulary
Posted by Sergio Fernández <se...@salzburgresearch.at>.
Have you checked the following:
1) go to http://{stanbol}/config/system/console/configMgr
2) find your EntityHub Linking engine
3) and then "Link ProperNouns only"
The documentation in that configuration is quite useful I think:
"If activated only ProperNouns will be matched against the Vocabulary.
If deactivated any Noun will be matched. NOTE that this parameter
requires a tag of the POS TagSet to be mapped against 'olia:ProperNoun'.
Otherwise mapping will not work as expected.
(enhancer.engines.linking.properNounsState)"
Hope this helps. You have to take into account that such issues are
not easy to solve by email.
Cheers,
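The option quoted above maps to the boolean engine property enhancer.engines.linking.properNounsState. A hedged sketch of invoking a chain that relies on proper-noun linking (the endpoint and sample text are illustrative; the public IKS demo host may no longer be reachable):

```shell
# Sketch: call an enhancer chain configured with "Link ProperNouns only"
# (enhancer.engines.linking.properNounsState). Endpoint is illustrative.
ENDPOINT="http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun"
TEXT="I studied at the University of Salzburg, which is based in Austria"

# Compose the request; print it instead of executing it so the sketch does
# not depend on network access. Run the printed command to test for real.
CMD="curl -X POST -H 'Content-Type: text/plain' --data '$TEXT' '$ENDPOINT'"
echo "$CMD"
```

With proper-noun linking active, only tokens tagged as proper nouns by the POS engine are matched against the vocabulary, which is why the chain must include a POS engine whose tag set maps to olia:ProperNoun.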
On 15/07/13 16:31, Sawhney, Tarandeep Singh wrote:
> Thanks Sergio for your response
>
> What i understand is to enable option *"Link ProperNouns only"* in
> entityhub linking and also to use "opennlp-pos" engine in my weighted chain
>
> I did these changes but was unable to extract "University of Salzburg"
>
> Please find below the output RDF/XML from enhancer
>
> Request you to please let me know if i did not understand your inputs
> correctly
>
> One more thing, in our ontology (yet to be built) we will have entities
> which are other than people, places and organisations. For example, belts,
> bags etc
>
> best regards
> tarandeep
>
> <rdf:RDF
> xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
> xmlns:j.0="http://purl.org/dc/terms/"
> xmlns:j.1="http://fise.iks-project.eu/ontology/" >
> <rdf:Description
> rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
> <j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
> <j.1:extracted-from
> rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
> <j.1:confidence
> rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
> <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
> <rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
> <j.0:language>en</j.0:language>
> <j.0:created
> rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
> <j.0:creator
> rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
> </rdf:Description>
> </rdf:RDF>
>
>
>
> On Mon, Jul 15, 2013 at 7:32 PM, Sergio Fernández <
> sergio.fernandez@salzburgresearch.at> wrote:
>
>> As I said: have you check the proper noun detection and POS tagging in
>> your chain?
>>
>> For instance, enhancing the text "I studied at the University of Salzburg,
>> which is based in Austria" works at the demo server:
>>
>> http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
>>
>> Here the details:
>>
>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
>>
>> Cheers,
>>
>>
>>
>> On 15/07/13 15:27, Sawhney, Tarandeep Singh wrote:
>>
>>> Just to add to my previous email
>>>
>>> If i add another individual in my ontology "MyUniversity" under class
>>> University
>>>
>>>
>>>
>>> <!--
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity-->
>>>>
>>>
>>> <owl:NamedIndividual rdf:about="
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity
>>> ">
>>> <rdf:type rdf:resource="
>>> http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University
>>> "/>
>>> <rdfs:label>MyUniversity</rdfs:label>
>>> </owl:NamedIndividual>
>>>
>>>
>>> So with all configurations i have mentioned in the word document (in
>>> google
>>> drive folder), when i pass text with "MyUniversity" in it, my enhancement
>>> chain is able to extract "MyUniversity" and link it with "University" type
>>>
>>> But same set of configurations doesn't work with individual "University of
>>> Salzburg"
>>>
>>> If anyone of you please provide help on what are we missing to be able to
>>> extract custom entities which has space in between, will be a great help
>>> to
>>> proceed further on our journey with using and contributing to stanbol
>>>
>>> with best regards,
>>> tarandeep
>>>
>>>
>>>
>>> On Mon, Jul 15, 2013 at 5:57 PM, Sawhney, Tarandeep Singh <
>>> tsawhney@innodata.com> wrote:
>>>
>>> Thanks Sergio and Dileepa for your responses
>>>>
>>>> We haven't been able to resolve the issue. We therefore decided to keep
>>>> just one class and one instance value "University of Salzburg" in our
>>>> custom ontology and try to extract this entity and also link it but we
>>>> could not get this running. I am sure we are missing some configurations.
>>>>
>>>> I am sharing a google drive folder at below link
>>>>
>>>> https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
>>>>
>>>> This folder has 3 files:
>>>>
>>>> 1) A word document which shows felix snapshots of what all configurations
>>>> we did while configuring Yard, yardsite, entiy linking engine and
>>>> weighted
>>>> chain
>>>> 2) our custom ontology
>>>> 3) the result of SPARQL against our graphuri using SPARQL endpoint
>>>>
>>>> May i request you all to please look at these files and let us know if we
>>>> are missing something in configurations.
>>>>
>>>> We have referred to below web links in order to configure stanbol for
>>>> using our custom ontology for entity extraction and linking
>>>>
>>>> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>>>> http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
>>>>
>>>> http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
>>>>
>>>> http://stanbol.apache.org/**docs/trunk/components/**
>>>> enhancer/chains/weightedchain.**html<http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html>
>>>>
>>>> Thanks in advance for your valuable help.
>>>>
>>>> Best regards
>>>> tarandeep
>>>>
>>>>
>>>>
>>>> On Sat, Jul 13, 2013 at 5:57 PM, Sergio Fernández <
>>>> sergio.fernandez@**salzburgresearch.at<se...@salzburgresearch.at>>
>>>> wrote:
>>>>
>>>> Hi,
>>>>>
>>>>> I'm not an expert on entity linking, but from my experience such
>>>>> behaviour could be caused by the proper noun detection. Further details
>>>>> at:
>>>>>
>>>>> http://stanbol.apache.org/****docs/trunk/components/**<http://stanbol.apache.org/**docs/trunk/components/**>
>>>>> enhancer/engines/**entitylinking<http://stanbol.**
>>>>> apache.org/docs/trunk/**components/enhancer/engines/**entitylinking<http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking>
>>>>>>
>>>>>
>>>>>
>>>>> In addition, I'd like to suggest you to take a look to the netiquette in
>>>>> mailing lists. This is an open source community; therefore messages
>>>>> starting with "URGENT" are not very polite. Specially sending it on
>>>>> Friday
>>>>> afternoon, when people could be already out for weekend, or even on
>>>>> vacations.
>>>>>
>>>>> Best,
>>>>> Sergio
>>>>>
>>>>>
>>>>>
>>>>> On 12/07/13 15:54, Sethi, Keval Krishna wrote:
>>>>>
>>>>> Hi,
>>>>>>
>>>>>> I am using stanbol to extract entitiies by plugging custom vocabulary
>>>>>> as
>>>>>> per http://stanbol.apache.org/****docs/trunk/customvocabulary.****html<http://stanbol.apache.org/**docs/trunk/customvocabulary.**html>
>>>>>> <http://stanbol.apache.**org/docs/trunk/**customvocabulary.html<http://stanbol.apache.org/docs/trunk/customvocabulary.html>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> Following are the steps followed -
>>>>>>
>>>>>> Configured Clerezza Yard.
>>>>>> Configured Managed Yard site.
>>>>>> Updated the site by plugging ontology(containing custom entities) .
>>>>>> Configured Entity hub linking Engine(*customLinkingEngine*) with
>>>>>> managed
>>>>>> site.
>>>>>> Configured a customChain which uses following engine
>>>>>>
>>>>>> - *langdetect*
>>>>>> - *opennlp-sentence*
>>>>>> - *opennlp-token*
>>>>>> - *opennlp-pos*
>>>>>> - *opennlp-chunker*
>>>>>> - *customLinkingEngine*
>>>>>>
>>>>>> Now, i am able to extract entities like Adidas using *customChain*.
>>>>>>
>>>>>> However i am facing an issue in extracting entities which has space in
>>>>>> between. For example "Tommy Hilfiger".
>>>>>>
>>>>>> Chain like *dbpedia-disambiguation *(which comes bundeled with stanbol
>>>>>> instance) is rightly extracting entities like "Tommy Hilfiger".
>>>>>>
>>>>>> I had tried configuring *customLinkingEngine* same as *
>>>>>> dbpedia-disamb-linking *(configured in *dbpedia-disambiguation* ) but
>>>>>> it
>>>>>> didn't work to extract above entity.
>>>>>>
>>>>>> I have invested more than a week now and running out of options now
>>>>>>
>>>>>> i request you to please provide help in resolving this issue
>>>>>>
>>>>>>
>>>>>> --
>>>>> Sergio Fernández
>>>>> Salzburg Research
>>>>> +43 662 2288 318
>>>>> Jakob-Haringer Strasse 5/II
>>>>> A-5020 Salzburg (Austria)
>>>>> http://www.salzburgresearch.at
>>>>>
>>>>>
>>>>
>>>>
>>>
>> --
>> Sergio Fernández
>> Salzburg Research
>> +43 662 2288 318
>> Jakob-Haringer Strasse 5/II
>> A-5020 Salzburg (Austria)
>> http://www.salzburgresearch.at
>>
>
--
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at
Re: [URGENT] Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Sergio for your response.
What I understand is that I should enable the *"Link ProperNouns only"*
option in the entityhub linking engine and also use the "opennlp-pos"
engine in my weighted chain.
I made these changes but am still unable to extract "University of Salzburg".
Please find below the output RDF/XML from the enhancer.
Please let me know if I misunderstood your inputs.
One more thing: in our ontology (yet to be built) we will have entities
other than people, places and organisations, for example belts, bags, etc.
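Since the custom vocabulary will also contain common-noun entities such as
belts and bags, proper-noun-only linking would skip them by design. Below is
a rough sketch of the relevant EntityhubLinkingEngine configuration; the
property name follows the Stanbol entitylinking documentation, but the value
shown is an assumption, so check the valid values for your Stanbol version:

```properties
# Sketch only: verify names and values against the entitylinking
# documentation of your Stanbol version before using.
# Controls whether only words POS-tagged as proper nouns are linked;
# disabling it (value assumed here) also allows common nouns like "belt".
enhancer.engines.linking.properNounsState=NONE
```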
best regards
tarandeep
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:j.0="http://purl.org/dc/terms/"
xmlns:j.1="http://fise.iks-project.eu/ontology/" >
<rdf:Description
rdf:about="urn:enhancement-197792bf-f1e8-47bf-626a-3cdfbdb863b3">
<j.0:type rdf:resource="http://purl.org/dc/terms/LinguisticSystem"/>
<j.1:extracted-from
rdf:resource="urn:content-item-sha1-3b2998e66582544035454850d2dd81755b747849"/>
<j.1:confidence
rdf:datatype="http://www.w3.org/2001/XMLSchema#double">0.9999964817340454</j.1:confidence>
<rdf:type rdf:resource="http://fise.iks-project.eu/ontology/Enhancement"/>
<rdf:type rdf:resource="http://fise.iks-project.eu/ontology/TextAnnotation"/>
<j.0:language>en</j.0:language>
<j.0:created
rdf:datatype="http://www.w3.org/2001/XMLSchema#dateTime">2013-07-15T14:25:43.829Z</j.0:created>
<j.0:creator
rdf:datatype="http://www.w3.org/2001/XMLSchema#string">org.apache.stanbol.enhancer.engines.langdetect.LanguageDetectionEnhancementEngine</j.0:creator>
</rdf:Description>
</rdf:RDF>
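The output above contains only the language annotation produced by
langdetect, i.e. the linking engine emitted no entity annotations at all.
The matching problem the thread keeps circling can be sketched outside
Stanbol (illustrative plain Python, not Stanbol's actual implementation):
linking a multi-token label such as "University of Salzburg" requires
comparing a window of consecutive tokens against the label's tokens, not
matching single tokens one at a time.

```python
def find_label_mentions(tokens, labels):
    """Return (start, end, label) spans where consecutive tokens equal a label.

    Illustrative only: real entity-linking engines additionally use POS
    tags, case normalisation, stop-word handling and fuzzy matching.
    """
    label_token_lists = [lbl.split() for lbl in labels]
    matches = []
    for i in range(len(tokens)):
        for lbl_tokens in label_token_lists:
            j = i + len(lbl_tokens)
            # Compare the token window tokens[i:j] against the whole label.
            if tokens[i:j] == lbl_tokens:
                matches.append((i, j, " ".join(lbl_tokens)))
    return matches

tokens = "I studied at the University of Salzburg in Austria".split()
labels = ["University of Salzburg", "Adidas"]
print(find_label_mentions(tokens, labels))  # [(4, 7, 'University of Salzburg')]
```

A matcher that only compares individual tokens would miss the label
entirely, which mirrors the behaviour reported in this thread: single-word
entities like "Adidas" link, multi-word ones do not.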
--
"This e-mail and any attachments transmitted with it are for the sole use
of the intended recipient(s) and may contain confidential , proprietary or
privileged information. If you are not the intended recipient, please
contact the sender by reply e-mail and destroy all copies of the original
message. Any unauthorized review, use, disclosure, dissemination,
forwarding, printing or copying of this e-mail or any action taken in
reliance on this e-mail is strictly prohibited and may be unlawful."
Re: [URGENT] Working with custom vocabulary
Posted by Sergio Fernández <se...@salzburgresearch.at>.
As I said: have you checked the proper noun detection and POS tagging in
your chain?
For instance, enhancing the text "I studied at the University of
Salzburg, which is based in Austria" works at the demo server:
http://dev.iks-project.eu:8081/enhancer/chain/dbpedia-proper-noun
Here the details:
http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking#proper-noun-linking-wzxhzdk14enhancerengineslinkingpropernounsstatewzxhzdk15
Cheers,
--
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at
Re: [URGENT] Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Just to add to my previous email:
If I add another individual "MyUniversity" under the class "University" in
my ontology:
<!-- http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity -->
<owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#MyUniversity">
    <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
    <rdfs:label>MyUniversity</rdfs:label>
</owl:NamedIndividual>
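For comparison, a multi-word individual such as "University of Salzburg"
would look like this (the URI local name is my assumption, following the
naming pattern above):

```xml
<owl:NamedIndividual rdf:about="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University_of_Salzburg">
    <rdf:type rdf:resource="http://www.semanticweb.org/vi5/ontologies/2013/6/untitled-ontology-13#University"/>
    <rdfs:label>University of Salzburg</rdfs:label>
</owl:NamedIndividual>
```

The only structural difference from the working individual is the
whitespace inside rdfs:label, which is why the token-level matching
behaviour of the linking engine matters here.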
So with all the configurations I mentioned in the Word document (in the
Google Drive folder), when I pass text containing "MyUniversity", my
enhancement chain is able to extract "MyUniversity" and link it with the
"University" type.
But the same set of configurations does not work with the individual
"University of Salzburg".
If anyone can point out what we are missing to extract custom entities
that contain spaces, it would be a great help as we continue using and
contributing to Stanbol.
with best regards,
tarandeep
Re: [URGENT] Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
Thanks Sergio and Dileepa for your responses
We haven't been able to resolve the issue. We therefore decided to keep
just one class and one instance value "University of Salzburg" in our
custom ontology and to try to extract and link this entity, but we could
not get this working. I am sure we are missing some configuration.
I am sharing a Google Drive folder at the link below:
https://drive.google.com/folderview?id=0B-vX9idwHlRtRFFOR000ZnBBOWM&usp=sharing
This folder has 3 files:
1) a Word document with Felix console screenshots of the configurations we
applied for the Yard, the Yard site, the entity linking engine and the
weighted chain
2) our custom ontology
3) the result of a SPARQL query against our graph URI using the SPARQL
endpoint
Could you please look at these files and let us know if we are missing
something in the configuration?
We have referred to the web links below to configure Stanbol to use our
custom ontology for entity extraction and linking:
http://stanbol.apache.org/docs/trunk/customvocabulary.html
http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite
http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entityhublinking
http://stanbol.apache.org/docs/trunk/components/enhancer/chains/weightedchain.html
Thanks in advance for your valuable help.
Best regards
tarandeep
Re: [URGENT] Working with custom vocabulary
Posted by Sergio Fernández <se...@salzburgresearch.at>.
Hi,
I'm not an expert on entity linking, but from my experience such
behaviour could be caused by the proper noun detection. Further details at:
http://stanbol.apache.org/docs/trunk/components/enhancer/engines/entitylinking
In addition, I'd like to suggest you take a look at mailing list
netiquette. This is an open source community; therefore messages starting
with "URGENT" are not very polite, especially when sent on a Friday
afternoon, when people may already be out for the weekend or even on
vacation.
Best,
Sergio
--
Sergio Fernández
Salzburg Research
+43 662 2288 318
Jakob-Haringer Strasse 5/II
A-5020 Salzburg (Austria)
http://www.salzburgresearch.at
Re: [URGENT] Working with custom vocabulary
Posted by "Sethi, Keval Krishna" <ks...@innodata.com>.
Hi Dileepa,
Thanks for replying.
Yes, I do have "Tommy Hilfiger" as an entity in my configured managed site.
To confirm it, I queried it through the SPARQL endpoint. Following are the
query and its result:
Query
select * {<http://demo.com#Tommy_Hilfiger> ?p ?o}
Result
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
      </binding>
      <binding name="o">
        <uri>http://www.w3.org/2002/07/owl#NamedIndividual</uri>
      </binding>
    </result>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/1999/02/22-rdf-syntax-ns#type</uri>
      </binding>
      <binding name="o">
        <uri>http://demo.com#Designer_Brands</uri>
      </binding>
    </result>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#label</uri>
      </binding>
      <binding name="o">
        <literal>Tommy Hilfiger</literal>
      </binding>
    </result>
  </results>
</sparql>
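Results like the one above can also be checked programmatically. A small
sketch (assuming only the standard SPARQL results XML namespace) that pulls
the rdfs:label literal out of such a result document:

```python
import xml.etree.ElementTree as ET

# A trimmed copy of the result document shown above.
SPARQL_XML = """<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <results>
    <result>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#label</uri>
      </binding>
      <binding name="o">
        <literal>Tommy Hilfiger</literal>
      </binding>
    </result>
  </results>
</sparql>"""

NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

def labels(results_xml):
    """Collect literal ?o values from rows where ?p is rdfs:label."""
    root = ET.fromstring(results_xml)
    out = []
    for result in root.findall(".//sr:result", NS):
        bindings = {b.get("name"): b for b in result.findall("sr:binding", NS)}
        p = bindings["p"].find("sr:uri", NS)
        o = bindings["o"].find("sr:literal", NS)
        if p is not None and p.text.endswith("#label") and o is not None:
            out.append(o.text)
    return out

print(labels(SPARQL_XML))  # ['Tommy Hilfiger']
```

This confirms only that the label is stored in the site; it says nothing
about whether the linking engine's tokenizer can match it in running text.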
I had not done any custom configuration while creating the managed site; I
just followed the referenced documentation
(http://stanbol.apache.org/docs/trunk/components/entityhub/managedsite)
and configured a Clerezza Yard site.
Please suggest if I am missing something.
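For background on why single-token labels can link while multi-token ones fail, here is a much-simplified illustration (plain Python, not Stanbol code): a linking engine looks up candidate entities starting from one token and then matches the candidates' full labels against the following tokens. If the lookup only returns entities whose whole label equals the token, a label such as "Tommy Hilfiger" is never returned for the token "Tommy", so it can never be linked:

```python
# Simplified illustration only -- not Stanbol's implementation. It shows why
# a token-exact lookup finds "Adidas" but can never find "Tommy Hilfiger".

VOCAB = {
    "Adidas": "http://demo.com#Adidas",
    "Tommy Hilfiger": "http://demo.com#Tommy_Hilfiger",
}

def lookup_exact(token, vocab):
    # Returns only entities whose whole label equals the token.
    return [label for label in vocab if label == token]

def lookup_prefix(token, vocab):
    # Also returns labels whose first word equals the token, which is
    # what matching multi-word labels requires.
    return [label for label in vocab
            if label == token or label.startswith(token + " ")]

def link(tokens, vocab, lookup):
    """Greedy left-to-right linking: prefer the longest matching label."""
    linked, i = [], 0
    while i < len(tokens):
        best = None
        for label in lookup(tokens[i], vocab):
            words = label.split()
            if tokens[i:i + len(words)] == words:
                if best is None or len(words) > len(best.split()):
                    best = label
        if best:
            linked.append(vocab[best])
            i += len(best.split())
        else:
            i += 1
    return linked

tokens = "He wears Tommy Hilfiger and Adidas".split()
print(link(tokens, VOCAB, lookup_exact))   # ['http://demo.com#Adidas']
print(link(tokens, VOCAB, lookup_prefix))  # ['http://demo.com#Tommy_Hilfiger', 'http://demo.com#Adidas']
```

So if a multi-word entity is stored but never linked, one reasonable place to look is how the configured site answers such partial-label queries, compared with the site used by *dbpedia-disamb-linking*.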
On Sat, Jul 13, 2013 at 11:57 AM, Dileepa Jayakody <
dileepajayakody@gmail.com> wrote:
> Hi Sethi,
>
> I'm also quite a newbie to Stanbol. I configured a new site (for a FOAF
> dataset), an Entityhub linking engine, and a custom chain that uses the
> engine, and I was able to extract entities from my Entityhub.
> These are the engines I chained in my weighted enhancement chain:
> langdetect, opennlp-sentence, opennlp-token, opennlp-pos, foaf-site-linking
> (my custom engine).
>
> If you are only using your new site (referenced site) in your engine, you
> will detect entities from that site only during entity linking. Are you
> sure the entity identified by "Tommy Hilfiger" is available in your site?
> Did you do any custom configuration when you created your referenced site?
>
> Thanks,
> Dileepa
>
>
> On Sat, Jul 13, 2013 at 7:28 AM, Sawhney, Tarandeep Singh <
> tsawhney@innodata.com> wrote:
>
> > A polite reminder to the Stanbol dev community:
> >
> > Can anyone please provide some pointers to resolve the issue below in
> > entity extraction using a custom ontology with Stanbol?
> >
> > Please let us know if more information is required to understand what
> > we are doing so that you can suggest some help.
> >
> > Best regards
> > tarandeep
> >
> >
> > On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
> > <ks...@innodata.com>wrote:
> >
> > > Hi,
> > >
> > > I am using Stanbol to extract entities by plugging in a custom
> > > vocabulary as per
> > > http://stanbol.apache.org/docs/trunk/customvocabulary.html
> > >
> > > These are the steps I followed:
> > >
> > > Configured a Clerezza Yard.
> > > Configured a managed Yard site.
> > > Updated the site by plugging in an ontology (containing custom
> > > entities).
> > > Configured an Entityhub linking engine (*customLinkingEngine*) with
> > > the managed site.
> > > Configured a customChain which uses the following engines:
> > >
> > > - *langdetect*
> > > - *opennlp-sentence*
> > > - *opennlp-token*
> > > - *opennlp-pos*
> > > - *opennlp-chunker*
> > > - *customLinkingEngine*
> > >
> > > Now I am able to extract entities like Adidas using *customChain*.
> > >
> > > However, I am facing an issue extracting entities that have a space
> > > in them, for example "Tommy Hilfiger".
> > >
> > > A chain like *dbpedia-disambiguation* (which comes bundled with the
> > > Stanbol instance) correctly extracts entities like "Tommy Hilfiger".
> > >
> > > I tried configuring *customLinkingEngine* the same way as
> > > *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*),
> > > but it did not extract the above entity.
> > >
> > > I have invested more than a week and am running out of options now.
> > >
> > > Please help in resolving this issue.
> > >
> > > --
> > > Regards,
> > > Keval Sethi
> > >
> > > --
> > >
> > > "This e-mail and any attachments transmitted with it are for the
> > > sole use of the intended recipient(s) and may contain confidential,
> > > proprietary or privileged information. If you are not the intended
> > > recipient, please contact the sender by reply e-mail and destroy all
> > > copies of the original message. Any unauthorized review, use,
> > > disclosure, dissemination, forwarding, printing or copying of this
> > > e-mail or any action taken in reliance on this e-mail is strictly
> > > prohibited and may be unlawful."
> > >
> >
>
--
Regards,
Keval Sethi
Re: [URGENT] Working with custom vocabulary
Posted by Dileepa Jayakody <di...@gmail.com>.
Hi Sethi,
I'm also quite a newbie to Stanbol. I configured a new site (for a FOAF
dataset), an Entityhub linking engine, and a custom chain that uses the
engine, and I was able to extract entities from my Entityhub.
These are the engines I chained in my weighted enhancement chain:
langdetect, opennlp-sentence, opennlp-token, opennlp-pos, foaf-site-linking
(my custom engine).
If you are only using your new site (referenced site) in your engine, you
will detect entities from that site only during entity linking. Are you
sure the entity identified by "Tommy Hilfiger" is available in your site?
Did you do any custom configuration when you created your referenced site?
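A quick way to narrow this down is to send the same text to both chains and compare the responses. A minimal Python sketch that builds such a request against the Stanbol enhancer REST endpoint (the host and port are the default launcher's and are assumptions; the request is only constructed here, not sent):

```python
import urllib.request

def build_enhance_request(text, chain="customChain",
                          host="http://localhost:8080"):
    """Build (but do not send) a POST to a Stanbol enhancement chain."""
    return urllib.request.Request(
        f"{host}/enhancer/chain/{chain}",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=UTF-8",
                 "Accept": "application/rdf+xml"},
        method="POST",
    )

req = build_enhance_request("He wears Tommy Hilfiger.")
print(req.get_method(), req.full_url)
# POST http://localhost:8080/enhancer/chain/customChain
```

Sending the same text with `chain="dbpedia-disambiguation"` and diffing the returned enhancements should show whether the difference comes from the linking engine or from the NLP engines earlier in the chain.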
Thanks,
Dileepa
On Sat, Jul 13, 2013 at 7:28 AM, Sawhney, Tarandeep Singh <
tsawhney@innodata.com> wrote:
> A polite reminder to the Stanbol dev community:
>
> Can anyone please provide some pointers to resolve the issue below in
> entity extraction using a custom ontology with Stanbol?
>
> Please let us know if more information is required to understand what we
> are doing so that you can suggest some help.
>
> Best regards
> tarandeep
>
>
> On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
> <ks...@innodata.com>wrote:
>
> > Hi,
> >
> > I am using Stanbol to extract entities by plugging in a custom
> > vocabulary as per
> > http://stanbol.apache.org/docs/trunk/customvocabulary.html
> >
> > These are the steps I followed:
> >
> > Configured a Clerezza Yard.
> > Configured a managed Yard site.
> > Updated the site by plugging in an ontology (containing custom entities).
> > Configured an Entityhub linking engine (*customLinkingEngine*) with the
> > managed site.
> > Configured a customChain which uses the following engines:
> >
> > - *langdetect*
> > - *opennlp-sentence*
> > - *opennlp-token*
> > - *opennlp-pos*
> > - *opennlp-chunker*
> > - *customLinkingEngine*
> >
> > Now I am able to extract entities like Adidas using *customChain*.
> >
> > However, I am facing an issue extracting entities that have a space in
> > them, for example "Tommy Hilfiger".
> >
> > A chain like *dbpedia-disambiguation* (which comes bundled with the
> > Stanbol instance) correctly extracts entities like "Tommy Hilfiger".
> >
> > I tried configuring *customLinkingEngine* the same way as
> > *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but
> > it did not extract the above entity.
> >
> > I have invested more than a week and am running out of options now.
> >
> > Please help in resolving this issue.
> >
> > --
> > Regards,
> > Keval Sethi
> >
>
Re: [URGENT] Working with custom vocabulary
Posted by "Sawhney, Tarandeep Singh" <ts...@innodata.com>.
A polite reminder to the Stanbol dev community:
Can anyone please provide some pointers to resolve the issue below in
entity extraction using a custom ontology with Stanbol?
Please let us know if more information is required to understand what we
are doing so that you can suggest some help.
Best regards
tarandeep
On Fri, Jul 12, 2013 at 7:24 PM, Sethi, Keval Krishna
<ks...@innodata.com>wrote:
> Hi,
>
> I am using Stanbol to extract entities by plugging in a custom
> vocabulary as per
> http://stanbol.apache.org/docs/trunk/customvocabulary.html
>
> These are the steps I followed:
>
> Configured a Clerezza Yard.
> Configured a managed Yard site.
> Updated the site by plugging in an ontology (containing custom entities).
> Configured an Entityhub linking engine (*customLinkingEngine*) with the
> managed site.
> Configured a customChain which uses the following engines:
>
> - *langdetect*
> - *opennlp-sentence*
> - *opennlp-token*
> - *opennlp-pos*
> - *opennlp-chunker*
> - *customLinkingEngine*
>
> Now I am able to extract entities like Adidas using *customChain*.
>
> However, I am facing an issue extracting entities that have a space in
> them, for example "Tommy Hilfiger".
>
> A chain like *dbpedia-disambiguation* (which comes bundled with the
> Stanbol instance) correctly extracts entities like "Tommy Hilfiger".
>
> I tried configuring *customLinkingEngine* the same way as
> *dbpedia-disamb-linking* (configured in *dbpedia-disambiguation*), but
> it did not extract the above entity.
>
> I have invested more than a week and am running out of options now.
>
> Please help in resolving this issue.
>
> --
> Regards,
> Keval Sethi
>