Posted to dev@stanbol.apache.org by chalitha udara Perera <ch...@gmail.com> on 2014/06/10 16:31:09 UTC

Configuring YAGO site in Stanbol

Hi all,

I'm currently trying the enhancer with a YAGO referenced site. I have
followed the documentation and tried the Named Entity Tagging engine and
the Entityhub Linking engine.

Here is the problem: by querying the site I can get the labels for
entities. For example, here are the labels for Brian May:
[{"type":"text","xml:lang":"eng","value":"Brian Harold May"},
 {"type":"text","value":"Brian Harold May"},
 {"type":"text","xml:lang":"eng","value":"Brian Harold May CBE"},
 {"type":"text","xml:lang":"eng","value":"Brain May"},
 {"type":"text","xml:lang":"eng","value":"Brian Harold May, CBE"},
 {"type":"text","xml:lang":"eng","value":"Bryan may"},
 {"type":"text","xml:lang":"eng","value":"Brian May"},
 {"type":"text","xml:lang":"eng","value":"Brian may"},
 {"type":"text","xml:lang":"eng","value":"Brain may"},
 {"type":"text","xml:lang":"eng","value":"Brian May (band)"},
 {"type":"text","xml:lang":"eng","value":"Dr. May"}]

But when I try the enhancer, I only get a linked entity for "Brian
Harold May"; "Brian May" or "Dr. May" does not work. The problem is most
likely the identified language: in YAGO the language is identified with
three letters such as "eng", but in Stanbol it is "en".

However, with the Entityhub Linking engine, when I put "eng" in the
default matching language field, I do get results.

How can I fix the label search so that it also covers these three-letter
language codes?

Thank You,
Chalitha Perera

-- 
J.M Chalitha Udara Perera

*Department of Computer Science and Engineering,*
*University of Moratuwa,*
*Sri Lanka*

Re: Configuring YAGO site in Stanbol

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Chalitha,

Stanbol uses the two-letter ISO 639-1 (alpha-2) codes for languages, so
the Language Detection Engine sets "en" as the language of your text.
This also means that labels tagged with three-letter (alpha-3) codes
such as "eng" will not be considered for linking.
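
To make the mismatch concrete, here is a minimal Python sketch of the
behaviour described in this thread. It assumes that a label is considered
only when its language tag equals the detected language or is missing;
that rule is inferred from the observed results and is not taken from
Stanbol's code. The helper name candidate_labels is hypothetical.

```python
# Subset of the labels quoted in this thread for Brian May.
labels = [
    {"type": "text", "xml:lang": "eng", "value": "Brian Harold May"},
    {"type": "text", "value": "Brian Harold May"},
    {"type": "text", "xml:lang": "eng", "value": "Brian May"},
    {"type": "text", "xml:lang": "eng", "value": "Dr. May"},
]


def candidate_labels(labels, detected_lang):
    """Return label values whose tag equals detected_lang or is missing.

    Assumption: this mimics the observed behaviour, not Stanbol's
    actual implementation.
    """
    return [
        label["value"]
        for label in labels
        if label.get("xml:lang") in (None, detected_lang)
    ]


print(candidate_labels(labels, "en"))   # ['Brian Harold May']
print(candidate_labels(labels, "eng"))  # all four label values
```

With "en" only the untagged "Brian Harold May" label survives, which
matches what the enhancer returns; with "eng" every label is a candidate.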

I would recommend converting the YAGO labels to use the alpha-2 codes
(where possible). Another alternative is to implement an engine that
converts the two-letter language code detected for the text to its
alpha-3 variant.
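
The first option could be done as a preprocessing step before indexing.
The Python sketch below rewrites three-letter language tags to their
two-letter form. It assumes the labels are available as RDF with one
statement per line (N-Triples or simple Turtle) and that each language
tag closes the line. The mapping table is a small hypothetical sample
that would need to be extended to every language in the data, and the
example URI is made up.

```python
import re

# Hypothetical, deliberately small sample mapping from three-letter
# codes to ISO 639-1 two-letter codes. Extend as needed.
ALPHA3_TO_ALPHA2 = {
    "eng": "en",
    "deu": "de",
    "fra": "fr",
    "spa": "es",
    "ita": "it",
}

# Matches a three-letter language tag that ends a statement: "..."@eng .
LANG_TAG = re.compile(r'"@([A-Za-z]{3})(?=\s*\.\s*$)')


def convert_line(line):
    """Rewrite a trailing alpha-3 language tag to alpha-2 if it is mapped."""
    def replace(match):
        mapped = ALPHA3_TO_ALPHA2.get(match.group(1).lower())
        # Leave unknown codes untouched ("where possible").
        return '"@' + mapped if mapped else match.group(0)

    return LANG_TAG.sub(replace, line)


sample = (
    '<http://example.org/Brian_May> '
    '<http://www.w3.org/2000/01/rdf-schema#label> "Dr. May"@eng .'
)
print(convert_line(sample))
```

Codes that are missing from the table are left as they are, so the
script can be run repeatedly while the mapping is being completed.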

best
Rupert


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/