Posted to dev@stanbol.apache.org by Umutcan Şimşek <um...@mni.thm.de> on 2015/05/28 19:21:48 UTC

stanbol enhancer/entityhub and german characters

Hello All,

According to the N-Triples standard [1], extended ASCII characters are 
not allowed in literals (see the EBNF). Therefore, when I extract 
triples from the CMS database, I cannot represent characters like ö ü ä 
properly. (I replace them with a bytecode.)
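
For illustration, the kind of replacement described above could look like the sketch below. It assumes the \uXXXX / \UXXXXXXXX escapes defined by the N-Triples grammar in [1]; the function name escape_ntriples_literal is hypothetical and not part of Stanbol or any library.

```python
def escape_ntriples_literal(text: str) -> str:
    """Escape a string for use inside an ASCII-only N-Triples literal."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == "\\":
            out.append("\\\\")
        elif ch == '"':
            out.append('\\"')
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        elif ch == "\t":
            out.append("\\t")
        elif 0x20 <= code <= 0x7E:
            # Printable ASCII passes through unchanged.
            out.append(ch)
        elif code <= 0xFFFF:
            # Everything else in the Basic Multilingual Plane becomes \uXXXX.
            out.append("\\u%04X" % code)
        else:
            # Code points above U+FFFF become \UXXXXXXXX.
            out.append("\\U%08X" % code)
    return "".join(out)


if __name__ == "__main__":
    print(escape_ntriples_literal("Jäger"))  # prints J\u00E4ger
```

A parser that follows the same grammar should turn the escape back into "ä" when it reads the file.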

Can Stanbol process these characters? If I configure the NLP modules for 
German, will it be able to recognize, for instance, the word "Jäger"?

[1] http://www.w3.org/2001/sw/RDFCore/ntriples

Best Regards

Umutcan


Re: stanbol enhancer/entityhub and german characters

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi,

Stanbol uses the Apache Jena parsers (via Clerezza) for parsing. If
you have non-ASCII characters, I recommend storing the file as UTF-8
and telling Stanbol that it is Turtle formatted when you process it.
N-Triples is a subset of Turtle, so any N-Triples file is also a valid
Turtle file. Unlike N-Triples, however, Turtle does support non-ASCII
characters encoded as UTF-8. At least this is the trick I use when
loading RDF into a Sesame-based triple store. With Stanbol (which is
based on Apache Jena) I never had a problem like that.
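
A minimal sketch of the "store as UTF-8, declare as Turtle" approach, assuming the triples are exported with a small script. The helper names, the example IRI and the file name labels.ttl are hypothetical; only the Python standard library is used, and nothing here is Stanbol API.

```python
from pathlib import Path


def turtle_literal(text):
    """Quote a string as a Turtle literal; non-ASCII characters stay as-is."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_turtle(triples):
    """Serialize (subject IRI, predicate IRI, label, language tag) tuples,
    one statement per line, in the N-Triples-style subset of Turtle."""
    return "".join(
        f"<{s}> <{p}> {turtle_literal(label)}@{lang} .\n"
        for s, p, label, lang in triples
    )


def write_turtle_utf8(path, triples):
    """Write the statements to disk, explicitly encoded as UTF-8."""
    Path(path).write_text(to_turtle(triples), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical example data.
    example = [
        (
            "http://example.org/entity/jaeger",
            "http://www.w3.org/2000/01/rdf-schema#label",
            "Jäger",
            "de",
        )
    ]
    write_turtle_utf8("labels.ttl", example)
```

The resulting file can then be loaded as Turtle (media type text/turtle) rather than as N-Triples, so the umlauts need no escaping.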

best
Rupert


-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/