Posted to commits@stanbol.apache.org by "Rupert Westenthaler (JIRA)" <ji...@apache.org> on 2013/04/18 08:51:15 UTC

[jira] [Created] (STANBOL-1046) Create pageId-based DBpedia Freebase linker for the Entityhub Freebase Indexing Tool

Rupert Westenthaler created STANBOL-1046:
--------------------------------------------

             Summary: Create pageId-based DBpedia Freebase linker for the Entityhub Freebase Indexing Tool
                 Key: STANBOL-1046
                 URL: https://issues.apache.org/jira/browse/STANBOL-1046
             Project: Stanbol
          Issue Type: Bug
          Components: Entityhub
            Reporter: Rupert Westenthaler


The Freebase Indexing Tool already supports basic linking between Freebase topics and DBpedia entities. However, those links are constructed from the local names of the Wikipedia pages, which is error-prone because of encoding issues.

In STANBOL-1034, [~ninniuz] pointed out that linking by Wikipedia PageId is superior and that such linking functionality already exists for DBpedia [1].

However, using this option would require users to import the

      http://downloads.dbpedia.org/3.8/{language}/page_ids_{language}.nt.bz2

files into the Indexing Source (the Jena TDB holding the Freebase data) or into any other data store that can hold those mappings (an in-memory representation would also be feasible).

Because of that, the PageId-based mapping will be implemented in a custom EntityProcessor. This issue covers the implementation of that processor.
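
To illustrate the idea only, here is a minimal Python sketch of PageId-based linking with an in-memory mapping. It is not the Stanbol implementation (which would be a Java EntityProcessor). It assumes that each line of a page_ids file is an N-Triples statement whose predicate ends in "wikiPageID" and whose object is an integer literal. The function names, the "wikipedia_page_ids" key and the "same_as" key are hypothetical and chosen for this example.

```python
import re

# Assumed line shape of a DBpedia page_ids N-Triples file (an assumption,
# not verified against the 3.8 dump):
#   <entity-uri> <...wikiPageID> "12345"^^<xsd:integer> .
PAGE_ID_TRIPLE = re.compile(
    r'^<(?P<entity>[^>]+)>\s+<[^>]*wikiPageID>\s+"(?P<page_id>\d+)"'
)


def load_page_ids(lines):
    """Build an in-memory {page_id: dbpedia_uri} map from N-Triples lines."""
    mapping = {}
    for line in lines:
        match = PAGE_ID_TRIPLE.match(line.strip())
        if match:  # comments, blank lines and other triples are skipped
            mapping[int(match.group("page_id"))] = match.group("entity")
    return mapping


def link_topic(topic, page_ids_by_lang):
    """Return a copy of `topic` with DBpedia URIs resolved via page IDs.

    `topic` is a plain dict; the hypothetical key "wikipedia_page_ids"
    maps a language code to the Wikipedia page ID known for the topic.
    `page_ids_by_lang` maps a language code to a map from load_page_ids().
    """
    links = []
    for lang, page_id in sorted(topic.get("wikipedia_page_ids", {}).items()):
        uri = page_ids_by_lang.get(lang, {}).get(page_id)
        if uri is not None:  # unknown languages or page IDs add no link
            links.append(uri)
    linked = dict(topic)  # shallow copy, so the input topic is not modified
    linked["same_as"] = links
    return linked
```

A page_ids file could be streamed into load_page_ids() with the standard library, for example via bz2.open(path, "rt", encoding="utf-8"). Because the lookup key is a numeric page ID rather than a page name, the encoding issues described above do not affect the match.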


[1] https://github.com/dbpedia/extraction-framework/pull/27

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira