Posted to dev@stanbol.apache.org by Umutcan Şimşek <um...@mni.thm.de> on 2015/05/28 17:23:41 UTC

Re: stanbol and semantic search

Hi Rupert,

In previous emails, you mentioned that having everything in one instance
(triple store, enhancer and Solr indexer) was a bad idea according to
the experience of the LMF people. Is there any publication that explains
the reasons? I would like to cite it in my thesis.

Best Regards

Umutcan

On 04/28/2015 12:58 PM, Umutcan Şimşek wrote:
> Hi Rupert,
>
> I've decided to go on with only the Enhancer and Entityhub, and to use a
> Fuseki server for storage and reasoning.
>
> Now my problem is indexing.
>
> I have three main ontologies: two of them are for extracting semantics
> from custom CMS components, and one is for extracting the implicit
> semantics of the content hierarchy (articles, categories, etc.) so that
> I'll be able to implement semantic search. I did a little research and
> came across [1]. It says the Solr components are independent of
> Stanbol, but when I look at the bundles in the OSGi console, the Solr
> components are in the org.apache.stanbol.commons.solr package, not in
> an org.apache.commons.solr package.
>
> Lastly, is there a way to integrate the Solr server in Stanbol with a
> Fuseki server?
>
> Best
>
> On 23.04.2015 15:27, Rupert Westenthaler wrote:
>> Hi Umutcan,
>>
>> The Linked Media Framework [1] tried to build something like what you
>> described. In that case it was running a Sesame-based triple store,
>> Solr for semantic search and Stanbol for enhancing content, all in a
>> single instance. What we learned from that was that it makes much more
>> sense to keep those things separated.
>>
>> If you want everything in Stanbol I can try to give you some pointers:
>>
>> I'm not the most knowledgeable person on the Ontonet, Reasoning and
>> Rules components, but AFAIK they use Clerezza as the storage component.
>> So you can use any triple store that is supported by Clerezza. The most
>> commonly used one is Jena TDB.
>>
>> Stanbol also has a very simple SPARQL endpoint, typically published
>> under "/sparql" (http://localhost:8080/sparql). Via this endpoint you
>> can access graphs managed by Clerezza. So if you have your RDF data
>> stored as Clerezza graphs, you should be able to query them using this
>> endpoint.
>>
>> So if the Ontonet and Reasoning components can use Clerezza to store
>> results, it should be possible to also use the Stanbol server as a
>> SPARQL instance. I hope some of the Ontonet people see this and answer
>> this question.
>>
>> Make sure to use 1.0.0-SNAPSHOT, because 0.12.* still uses an older
>> Clerezza version that does not yet support Clerezza Fastlane [2].
>>
>> best
>> Rupert
>>
>> [1] https://bitbucket.org/srfgkmt/lmf/
>> [2] https://issues.apache.org/jira/browse/CLEREZZA-468
>>
>> On Thu, Apr 23, 2015 at 11:49 AM, Umutcan Şimşek
>> <um...@mni.thm.de> wrote:
>>> Hi,
>>>
>>> I'm trying to implement semantic search for a CMS written in PHP that
>>> we use at my university, as part of my master's thesis.
>>>
>>> Ontonet seems to be a good solution for storing and reasoning on
>>> instance and conceptual data. However, as far as I understand from
>>> previous questions, Ontonet does not provide a SPARQL endpoint.
>>>
>>> This leaves me at a point where I need to decide whether I should use
>>> Stanbol only for semantic lifting and use another triple store that
>>> provides a SPARQL endpoint. Another approach I have in mind is to
>>> retrieve the ontologies from Ontonet, pass them temporarily to a
>>> reasoner and store them back with the inferred triples, but I wonder
>>> if there is a simpler way that involves only Stanbol components.
>>>
>>> Is there a formal way to integrate external triple stores with
>>> Stanbol? Could you suggest some other solution without an external
>>> triple store?
>>>
>>> Regards
>>>
>>> ----
>>>
>>> Umutcan Simsek, MSc Candidate
>>> Technische Hochschule Mittelhessen, Giessen, Germany
>>> Ege University, Izmir, Turkey
>>>
>>
>>
>
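
A minimal sketch, in Python, of how the "/sparql" endpoint mentioned in the
quoted reply above might be queried. It assumes the endpoint accepts a
form-encoded "query" parameter as in the standard SPARQL protocol and that it
can return JSON results; neither is confirmed in this thread. The helper name
build_sparql_request is made up for illustration. The request is only built
here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# URL taken from the quoted reply; adjust host and port for your setup.
STANBOL_SPARQL = "http://localhost:8080/sparql"


def build_sparql_request(query, endpoint=STANBOL_SPARQL):
    """Build (but do not send) a SPARQL query request.

    Assumption: the endpoint accepts a form-encoded "query" parameter
    via POST, as described by the SPARQL protocol.
    """
    body = urlencode({"query": query}).encode("utf-8")
    return Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            # Assumption: the server can serialize results as JSON.
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
    print(request.get_method(), request.full_url)
    # To actually run the query against a live server:
    #   from urllib.request import urlopen
    #   print(urlopen(request).read().decode("utf-8"))
```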


Re: stanbol and semantic search

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi,

On Thu, May 28, 2015 at 5:23 PM, Umutcan Şimşek
<um...@mni.thm.de> wrote:
> Hi Rupert,
>
> In previous emails, you mentioned that having everything in one instance
> (triple store, enhancer and Solr indexer) was a bad idea according to

In a single JVM, to be more explicit. Running everything on a single
host (or maybe even in a single application container such as Tomcat)
might still be OK. The point is that you use the RESTful services to
decouple Stanbol from the triple store and the search index.
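
A rough sketch, in Python, of what this decoupling could look like from the
client side: each service is reached only over HTTP, so each can run in its
own JVM. All URLs, the dataset and core name "cms", the Solr field names and
the function names are assumptions for illustration; none of them come from
this thread. It also assumes the Enhancer is published under "/enhancer", the
triple store accepts Turtle on a graph store endpoint, and Solr accepts JSON
documents on its update handler.

```python
import json
from urllib.request import Request, urlopen

# Assumed local endpoints; none of these URLs come from the thread.
STANBOL_ENHANCER = "http://localhost:8080/enhancer"
FUSEKI_GRAPH_STORE = "http://localhost:3030/cms/data?default"      # hypothetical dataset "cms"
SOLR_UPDATE = "http://localhost:8983/solr/cms/update?commit=true"  # hypothetical core "cms"


def enhance_request(text):
    """Ask Stanbol to enhance plain text and return the results as Turtle."""
    return Request(
        STANBOL_ENHANCER,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8",
                 "Accept": "text/turtle"},
        method="POST",
    )


def store_request(turtle):
    """Add enhancement triples (bytes) to the triple store's default graph."""
    return Request(FUSEKI_GRAPH_STORE, data=turtle,
                   headers={"Content-Type": "text/turtle"}, method="POST")


def index_request(doc_id, text):
    """Send one document to the search index; field names are hypothetical."""
    body = json.dumps([{"id": doc_id, "text": text}]).encode("utf-8")
    return Request(SOLR_UPDATE, data=body,
                   headers={"Content-Type": "application/json"}, method="POST")


def http_send(request):
    """Send a request and return the response body as bytes."""
    with urlopen(request, timeout=30) as response:
        return response.read()


def process(doc_id, text, send=http_send):
    """Run the three steps in order; every service is reached only over HTTP."""
    turtle = send(enhance_request(text))
    send(store_request(turtle))
    send(index_request(doc_id, text))
```

Passing the sender in as a parameter keeps the sketch testable without any of
the three servers running; with live servers, process("doc-1", "some text")
would use http_send.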

> the experience of the LMF people. Is there any publication that explains
> the reasons? I would like to cite it in my thesis.

AFAIK we have not published any paper mentioning this.

best
Rupert



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/