
Posted to dev@stanbol.apache.org by Umutcan Şimşek <um...@mni.thm.de> on 2015/04/23 11:49:00 UTC

stanbol and semantic search

Hi,

I'm trying to implement semantic search for a CMS written in PHP that we
use at my university, as part of my master's thesis.

Ontonet seems to be a good solution for storing and reasoning over
instance and conceptual data. However, as far as I understand from
previous questions, Ontonet does not provide a SPARQL endpoint.

This leaves me at a point where I need to decide whether I should use
Stanbol only for semantic lifting and use another triple store that
provides a SPARQL endpoint. Another approach I have in mind is retrieving
ontologies from Ontonet temporarily into a reasoner and storing them back
with the inferred triples, but I wonder if there is a simpler way that
involves only Stanbol components.
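
To make the "reason, then store back" idea concrete, here is a minimal
sketch of materializing inferred triples. It is only an illustration under
simplifying assumptions: triples are plain Python tuples, the prefixed
names (ex:Article and so on) are made up, and only two RDFS rules are
applied. A real reasoner covers far more than this.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"


def materialize(triples):
    """Return the input triples plus those inferred by two RDFS rules.

    rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    rdfs9:  (x type A), (A subClassOf B)       => (x type B)
    """
    closure = set(triples)
    while True:
        subclass_pairs = [(s, o) for s, p, o in closure if p == SUBCLASS]
        inferred = set()
        for a, b in subclass_pairs:
            for s, p, o in closure:
                if p == SUBCLASS and s == b:
                    inferred.add((a, SUBCLASS, o))
                elif p == RDF_TYPE and o == a:
                    inferred.add((s, RDF_TYPE, b))
        # Stop once a full pass produces nothing new (fixed point).
        if inferred <= closure:
            return closure
        closure |= inferred
```

The returned set is what would be written back to the store, so that
queries see the inferred triples without a reasoner at query time.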

Is there a formal way to integrate external triple stores into Stanbol?
Could you suggest some other solution without an external triple store?

Regards

----

Umutcan Simsek, MSc Candidate
Technische Hochschule Mittelhessen, Giessen, Germany
Ege University, Izmir, Turkey

Re: stanbol and semantic search

Posted by Umutcan Şimşek <um...@mni.thm.de>.

...
(sorry, I forgot to add the link to the first email)

[1] https://stanbol.apache.org/docs/trunk/utils/commons-solr

On 28.04.2015 12:58, Umutcan Şimşek wrote:
> Hi Rupert,
>
> I've decided to go on with only the Enhancer and Entityhub, and to use
> a Fuseki server for storage and reasoning.
>
> Now my problem is indexing.
>
> I have three main ontologies: two of them are for extracting semantics
> from custom CMS components, and one just extracts the implicit
> semantics in the content hierarchy (articles, categories, etc.), so
> that I'll be able to implement semantic search. I did a little research
> and came across [1]. It says the Solr components are independent of
> Stanbol, but when I look at the bundles in the OSGi console, the Solr
> components are in the org.apache.stanbol.commons.solr package, not in
> an org.apache.commons.solr package.
>
> Lastly, is there a way to integrate the Solr server in Stanbol with a
> Fuseki server?
>
> Best

-- 
Umutcan Simsek, MSc Candidate
Technische Hochschule Mittelhessen, Giessen, Germany
Ege University, Izmir, Turkey

Re: stanbol and semantic search

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi,

On Thu, May 28, 2015 at 5:23 PM, Umutcan Şimşek
<um...@mni.thm.de> wrote:
> Hi Rupert,
>
> In previous emails, you mentioned that having everything in one instance
> (triple store, enhancer and Solr indexer) was a bad idea, according to the

In a single JVM, to be more explicit. Running everything on a single
host (or maybe even in a single application container such as Tomcat)
might still be OK. The point is that you use the RESTful services to
decouple Stanbol from the triple store and the search index.
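
As a rough sketch of what this decoupling can look like from a client's
point of view, each service is reached only over HTTP. The URLs, the
dataset and core name "cms", and the Solr field names are assumptions
about a local setup, not fixed defaults of any of the three systems.

```python
import json
import urllib.request

# Assumed local service URLs; the dataset and core name "cms" are hypothetical.
STANBOL_ENHANCER = "http://localhost:8080/enhancer"
FUSEKI_DATA = "http://localhost:3030/cms/data"
SOLR_UPDATE = "http://localhost:8983/solr/cms/update"


def enhance_request(text):
    """POST plain text to the Stanbol Enhancer and ask for Turtle back."""
    return urllib.request.Request(
        STANBOL_ENHANCER,
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain; charset=utf-8",
                 "Accept": "text/turtle"},
        method="POST",
    )


def store_request(turtle_bytes):
    """POST Turtle into the default graph via the SPARQL Graph Store Protocol."""
    return urllib.request.Request(
        FUSEKI_DATA + "?default",
        data=turtle_bytes,
        headers={"Content-Type": "text/turtle"},
        method="POST",
    )


def index_request(docs):
    """POST a JSON array of documents to Solr's update handler and commit."""
    return urllib.request.Request(
        SOLR_UPDATE + "?commit=true",
        data=json.dumps(docs).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Execute a prepared request and return the raw response body."""
    with urllib.request.urlopen(request) as response:
        return response.read()


def process(doc_id, text):
    """Enhance one document, store the resulting RDF, then index the text."""
    turtle = send(enhance_request(text))
    send(store_request(turtle))
    # "id" and "text" are hypothetical field names in the Solr schema.
    send(index_request([{"id": doc_id, "text": text}]))
```

With all three servers running, process("article-1", "Some article text")
would push one document through the chain. Because each step is a separate
HTTP call, any of the services can be moved to its own JVM or host without
touching the others.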

> experience of the LMF people. Is there any publication on this where the
> reasons are explained? I would like to cite it in my thesis.

AFAIK we have not published any paper mentioning this.

best
Rupert

>
> Best Regards
>
> Umutcan



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/

Re: stanbol and semantic search

Posted by Umutcan Şimşek <um...@mni.thm.de>.

Hi Rupert,

In previous emails, you mentioned that having everything in one instance
(triple store, enhancer and Solr indexer) was a bad idea, according to the
experience of the LMF people. Is there any publication on this where the
reasons are explained? I would like to cite it in my thesis.

Best Regards

Umutcan

On 04/28/2015 12:58 PM, Umutcan Şimşek wrote:
> Hi Rupert,
>
> I've decided to go on with only the Enhancer and Entityhub, and to use
> a Fuseki server for storage and reasoning.
>
> Now my problem is indexing.
>
> I have three main ontologies: two of them are for extracting semantics
> from custom CMS components, and one just extracts the implicit
> semantics in the content hierarchy (articles, categories, etc.), so
> that I'll be able to implement semantic search. I did a little research
> and came across [1]. It says the Solr components are independent of
> Stanbol, but when I look at the bundles in the OSGi console, the Solr
> components are in the org.apache.stanbol.commons.solr package, not in
> an org.apache.commons.solr package.
>
> Lastly, is there a way to integrate the Solr server in Stanbol with a
> Fuseki server?
>
> Best

Re: stanbol and semantic search

Posted by Umutcan Şimşek <um...@mni.thm.de>.

Hi Rupert,

I've decided to go on with only the Enhancer and Entityhub, and to use a
Fuseki server for storage and reasoning.

Now my problem is indexing.

I have three main ontologies: two of them are for extracting semantics
from custom CMS components, and one just extracts the implicit
semantics in the content hierarchy (articles, categories, etc.), so that
I'll be able to implement semantic search. I did a little research and
came across [1]. It says the Solr components are independent of Stanbol,
but when I look at the bundles in the OSGi console, the Solr components
are in the org.apache.stanbol.commons.solr package, not in an
org.apache.commons.solr package.

Lastly, is there a way to integrate the Solr server in Stanbol with a
Fuseki server?

Best

On 23.04.2015 15:27, Rupert Westenthaler wrote:
> Hi Umutcan,
>
> The Linked Media Framework [1] tried to build a thing like you
> described. In this case it was running a Sesame-based triple store,
> Solr for semantic search and Stanbol for enhancing content, all in a
> single instance. What we learned from that was that it makes much more
> sense to keep those things separated.
>
> If you want everything in Stanbol I can try to give you some pointers:
>
> I'm not the most knowledgeable person on the Ontonet, Reasoning and
> Rules component. But AFAIK it uses Clerezza as the storage component. So
> you can use any triple store that is supported by Clerezza. The most
> commonly used one is Jena TDB.
>
> Stanbol also has a very simple SPARQL endpoint, typically published
> under "/sparql" (http://localhost:8080/sparql). Via this you can access
> graphs managed by Clerezza. So if you have your RDF data stored as
> Clerezza graphs, you should be able to query them using this endpoint.
>
> So if the Ontonet and Reasoning components can use Clerezza to store
> results, it should be possible to also use the Stanbol server as a SPARQL
> instance. I hope some of the Ontonet people see this and answer this
> question.
>
> Make sure to use 1.0.0-SNAPSHOT, because 0.12.* still uses an older
> Clerezza version that does not yet support Clerezza Fastlane [2].
>
> best
> Rupert
>
> [1] https://bitbucket.org/srfgkmt/lmf/
> [2] https://issues.apache.org/jira/browse/CLEREZZA-468

-- 
Umutcan Simsek, MSc Candidate
Technische Hochschule Mittelhessen, Giessen, Germany
Ege University, Izmir, Turkey

Re: stanbol and semantic search

Posted by Rupert Westenthaler <ru...@gmail.com>.

Hi Umutcan,

The Linked Media Framework [1] tried to build a thing like you
described. In this case it was running a Sesame-based triple store,
Solr for semantic search and Stanbol for enhancing content, all in a
single instance. What we learned from that was that it makes much more
sense to keep those things separated.

If you want everything in Stanbol I can try to give you some pointers:

I'm not the most knowledgeable person on the Ontonet, Reasoning and
Rules component. But AFAIK it uses Clerezza as the storage component. So
you can use any triple store that is supported by Clerezza. The most
commonly used one is Jena TDB.

Stanbol also has a very simple SPARQL endpoint, typically published
under "/sparql" (http://localhost:8080/sparql). Via this you can access
graphs managed by Clerezza. So if you have your RDF data stored as
Clerezza graphs, you should be able to query them using this endpoint.
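
As an illustration, a client could query that endpoint roughly like this.
This is a sketch under assumptions: a local launcher on port 8080, and an
endpoint that accepts a standard SPARQL Protocol GET request with a
"query" parameter and can answer in the SPARQL JSON results format.

```python
import json
import urllib.parse
import urllib.request

# Assumption: a local Stanbol launcher on port 8080, as in the URL above.
STANBOL_SPARQL = "http://localhost:8080/sparql"


def build_sparql_request(endpoint, query):
    """Build a SPARQL Protocol GET request with the query as a URL parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def run_select(endpoint, query):
    """Send a SELECT query and return the list of result bindings.

    Needs a running server that returns SPARQL JSON results.
    """
    request = build_sparql_request(endpoint, query)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

For example, run_select(STANBOL_SPARQL, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")
would list a few triples, provided the server is up and the queried graph
contains data.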

So if the Ontonet and Reasoning components can use Clerezza to store
results, it should be possible to also use the Stanbol server as a SPARQL
instance. I hope some of the Ontonet people see this and answer this
question.

Make sure to use 1.0.0-SNAPSHOT, because 0.12.* still uses an older
Clerezza version that does not yet support Clerezza Fastlane [2].

best
Rupert

[1] https://bitbucket.org/srfgkmt/lmf/
[2] https://issues.apache.org/jira/browse/CLEREZZA-468

On Thu, Apr 23, 2015 at 11:49 AM, Umutcan Şimşek
<um...@mni.thm.de> wrote:
> Hi,
>
> I'm trying to implement semantic search for a CMS written in PHP that we
> use at my university, as part of my master's thesis.
>
> Ontonet seems to be a good solution for storing and reasoning over
> instance and conceptual data. However, as far as I understand from
> previous questions, Ontonet does not provide a SPARQL endpoint.
>
> This leaves me at a point where I need to decide whether I should use
> Stanbol only for semantic lifting and use another triple store that
> provides a SPARQL endpoint. Another approach I have in mind is retrieving
> ontologies from Ontonet temporarily into a reasoner and storing them back
> with the inferred triples, but I wonder if there is a simpler way that
> involves only Stanbol components.
>
> Is there a formal way to integrate external triple stores into Stanbol?
> Could you suggest some other solution without an external triple store?
>
> Regards
>
> ----
>
> Umutcan Simsek, MSc Candidate
> Technische Hochschule Mittelhessen, Giessen, Germany
> Ege University, Izmir, Turkey
>

-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                              ++43-699-11108907
| A-5500 Bischofshofen
| REDLINK.CO ..........................................................................
| http://redlink.co/