Posted to dev@stanbol.apache.org by ar...@wipro.com on 2013/09/13 13:48:15 UTC

Working with large RDF data

Hi,

  I have a large RDF data set.

   The requirement is to be able to reason over / run rules on this data, and to search it along with any other unstructured data that I have enhanced using Stanbol.



Any pointers on how I can achieve this?





Thanking you and Rgds,

Arthi




Please do not print this email unless it is absolutely necessary. 

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. 

WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. 

www.wipro.com

RE: Working with large RDF data

Posted by ar...@wipro.com.
Thanks Reto,
Regards,
Arthi

-----Original Message-----
From: Reto Bachmann-Gmür [mailto:reto@wymiwyg.com] 
Sent: Tuesday, September 17, 2013 10:01 PM
To: dev@stanbol.apache.org
Subject: Re: Working with large RDF data

Hi Arthi

For a programmatic approach to creating and accessing Clerezza graphs, you might have a look at the stanbol-statefull-webapp archetype.

Hope this gets you further.

Reto
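As a sketch, generating a project skeleton from such a Maven archetype would look roughly like the following; note that the groupId/artifactId coordinates below are assumptions inferred from the archetype name mentioned above, not confirmed values (check the Stanbol source tree for the actual coordinates):

```shell
# Generate a webapp project from the archetype (archetype coordinates
# are assumed from the name "stanbol-statefull-webapp" -- verify them
# against the Stanbol sources before use):
mvn archetype:generate \
  -DarchetypeGroupId=org.apache.stanbol \
  -DarchetypeArtifactId=statefull-webapp-archetype \
  -DgroupId=com.example \
  -DartifactId=my-stanbol-webapp
```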


On Tue, Sep 17, 2013 at 11:18 AM, <ar...@wipro.com> wrote:

> Thanks Reto,
>  The use case I am looking at is being able to search over the
> unstructured content which has been enhanced using Stanbol together
> with some already existing RDF data.
> I would also like to run rules and reasoning over the RDF data.
>
> Currently I am using Stanbol to get the unstructured text
> enhancements, and I have a separate SDB store outside Stanbol.
>
> If I use the TDB-based Clerezza store, will I be able to manage this from
> within Stanbol? Can you point me to some links on how it can be done?
>
> Is there a way to search across this Clerezza store, the text files
> ingested into Stanbol, and their enhancements? Is there a simple curl
> command / Java program to do this?
>
> Thanking you and Regards,
> Arthi
>
>
>
>
>
> -----Original Message-----
> From: Reto Bachmann-Gmür [mailto:reto@wymiwyg.com]
> Sent: Monday, September 16, 2013 6:59 PM
> To: dev@stanbol.apache.org
> Subject: Re: Working with large RDF data
>
> Why in memory? The TDB-based Clerezza store is quite efficient, so why
> not add the data to such a graph?
>
> reto
>
>
> On Sat, Sep 14, 2013 at 9:14 AM, <ar...@wipro.com> wrote:
>
> > Thanks a lot Rupert
> > If the RDF data is smaller (i.e. it can fit into memory), is there a way
> > we can import it into Stanbol and do a joint search across the
> > enhancements from unstructured text as well as the imported RDF data?
> > If yes, would this import be permanent or would it need to be
> > repeated each time?
> >
> >
> > Thanks and Rgds,
> > Arthi
> >
> >
> > -----Original Message-----
> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> > Sent: Saturday, September 14, 2013 12:40 PM
> > To: dev@stanbol.apache.org
> > Subject: Re: Working with large RDF data
> >
> > Hi Arthi
> >
> > AFAIK the reasoning and rule components of Apache Stanbol are
> > intended to be used in "Sessions". They are not intended to be used
> > on a whole knowledge base. A typical use case could be validating
> > RDF data retrieved from a remote server (e.g. Linked Data) against
> > some validation rules, or rewriting RDF generated by the Enhancer
> > (Refactor Engine).
> >
> > Applying Rules and Reasoning on a whole knowledge base (RDF data
> > that does not fit in memory) is not a typical use case.
> >
> > Based on your problem description you might want to have a look at
> >
> > * Apache Marmotta and the KiWi Triple Store
> > (http://marmotta.incubator.apache.org/kiwi/introduction.html): This
> > is a Sesame Sail implementation that supports reasoning
> > * OWLIM (http://www.ontotext.com/owlim): a commercial product also
> > implementing reasoning on top of the Sesame API.
> >
> > But I am not an expert in those topics, so there might be additional
> > options I am not aware of.
> >
> > hope this helps
> > best
> > Rupert
> >
> >
> > On Fri, Sep 13, 2013 at 1:48 PM,  <ar...@wipro.com> wrote:
> > > Hi,
> > >
> > >   I have large RDF data.
> > >
> > >    The requirement is to be able to reason over / run rules on this
> > > data, and to search it along with any other unstructured data that I
> > > have enhanced using Stanbol.
> > >
> > >
> > >
> > > Any pointers on how I can achieve this?
> > >
> > >
> > >
> > >
> > >
> > > Thanking you and Rgds,
> > >
> > > Arthi
> > >
> > >
> > >
> > >
> >
> >
> >
> > --
> > | Rupert Westenthaler             rupert.westenthaler@gmail.com
> > | Bodenlehenstraße 11                             ++43-699-11108907
> > | A-5500 Bischofshofen
> >
> >
>
>


Re: Working with large RDF data

Posted by Reto Bachmann-Gmür <re...@wymiwyg.com>.
Hi Arthi

For a programmatic approach to creating and accessing Clerezza graphs,
you might have a look at the stanbol-statefull-webapp archetype.

Hope this gets you further.

Reto


On Tue, Sep 17, 2013 at 11:18 AM, <ar...@wipro.com> wrote:

> Thanks Reto,
>  The use case I am looking at is being able to search over the unstructured
> content which has been enhanced using Stanbol together with some already
> existing RDF data.
> I would also like to run rules and reasoning over the RDF data.
>
> Currently I am using Stanbol to get the unstructured text enhancements,
> and I have a separate SDB store outside Stanbol.
>
> If I use the TDB-based Clerezza store, will I be able to manage this from
> within Stanbol? Can you point me to some links on how it can be done?
>
> Is there a way to search across this Clerezza store, the text files
> ingested into Stanbol, and their enhancements? Is there a simple curl
> command / Java program to do this?
>
> Thanking you and Regards,
> Arthi
>
>
>
>
>
> -----Original Message-----
> From: Reto Bachmann-Gmür [mailto:reto@wymiwyg.com]
> Sent: Monday, September 16, 2013 6:59 PM
> To: dev@stanbol.apache.org
> Subject: Re: Working with large RDF data
>
> Why in memory? The TDB-based Clerezza store is quite efficient, so why
> not add the data to such a graph?
>
> reto
>
>
> On Sat, Sep 14, 2013 at 9:14 AM, <ar...@wipro.com> wrote:
>
> > Thanks a lot Rupert
> > If the RDF data is smaller (i.e. it can fit into memory), is there a way
> > we can import it into Stanbol and do a joint search across the
> > enhancements from unstructured text as well as the imported RDF data?
> > If yes, would this import be permanent or would it need to be
> > repeated each time?
> >
> >
> > Thanks and Rgds,
> > Arthi
> >
> >
> > -----Original Message-----
> > From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> > Sent: Saturday, September 14, 2013 12:40 PM
> > To: dev@stanbol.apache.org
> > Subject: Re: Working with large RDF data
> >
> > Hi Arthi
> >
> > AFAIK the reasoning and rule components of Apache Stanbol are intended
> > to be used in "Sessions". They are not intended to be used on a whole
> > knowledge base. A typical use case could be validating RDF data
> > retrieved from a remote server (e.g. Linked Data) against some
> > validation rules, or rewriting RDF generated by the Enhancer
> > (Refactor Engine).
> >
> > Applying Rules and Reasoning on a whole knowledge base (RDF data that
> > does not fit in memory) is not a typical use case.
> >
> > Based on your problem description you might want to have a look at
> >
> > * Apache Marmotta and the KiWi Triple Store
> > (http://marmotta.incubator.apache.org/kiwi/introduction.html): This is
> > a Sesame Sail implementation that supports reasoning
> > * OWLIM (http://www.ontotext.com/owlim): a commercial product also
> > implementing reasoning on top of the Sesame API.
> >
> > But I am not an expert in those topics, so there might be additional
> > options I am not aware of.
> >
> > hope this helps
> > best
> > Rupert
> >
> >
> > On Fri, Sep 13, 2013 at 1:48 PM,  <ar...@wipro.com> wrote:
> > > Hi,
> > >
> > >   I have large RDF data.
> > >
> > >    The requirement is to be able to reason over / run rules on this
> > > data, and to search it along with any other unstructured data that I
> > > have enhanced using Stanbol.
> > >
> > >
> > >
> > > Any pointers on how I can achieve this?
> > >
> > >
> > >
> > >
> > >
> > > Thanking you and Rgds,
> > >
> > > Arthi
> > >
> > >
> > >
> > >
> >
> >
> >
> > --
> > | Rupert Westenthaler             rupert.westenthaler@gmail.com
> > | Bodenlehenstraße 11                             ++43-699-11108907
> > | A-5500 Bischofshofen
> >
> >
>
>

RE: Working with large RDF data

Posted by ar...@wipro.com.
Thanks Reto,
 The use case I am looking at is being able to search over the unstructured content which has been enhanced using Stanbol together with some already existing RDF data.
I would also like to run rules and reasoning over the RDF data.

Currently I am using Stanbol to get the unstructured text enhancements, and I have a separate SDB store outside Stanbol.

If I use the TDB-based Clerezza store, will I be able to manage this from within Stanbol? Can you point me to some links on how it can be done?

Is there a way to search across this Clerezza store, the text files ingested into Stanbol, and their enhancements? Is there a simple curl command / Java program to do this?

Thanking you and Regards,
Arthi





-----Original Message-----
From: Reto Bachmann-Gmür [mailto:reto@wymiwyg.com] 
Sent: Monday, September 16, 2013 6:59 PM
To: dev@stanbol.apache.org
Subject: Re: Working with large RDF data

Why in memory? The TDB-based Clerezza store is quite efficient, so why not add the data to such a graph?

reto


On Sat, Sep 14, 2013 at 9:14 AM, <ar...@wipro.com> wrote:

> Thanks a lot Rupert
> If the RDF data is smaller (i.e. it can fit into memory), is there a way we
> can import it into Stanbol and do a joint search across the enhancements
> from unstructured text as well as the imported RDF data?
> If yes, would this import be permanent or would it need to be repeated
> each time?
>
>
> Thanks and Rgds,
> Arthi
>
>
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> Sent: Saturday, September 14, 2013 12:40 PM
> To: dev@stanbol.apache.org
> Subject: Re: Working with large RDF data
>
> Hi Arthi
>
> AFAIK the reasoning and rule components of Apache Stanbol are intended
> to be used in "Sessions". They are not intended to be used on a whole
> knowledge base. A typical use case could be validating RDF data
> retrieved from a remote server (e.g. Linked Data) against some
> validation rules, or rewriting RDF generated by the Enhancer
> (Refactor Engine).
>
> Applying Rules and Reasoning on a whole knowledge base (RDF data that
> does not fit in memory) is not a typical use case.
>
> Based on your problem description you might want to have a look at
>
> * Apache Marmotta and the KiWi Triple Store
> (http://marmotta.incubator.apache.org/kiwi/introduction.html): This is
> a Sesame Sail implementation that supports reasoning
> * OWLIM (http://www.ontotext.com/owlim): a commercial product also
> implementing reasoning on top of the Sesame API.
>
> But I am not an expert in those topics, so there might be additional
> options I am not aware of.
>
> hope this helps
> best
> Rupert
>
>
> On Fri, Sep 13, 2013 at 1:48 PM,  <ar...@wipro.com> wrote:
> > Hi,
> >
> >   I have large RDF data.
> >
> >    The requirement is to be able to reason over / run rules on this
> > data, and to search it along with any other unstructured data that I
> > have enhanced using Stanbol.
> >
> >
> >
> > Any pointers on how I can achieve this?
> >
> >
> >
> >
> >
> > Thanking you and Rgds,
> >
> > Arthi
> >
> >
> >
> >
>
>
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>
>


RE: Working with large RDF data

Posted by ar...@wipro.com.
Thanks a lot Rupert,
Regards,
Arthi

-----Original Message-----
From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com] 
Sent: Tuesday, September 17, 2013 9:06 PM
To: dev@stanbol.apache.org
Subject: Re: Working with large RDF data

Hi Arthi,

On Tue, Sep 17, 2013 at 3:30 PM,  <ar...@wipro.com> wrote:
> Thanks Rupert,
> I have a few clarifications / queries, which I have asked inline below.
>
> "It should be possible to reason over the enhancement results and store all triples (including the deduced ones) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by Reto. However note that any change in the Ontology will not be reflected in the Jena TDB - as there is no truth maintenance."
>
> I am currently storing RDF in a separate SDB store outside Stanbol. Is there a way this / a TDB store can be stored and managed as part of Stanbol?
> For the problem of stale triples, I could refresh the entire store on a change of the Ontology.

Storing enhancement results in a Clerezza triple store (by default Jena TDB) is one part of what the Contenthub does. You can also write a simple component that retrieves an enhancement request, calls the Enhancer API, and stores the results in a Clerezza store.
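As a minimal sketch of the Enhancer call in that workflow: the Enhancer exposes a RESTful endpoint that takes plain text and returns the enhancement results as RDF (this assumes a default local Stanbol launcher listening on port 8080):

```shell
# POST plain text to the Stanbol Enhancer; the Accept header selects
# the RDF serialization of the enhancement results (Turtle here):
curl -X POST \
  -H "Content-Type: text/plain" \
  -H "Accept: text/turtle" \
  --data "The Stanbol Enhancer can detect famous cities such as Paris." \
  http://localhost:8080/enhancer
```

A component as described above would do the equivalent of this call via the Java API and then write the returned triples into a Clerezza graph.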

>
> "If the data does fit into memory you just store the plain RDF data, load it into a reasoning session to get the results. After that you can store the results in another RDF store (e.g. Jena TDB) for later queries."
> How do you load RDF data into a session? I could see a way to load an Ontology into a session, but not RDF instances.

Loading an Ontology or instance data is no different: both are RDF triples. The problem is that instance data is typically much bigger and will not fit into a session. If you have a quad store (such as Jena TDB) you could store the triples of each ContentItem in their own context (you can use the URI of the ContentItem as the context). This would allow performing reasoning sessions per content item (context) when the Ontology changes.

NOTE that the default Clerezza TDB storage provider does not scale to a high number of contexts, so you would need to use the scalable TcProvider (CLEREZZA-736). Clerezza does not support quads; however, you can get TripleCollections for each context via the TcManager.

I cannot tell you how to load RDF data into a session, but I hope the documentation of the OntologyManager, Reasoning and Rules components provides such information.
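Once the (deduced) triples are stored this way, they can be queried over HTTP. A sketch, assuming the default /sparql endpoint of a local Stanbol launcher on port 8080:

```shell
# Run a SPARQL SELECT against Stanbol's SPARQL endpoint; the query is
# form-encoded in the "query" parameter:
curl -X POST \
  -H "Accept: application/sparql-results+xml" \
  --data-urlencode "query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10" \
  http://localhost:8080/sparql
```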

best
Rupert

>
>
> "IMO if you need reasoning support over the whole knowledge base you should use a System that natively supports it. While the above workflows would allow to mimic such functionality it will become impractical as the amount of data grows."
> I will evaluate some other stores to be used along with Stanbol, such as Virtuoso, etc., to see if this limitation can be overcome.
>
>
> Thanking you and Regards,
> Arthi
>
> -----Original Message-----
> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
> Sent: Tuesday, September 17, 2013 12:46 PM
> To: dev@stanbol.apache.org
> Subject: Re: Working with large RDF data
>
> Hi
>
> It should be possible to reason over the enhancement results and store all triples (including the deduced ones) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by Reto. However note that any change in the Ontology will not be reflected in the Jena TDB - as there is no truth maintenance.
>
> If the data does fit into memory you just store the plain RDF data, load it into a reasoning session to get the results. After that you can store the results in another RDF store (e.g. Jena TDB) for later queries.
>
> IMO if you need reasoning support over the whole knowledge base you should use a System that natively supports it. While the above workflows would allow to mimic such functionality it will become impractical as the amount of data grows.
>
> best
> Rupert
>
>
>
>
> On Mon, Sep 16, 2013 at 3:29 PM, Reto Bachmann-Gmür <re...@wymiwyg.com> wrote:
>> Why in memory? The TDB-based Clerezza store is quite efficient, so why
>> not add the data to such a graph?
>>
>> reto
>>
>>
>> On Sat, Sep 14, 2013 at 9:14 AM, <ar...@wipro.com> wrote:
>>
>>> Thanks a lot Rupert
>>> If the RDF data is smaller (i.e. it can fit into memory), is there a way
>>> we can import it into Stanbol and do a joint search across the
>>> enhancements from unstructured text as well as the imported RDF data?
>>> If yes, would this import be permanent or would it need to be repeated
>>> each time?
>>>
>>>
>>> Thanks and Rgds,
>>> Arthi
>>>
>>>
>>> -----Original Message-----
>>> From: Rupert Westenthaler [mailto:rupert.westenthaler@gmail.com]
>>> Sent: Saturday, September 14, 2013 12:40 PM
>>> To: dev@stanbol.apache.org
>>> Subject: Re: Working with large RDF data
>>>
>>> Hi Arthi
>>>
>>> AFAIK the reasoning and rule components of Apache Stanbol are
>>> intended to be used in "Sessions". They are not intended to be used
>>> on a whole knowledge base. A typical use case could be validating
>>> RDF data retrieved from a remote server (e.g. Linked Data) against
>>> some validation rules, or rewriting RDF generated by the Enhancer
>>> (Refactor Engine).
>>>
>>> Applying Rules and Reasoning on a whole knowledge base (RDF data
>>> that does not fit in memory) is not a typical use case.
>>>
>>> Based on your problem description you might want to have a look at
>>>
>>> * Apache Marmotta and the KiWi Triple Store
>>> (http://marmotta.incubator.apache.org/kiwi/introduction.html): This
>>> is a Sesame Sail implementation that supports reasoning
>>> * OWLIM (http://www.ontotext.com/owlim): a commercial product also
>>> implementing reasoning on top of the Sesame API.
>>>
>>> But I am not an expert in those topics, so there might be additional
>>> options I am not aware of.
>>>
>>> hope this helps
>>> best
>>> Rupert
>>>
>>>
>>> On Fri, Sep 13, 2013 at 1:48 PM,  <ar...@wipro.com> wrote:
>>> > Hi,
>>> >
>>> >   I have large RDF data.
>>> >
>>> >    The requirement is to be able to reason over / run rules on this
>>> > data, and to search it along with any other unstructured data that I
>>> > have enhanced using Stanbol.
>>> >
>>> >
>>> >
>>> > Any pointers on how I can achieve this?
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > Thanking you and Rgds,
>>> >
>>> > Arthi
>>> >
>>> >
>>> >
>>> >
>>>
>>>
>>>
>>> --
>>> | Rupert Westenthaler             rupert.westenthaler@gmail.com
>>> | Bodenlehenstraße 11                             ++43-699-11108907
>>> | A-5500 Bischofshofen
>>>
>>>
>
>
>
> --
> | Rupert Westenthaler             rupert.westenthaler@gmail.com
> | Bodenlehenstraße 11                             ++43-699-11108907
> | A-5500 Bischofshofen
>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen


Re: Working with large RDF data

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Arthi,

On Tue, Sep 17, 2013 at 3:30 PM,  <ar...@wipro.com> wrote:
> Thanks Rupert,
> I have a few clarifications / queries, which I have asked inline below.
>
> "It should be possible to reason over the enhancement results and store all triples (including the deduced ones) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by Reto. However note that any change in the Ontology will not be reflected in the Jena TDB - as there is no truth maintenance."
>
> I am currently storing RDF in a separate SDB store outside Stanbol. Is there a way this / a TDB store can be stored and managed as part of Stanbol?
> For the problem of stale triples, I could refresh the entire store on a change of the Ontology.

Storing enhancement results in a Clerezza triple store (by default
Jena TDB) is one part of what the Contenthub does. You can also write
a simple component that retrieves an enhancement request, calls the
Enhancer API, and stores the results in a Clerezza store.

>
> "If the data does fit into memory you just store the plain RDF data, load it into a reasoning session to get the results. After that you can store the results in another RDF store (e.g. Jena TDB) for later queries."
> How do you load RDF data into a session? I could see a way to load an Ontology into a session, but not RDF instances.

Loading an Ontology or instance data is no different: both are RDF
triples. The problem is that instance data is typically much bigger
and will not fit into a session. If you have a quad store (such as
Jena TDB) you could store the triples of each ContentItem in their own
context (you can use the URI of the ContentItem as the context). This
would allow performing reasoning sessions per content item (context)
when the Ontology changes.

NOTE that the default Clerezza TDB storage provider does not scale to
a high number of contexts, so you would need to use the scalable
TcProvider (CLEREZZA-736). Clerezza does not support quads; however,
you can get TripleCollections for each context via the TcManager.

I cannot tell you how to load RDF data into a session, but I hope the
documentation of the OntologyManager, Reasoning and Rules components
provides such information.

best
Rupert

>
>
> "IMO if you need reasoning support over the whole knowledge base you should use a system that natively supports it. While the above workflows would allow you to mimic such functionality it will become impractical as the amount of data grows."
> I will evaluate some other stores to be used along with Stanbol, such as Virtuoso, to see if this limitation can be overcome.
>
>
> Thanking you and Regards,
> Arthi
>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

RE: Working with large RDF data

Posted by ar...@wipro.com.
Thanks Rupert,
I have a few clarifications / queries, which I have asked inline below.

"It should be possible to reason over the enhancement results and store all triples (including the deduced ones) in Jena TDB. After that you can use SPARQL on the Jena TDB as suggested by Reto. However note that any change in the Ontology will not be reflected in the Jena TDB - as there is no truth maintenance."

I am currently storing RDF in a separate SDB outside Stanbol. Is there a way this / a TDB can be stored and managed as part of Stanbol?
For the problem of stale triples I could refresh the entire store whenever the Ontology changes.


"If the data does fit into memory you just store the plain RDF data, load it into a reasoning session to get the results. After that you can store the results in another RDF store (e.g. Jena TDB) for later queries."
How do you load RDF data into a session? I could see a way to load an Ontology into a session, but not RDF instances.


"IMO if you need reasoning support over the whole knowledge base you should use a system that natively supports it. While the above workflows would allow you to mimic such functionality it will become impractical as the amount of data grows."
I will evaluate some other stores to be used along with Stanbol, such as Virtuoso, to see if this limitation can be overcome.


Thanking you and Regards,
Arthi

>>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Please do not print this email unless it is absolutely necessary. 

The information contained in this electronic message and any attachments to this message are intended for the exclusive use of the addressee(s) and may contain proprietary, confidential or privileged information. If you are not the intended recipient, you should not disseminate, distribute or copy this e-mail. Please notify the sender immediately and destroy all copies of this message and any attachments. 

WARNING: Computer viruses can be transmitted via email. The recipient should check this email and any attachments for the presence of viruses. The company accepts no liability for any damage caused by any virus transmitted by this email. 

www.wipro.com

Re: Working with large RDF data

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi

It should be possible to reason over the enhancement results and store
all triples (including the deduced ones) in Jena TDB. After that you
can use SPARQL on the Jena TDB as suggested by Reto. However, note that
any change in the Ontology will not be reflected in the Jena TDB, as
there is no truth maintenance.

If the data does fit into memory you can just store the plain RDF
data and load it into a reasoning session to get the results. After
that you can store the results in another RDF store (e.g. Jena TDB)
for later queries.
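
Querying such a TDB store afterwards is straightforward with Jena's ARQ engine. As a sketch (not from the thread; the store path and query are placeholders):

```java
// Sketch: run a SPARQL query against a Jena TDB dataset (Jena 2.x-era packages).
// The TDB directory and the query text are placeholders.
import com.hp.hpl.jena.query.Dataset;
import com.hp.hpl.jena.query.QueryExecution;
import com.hp.hpl.jena.query.QueryExecutionFactory;
import com.hp.hpl.jena.query.QueryFactory;
import com.hp.hpl.jena.query.ResultSet;
import com.hp.hpl.jena.tdb.TDBFactory;

public class TdbQuery {
    public static void main(String[] args) {
        Dataset dataset = TDBFactory.createDataset("/var/data/tdb");
        String sparql = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10";
        QueryExecution qe = QueryExecutionFactory.create(
                QueryFactory.create(sparql), dataset);
        try {
            // iterate over the solutions; a real application would bind
            // specific variables instead of dumping whole rows
            ResultSet results = qe.execSelect();
            while (results.hasNext()) {
                System.out.println(results.next());
            }
        } finally {
            qe.close();
            dataset.close();
        }
    }
}
```

Because the inferred triples were materialized into the store beforehand, the query sees them like any other data; no reasoner runs at query time.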

IMO if you need reasoning support over the whole knowledge base you
should use a system that natively supports it. While the above
workflows would allow you to mimic such functionality, it will become
impractical as the amount of data grows.

best
Rupert







-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen

Re: Working with large RDF data

Posted by Reto Bachmann-Gmür <re...@wymiwyg.com>.
Why in memory? The TDB-based Clerezza store is quite efficient, so why
not add the data to such a graph?

reto



RE: Working with large RDF data

Posted by ar...@wipro.com.
Thanks a lot, Rupert.
If the RDF data is smaller (i.e. it can fit into memory), is there a way we can import it into Stanbol and do a joint search across the enhancements from unstructured text as well as the imported RDF data?
If yes, would this import be permanent, or would it need to be repeated each time?


Thanks and Rgds,
Arthi




Re: Working with large RDF data

Posted by Rupert Westenthaler <ru...@gmail.com>.
Hi Arthi

AFAIK the reasoning and rule components of Apache Stanbol are intended
to be used in "sessions". They are not intended to be used on a whole
knowledge base. A typical use case would be validating RDF data
retrieved from a remote server (e.g. Linked Data) against some
validation rules, or rewriting RDF generated by the Enhancer (Refactor
Engine).

Applying rules and reasoning to a whole knowledge base (RDF data that
does not fit in memory) is not a typical use case.

Based on your problem description you might want to have a look at

* Apache Marmotta and the KiWi triple store
(http://marmotta.incubator.apache.org/kiwi/introduction.html): a
Sesame Sail implementation that supports reasoning
* OWLIM (http://www.ontotext.com/owlim): a commercial product that
also implements reasoning on top of the Sesame API.

But I am not an expert in these topics, so there might be additional
options I am not aware of.

hope this helps
best
Rupert


On Fri, Sep 13, 2013 at 1:48 PM,  <ar...@wipro.com> wrote:
> Hi,
>
>   I have large RDF data.
>
>    The requirement is to be able to reason / run rules on this data /
>
> search this data along with any other unstructured data which  I have enhanced using  Stanbol.
>
>
>
> Any pointers on how I can achieve this?
>
>
>
>
>
> Thanking you and Rgds,
>
> Arthi
>
>
>
>



-- 
| Rupert Westenthaler             rupert.westenthaler@gmail.com
| Bodenlehenstraße 11                             ++43-699-11108907
| A-5500 Bischofshofen