Posted to users@jena.apache.org by Davy Cox <da...@gmail.com> on 2015/11/27 14:35:16 UTC

Persistent Jena Fuseki on AWS

I would like to set up a redundant Jena Fuseki RDF/SPARQL server on AWS.

I'd like to do this with the AWS container service (Docker), which runs the
server as a Docker container, allowing me to scale the Jena SPARQL server as
required for the load.

Locally I already use the "stain/jena-fuseki" Docker image, which works great,
but I'm a bit puzzled about the best practice for saving the data in one
location while allowing access from multiple jena-fuseki Docker instances at
the same time. (I read somewhere that it should only be written to by one
instance?)

I see Jena supports relational databases as a back-end via TDB (which would
probably allow me to use one of the AWS RDS databases), but I also read
somewhere that this is not recommended.

What would you recommend for setting up a multi-container jena-fuseki server
where data is stored in one persistent location (like DynamoDB or S3?),
allowing scaling and redundancy but at the same time fault tolerance?

Thank you in advance!

Re: Persistent Jena Fuseki on AWS

Posted by Andy Seaborne <an...@apache.org>.

A successful setup that I have seen used (and used myself) is to have 2+
Fuseki instances (with TDB) behind an AWS load balancer.  This supports
data publishing ("reads" from the load balancer side).

Update is done by updating each server (in parallel or in sequence).  It
is then not transactional across the 2+ servers; it is only transactional
at each server.
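
The update procedure above could be sketched roughly as follows. This is
only an illustration under stated assumptions, not the setup used in the
thread: the server URLs and the dataset name "ds" are hypothetical, and it
assumes each Fuseki instance exposes a standard SPARQL 1.1 Protocol update
endpoint. Servers are updated in sequence and the outcome is recorded per
server, because nothing rolls back a server that succeeded when another
one fails.

```python
import urllib.request
from typing import Callable, Dict, Iterable


def post_update(endpoint: str, update: str, timeout: float = 30.0) -> int:
    """POST a SPARQL Update request directly to one endpoint.

    Returns the HTTP status code. urllib raises an exception for
    error responses and for connection problems.
    """
    request = urllib.request.Request(
        endpoint,
        data=update.encode("utf-8"),
        headers={"Content-Type": "application/sparql-update; charset=utf-8"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


def update_all(
    endpoints: Iterable[str],
    update: str,
    send: Callable[[str, str], int] = post_update,
) -> Dict[str, str]:
    """Apply the same update to every server, one after another.

    Each server commits (or fails) independently; there is no
    transaction spanning the servers. The returned dict maps each
    endpoint to "ok" or to a short failure description so the caller
    can retry or repair the servers that are now out of step.
    """
    results: Dict[str, str] = {}
    for endpoint in endpoints:
        try:
            status = send(endpoint, update)
            results[endpoint] = "ok" if 200 <= status < 300 else f"http {status}"
        except Exception as exc:  # keep going so the other servers are still updated
            results[endpoint] = f"failed: {exc}"
    return results


# Hypothetical usage (host names and dataset name are assumptions):
#
#   servers = [
#       "http://fuseki-1.example:3030/ds/update",
#       "http://fuseki-2.example:3030/ds/update",
#   ]
#   outcome = update_all(servers, 'INSERT DATA { <urn:x:s> <urn:x:p> "o" }')
#   stale = [url for url, state in outcome.items() if state != "ok"]
```

Any server listed as failed serves stale data until the update is re-applied
to it, so it may be worth taking it out of the load balancer in the meantime.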

	Andy

On 27/11/15 15:03, A. Soroka wrote:
> This is a Jena-specific list, so maybe not the best place to research that question, but there is a new development for Jena called TDB2 that may be of interest, announced here:
>
> https://mail-archives.apache.org/mod_mbox/jena-dev/201506.mbox/%3C5575B7B3.8020101@apache.org%3E
>
> Andy would be able to say more about that.
>
> ---
> A. Soroka
> The University of Virginia Library

Re: Persistent Jena Fuseki on AWS

Posted by "A. Soroka" <aj...@virginia.edu>.

This is a Jena-specific list, so maybe not the best place to research that question, but there is a new development for Jena called TDB2 that may be of interest, announced here:

https://mail-archives.apache.org/mod_mbox/jena-dev/201506.mbox/%3C5575B7B3.8020101@apache.org%3E

Andy would be able to say more about that.

---
A. Soroka
The University of Virginia Library

> On Nov 27, 2015, at 9:03 AM, Davy Cox <da...@gmail.com> wrote:
> 
> What would you then suggest as a server deployment for SPARQL and RDF
> support that allows HA without expensive licensing?
> I'm currently looking at Blazegraph (Bigdata)?
> 
> Any suggestions?

Re: Persistent Jena Fuseki on AWS

Posted by Davy Cox <da...@gmail.com>.

What would you then suggest as a server deployment for SPARQL and RDF
support that allows HA without expensive licensing?
I'm currently looking at Blazegraph (Bigdata)?

Any suggestions?

On Fri, Nov 27, 2015 at 3:00 PM, A. Soroka <aj...@virginia.edu> wrote:

> > I see jena supports relational databases as a back-end via TDB (which
> > would probably allow me to use one of the AWS RDS), but because I also read
> > somewhere this is not recommended.
>
> This isn’t quite accurate. Jena supports relational databases as backends
> via SDB, not TDB, and SDB isn’t recommended for performance reasons
> (possibly amongst others that I don’t know about). TDB is more performant
> and much closer to the front of development.
>
> ---
> A. Soroka
> The University of Virginia Library

Re: Persistent Jena Fuseki on AWS

Posted by "A. Soroka" <aj...@virginia.edu>.

> I see jena supports relational databases as a back-end via TDB (which would probably allow me to use one of the AWS RDS), but because I also read somewhere this is not recommended.

This isn’t quite accurate. Jena supports relational databases as backends via SDB, not TDB, and SDB isn’t recommended for performance reasons (possibly amongst others that I don’t know about). TDB is more performant and much closer to the front of development.

---
A. Soroka
The University of Virginia Library
