Posted to dev@rya.apache.org by Plamen Tarkalanov <p_...@abv.bg> on 2020/08/05 13:23:37 UTC

Benchmark question

Hello,

I am trying to find benchmarks for the database.
I was able to find only this paper from 2013: https://www.usna.edu/Users/cs/adina/research/Rya_ISjournal2013.pdf
Do you have anything fresher?

Your site says that the DB offers “query processing techniques that scale to billions of triples”.
Is the database supposed to handle more than 100 B triples?
According to https://www.w3.org/wiki/LargeTripleStores there are single-node implementations which scale into the billions.

Thanks!
Plamen


Re: Benchmark question

Posted by Brad Rushworth <br...@remote.com.au>.
Hi Plamen,

Apologies for the delay in responding to you.

Accumulo itself, the default underlying database of Rya, is known to scale to very large instances. Clusters of hundreds or thousands of nodes are common. Those store tens, even hundreds, of petabytes. A lot more than 100B triples would fit. Accumulo scales well as a warehouse of data.
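
As a rough back-of-the-envelope check of the claim above, the sketch below estimates the raw storage needed for 100 B triples. Every constant except the triple count is an illustrative assumption (average triple size, number of index copies, replication factor), not a measured Rya or Accumulo figure.

```python
# Back-of-the-envelope storage estimate; every constant below is an assumption.
TRIPLES = 100_000_000_000   # 100 B triples, the figure from the question
BYTES_PER_TRIPLE = 200      # assumed average serialized size of one triple
INDEX_COPIES = 3            # assumed: each triple is stored in three index tables
REPLICATION = 3             # assumed HDFS-style replication factor


def estimated_bytes(triples, bytes_per_triple, index_copies, replication):
    """Uncompressed bytes needed, ignoring compression and key overhead."""
    return triples * bytes_per_triple * index_copies * replication


total = estimated_bytes(TRIPLES, BYTES_PER_TRIPLE, INDEX_COPIES, REPLICATION)
print(f"{total / 1e12:.0f} TB")  # prints "180 TB"
```

Under these assumptions the total comes to roughly 180 TB, well under a single petabyte.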

As far as I can tell, Rya doesn't do anything fancy that would limit the underlying scalability of Accumulo, although that depends on how heavily you use RDFS and OWL ontology features, for instance. There are probably two limiting factors:

  1.  MapReduce and/or Fluo are used to pre-compute answers for various indexes and optimization strategies. These frameworks are distributed across a network and read from disk, so they are never going to be as fast as alternatives that keep everything in memory on a huge hardware node (a machine with a few TB of RAM, for instance).
  2.  Queries are performed through a single Tomcat instance, so some query results (for example, complex joins) may need to fit into memory on that box. Simple queries stream, though.
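
The difference in point 2 between a query that streams and one that needs memory can be illustrated with a generic sketch. This is not how Rya or its query engine implements joins; the function names and the sample data are hypothetical.

```python
from collections import defaultdict


def stream_matches(triples, predicate):
    """Simple pattern: yield each match as it is scanned, keeping nothing in memory."""
    for s, p, o in triples:
        if p == predicate:
            yield (s, o)


def hash_join(left, right):
    """Join (key, value) pairs on key; the whole left side is held in memory."""
    table = defaultdict(list)
    for key, value in left:
        table[key].append(value)
    for key, value in right:
        for left_value in table.get(key, []):
            yield (key, left_value, value)


# Hypothetical sample data
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]

knows = stream_matches(triples, "knows")
ages = stream_matches(triples, "age")
joined = list(hash_join(knows, ages))
print(joined)  # [('alice', 'bob', '30')]
```

The single-pattern scan needs constant memory, whereas the join has to buffer one entire input before it can emit anything, which is where a single query node can become the bottleneck.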

Hope this helps.

Brad




Re: Benchmark question

Posted by Adina Crainiceanu <ad...@usna.edu>.
Hi Plamen,

Thank you for your interest in Rya. I don't have more recent benchmark
results besides those reported in the paper you found. Maybe someone else
on this list tried running Rya with more than 100 B triples and can answer
your question.

To make sure you don't miss an answer, I suggest you subscribe to the dev
list by sending an email to dev-subscribe@rya.apache.org (see
https://rya.apache.org/community/).

Best regards,
Adina


-- 
Dr. Adina Crainiceanu
Professor
Computer Science Department
United States Naval Academy
410-293-6822
adina@usna.edu
http://www.usna.edu/Users/cs/adina/