Posted to dev@rya.apache.org by "Ly, Kiet" <Ki...@finra.org> on 2016/06/15 00:48:51 UTC

RYA as triple stores questions?

We are trying to evaluate RYA as a potential semantic data lake running on an EMR cluster. If I can get some answers to these questions, it would help us greatly.

What is the largest triple store currently in production using RYA, in terms of billions/trillions of triples?
Are any of the production installations running on EMR in AWS? If yes, what are the instance types, # of nodes, # of ZooKeepers, etc.?
Are there any hard limitations (# of nodes, min/max memory, etc.)?
Any operational issues with Accumulo? We have a lot of experience with HBase but not with Accumulo.




Confidentiality Notice::  This email, including attachments, may include non-public, proprietary, confidential or legally privileged information.  If you are not an intended recipient or an authorized agent of an intended recipient, you are hereby notified that any dissemination, distribution or copying of the information contained in or transmitted with this e-mail is unauthorized and strictly prohibited.  If you have received this email in error, please notify the sender by replying to this message and permanently delete this e-mail, its attachments, and any copies of it immediately.  You should not retain, copy or use this e-mail or any attachment for any purpose, nor disclose all or any part of the contents to any other person. Thank you.

Re: RYA as triple stores questions?

Posted by "Ly, Kiet" <Ki...@finra.org>.
I know. It is the best way to reach a lot of people at once.
Of course, if anyone wants to share deployment info, please email me privately at kiet.ly@finra.org.


I saw your list of inference properties (rdfs/owl) and it is very basic. Is there a plan to improve it?
Would you bring in an OWL 2 reasoner (Pellet/HermiT)? I will attend the bimonthly meeting later today.



On 6/15/16, 9:49 AM, "Puja Valiyil" <pu...@gmail.com> wrote:

>Hi Kiet,
>I'm not sure that a lot of us can disclose over the dev list a lot of the
>places that Rya is currently running operationally.
>As far as your other questions, I'm not aware of any hard limitations
>though I'm not sure I understood the question -- maybe Aaron or Adina (or
>someone else) might have more insight?
>I haven't heard of many operational issues with Accumulo, other than a lot
>of performance can be configuration dependent.  Accumulo has a lot of
>documentation on configuration and installation.  If you have more
>specifics about your particular deployment, we may be able to help more.
>
>On Tue, Jun 14, 2016 at 8:48 PM, Ly, Kiet <Ki...@finra.org> wrote:
>
>> We are trying to evaluate RYA as potential semantic data lake running on
>> EMR cluster. If I can get some answer to these questions, it would help us
>> greatly.
>>
>> What is the largest triples store currently in production using RYA in
>> term of billion/trillion of triples?
>> Any of the production installation running on EMR in AWS? If yes, what is
>> the instance types, #of nodes, # of zookeepers, etc..
>> Are there any hard limitations (# of nodes, min/max memory, etc…)?
>> Any operational issues with Accumulo? We had a lot of experience with
>> Hbase but not with Accumulo.

Re: RYA as triple stores questions?

Posted by Puja Valiyil <pu...@gmail.com>.
Hi Kiet,
I'm not sure that a lot of us can disclose over the dev list a lot of the
places that Rya is currently running operationally.
As far as your other questions, I'm not aware of any hard limitations
though I'm not sure I understood the question -- maybe Aaron or Adina (or
someone else) might have more insight?
I haven't heard of many operational issues with Accumulo, other than that
performance can depend heavily on configuration.  Accumulo has a lot of
documentation on configuration and installation.  If you have more
specifics about your particular deployment, we may be able to help more.

On Tue, Jun 14, 2016 at 8:48 PM, Ly, Kiet <Ki...@finra.org> wrote:

> We are trying to evaluate RYA as potential semantic data lake running on
> EMR cluster. If I can get some answer to these questions, it would help us
> greatly.
>
> What is the largest triples store currently in production using RYA in
> term of billion/trillion of triples?
> Any of the production installation running on EMR in AWS? If yes, what is
> the instance types, #of nodes, # of zookeepers, etc..
> Are there any hard limitations (# of nodes, min/max memory, etc…)?
> Any operational issues with Accumulo? We had a lot of experience with
> Hbase but not with Accumulo.