Posted to dev@atlas.apache.org by "Zhengyi Liu (Jira)" <ji...@apache.org> on 2020/02/03 20:52:00 UTC
[jira] [Commented] (ATLAS-488) Atlas service scalability

    [ https://issues.apache.org/jira/browse/ATLAS-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029258#comment-17029258 ] 

Zhengyi Liu commented on ATLAS-488:
-----------------------------------

Hey [~vmadugun], I am setting up an Atlas production environment and am curious about the progress of this task. It seems no further effort has gone into it since the last update, and I would like to know whether that would block us from adopting Atlas and eventually ingesting 40 million+ entities. Is there an existing benchmark from industry peers that demonstrates the scalability of the existing solution? That would give people confidence even with the current approach (single node with HA).

> Atlas service scalability
> -------------------------
>
>                 Key: ATLAS-488
>                 URL: https://issues.apache.org/jira/browse/ATLAS-488
>             Project: Atlas
>          Issue Type: Improvement
>            Reporter: venkata madugundu
>            Priority: Major
>
> Requirement -- We are looking for ways to run the Atlas service as multiple instances for cloud deployment, and this requirement is crucial to the adoption of Atlas.
> On the dev discussion forum, I discussed earlier (Jan 25th) the requirement to run multiple instances of the Atlas service/runtime for horizontal scale. Per the answers that we got from Hemanth, Atlas currently has a limitation that only one instance can be active at any given point in time. A few of the reasons cited were ...
> 1) For performance reasons, the typesystem is cached in memory. Type lookup is needed for query evaluation, instance serialization/deserialization, etc.
> 2) Titan takes locks for HBase transactions, causing performance degradation.
> Of the above reasons,
>  #1 can be mitigated in our scenarios. The types that we create are more or less design-time application data models (metamodels) and they do not change at runtime. So if I pre-register types with Atlas before my cluster is up, I am hoping #1 is not an issue. That is, however many Atlas instances come up, all would have the same state, which would mean I am guaranteed API correctness.
> I am not so sure how we can avoid #2. I understand neither the persistence mapping of metadata to the graph (Atlas) nor the mapping of the graph to HBase (Titan) well enough to comment on whether #2 is really a problem for us in terms of lock contention. Atlas, as a REST API, inherently does not offer its end users any "transaction"-like facility for running a set of API calls. Given that, we are more likely to implement some kind of optimistic locking strategy to avoid data inconsistencies.
> Are there any other reasons you can think of why multiple Atlas runtime instances cannot talk to a single HBase cluster for horizontal scale?
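
For readers unfamiliar with the optimistic locking strategy mentioned in the description, here is a minimal sketch of the general idea. It is an illustration only: VersionedStore, VersionConflict and update_with_retry are hypothetical names, the store is an in-memory stand-in for a shared backend, and nothing here uses the Atlas REST API, Titan or HBase. Each entity carries a version number, a write succeeds only if the caller read the latest version, and a caller that loses the race re-reads and retries.

```python
import threading


class VersionConflict(Exception):
    """Raised when a write is based on a stale version of an entity."""


class VersionedStore:
    """Hypothetical in-memory store standing in for a shared backend."""

    def __init__(self):
        self._data = {}  # entity id -> (version, attributes)
        self._lock = threading.Lock()  # guards only the compare-and-set step

    def read(self, entity_id):
        """Return (version, copy of attributes); version 0 means 'not stored yet'."""
        with self._lock:
            version, attrs = self._data.get(entity_id, (0, {}))
            return version, dict(attrs)

    def write(self, entity_id, expected_version, attrs):
        """Store attrs only if the entity is still at expected_version."""
        with self._lock:
            current_version, _ = self._data.get(entity_id, (0, {}))
            if current_version != expected_version:
                raise VersionConflict(
                    f"{entity_id}: expected v{expected_version}, found v{current_version}"
                )
            new_version = current_version + 1
            self._data[entity_id] = (new_version, dict(attrs))
            return new_version


def update_with_retry(store, entity_id, mutate, max_attempts=5):
    """Read, apply mutate(attrs) -> new attrs, write back; retry on conflict."""
    for _ in range(max_attempts):
        version, attrs = store.read(entity_id)
        try:
            return store.write(entity_id, version, mutate(attrs))
        except VersionConflict:
            continue  # another writer got there first; re-read and try again
    raise VersionConflict(f"{entity_id}: gave up after {max_attempts} attempts")
```

In this sketch no lock is held between the read and the write, so concurrent writers never block each other for long; the cost is that a writer working from stale data has to redo its update.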



--
This message was sent by Atlassian Jira
(v8.3.4#803005)