Posted to dev@atlas.apache.org by "venkata madugundu (JIRA)" <ji...@apache.org> on 2016/05/04 10:07:12 UTC

[jira] [Updated] (ATLAS-683) Refactor local type-system cache with cache provider interface

     [ https://issues.apache.org/jira/browse/ATLAS-683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

venkata madugundu updated ATLAS-683:
------------------------------------
    Attachment: RedisTypeCacheProvider.java

FYI: I have attached an experimental Redis-based implementation of the cache provider interface. Writing it helped me validate the completeness of the cache provider interface against two implementations.

Tested it with Atlas and Redis running on the same machine. All Type, Entity, Query, and Lineage operations worked fine. I ran quick_start.py and sanity-tested the UI by navigating it, running queries, and viewing the Lineage graph.

> Refactor local type-system cache with cache provider interface
> --------------------------------------------------------------
>
>                 Key: ATLAS-683
>                 URL: https://issues.apache.org/jira/browse/ATLAS-683
>             Project: Atlas
>          Issue Type: Sub-task
>    Affects Versions: 0.7-incubating
>            Reporter: venkata madugundu
>            Assignee: venkata madugundu
>            Priority: Critical
>              Labels: high-availability, performance, scalability
>             Fix For: 0.7-incubating
>
>         Attachments: ATLAS-683-1.patch, ATLAS-683.patch, RedisTypeCacheProvider.java
>
>
> As noted in ATLAS-488, the local type-system cache makes the Atlas runtime stateful and prevents multiple Atlas instances from being active in a cluster. Either the type cache should be synced across Atlas instances (on all type create/update requests), or it should be moved out of Atlas into something like a distributed cache.
> 1. As a first step, the local type-cache code in TypeSystem.java can be carved out behind an interface such as TypeCacheProvider, whose default implementation for a standalone Atlas server would just use an in-process local cache. The cache provider implementation itself could be specified as an optional configuration property. Expert users of Atlas can then inject a custom cache provider that talks to a distributed cache. We are evaluating the use of a distributed cache.
> 2. As a second step, more refactoring can be done to minimize the calls made to TypeSystem for type lookups. Essentially, within a given transaction/request, once a type has been looked up, it should not be queried again. A request-scoped variable (Guice would probably help with that scoping) can hold all the lookups made in a request. This might sound like a cache of a cache, but it should reduce the hits to the cache provider when the provider is backed by a remote cache.
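
The two steps described above could be sketched roughly as follows. This is only an illustration, not the code in the attached patches: TypeCacheProvider is the name used in the description, but its methods, the TypeDef placeholder, LocalTypeCacheProvider, TypeCacheProviders.fromConfig and RequestScopedTypeLookup are all hypothetical names chosen for this example. The request-scoped holder is written as a plain object, with no Guice wiring.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an Atlas type definition (not the real Atlas class).
final class TypeDef {
    final String name;
    TypeDef(String name) { this.name = name; }
}

// Step 1: cache provider interface. The method set is an assumption.
interface TypeCacheProvider {
    boolean has(String typeName);
    TypeDef get(String typeName);   // returns null when the type is not cached
    void put(TypeDef type);
    void remove(String typeName);
    void clear();
}

// Default for a standalone server: a plain in-process map.
final class LocalTypeCacheProvider implements TypeCacheProvider {
    private final Map<String, TypeDef> types = new ConcurrentHashMap<>();
    public boolean has(String typeName) { return types.containsKey(typeName); }
    public TypeDef get(String typeName) { return types.get(typeName); }
    public void put(TypeDef type) { types.put(type.name, type); }
    public void remove(String typeName) { types.remove(typeName); }
    public void clear() { types.clear(); }
}

// Picks the provider from an optional configuration value (a class name).
// How Atlas would actually read this property is not shown here.
final class TypeCacheProviders {
    static TypeCacheProvider fromConfig(String className) {
        if (className == null || className.isEmpty()) {
            return new LocalTypeCacheProvider();   // property not set: use the default
        }
        try {
            return (TypeCacheProvider) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create cache provider " + className, e);
        }
    }
}

// Step 2: per-request memo placed in front of the provider. One instance is
// meant to live for a single request, so a plain HashMap is enough.
final class RequestScopedTypeLookup {
    private final TypeCacheProvider provider;
    private final Map<String, TypeDef> seen = new HashMap<>();
    private int providerHits = 0;

    RequestScopedTypeLookup(TypeCacheProvider provider) { this.provider = provider; }

    TypeDef lookup(String typeName) {
        TypeDef cached = seen.get(typeName);
        if (cached != null) {
            return cached;              // already resolved in this request
        }
        providerHits++;                 // goes to the (possibly remote) provider
        TypeDef fetched = provider.get(typeName);
        if (fetched != null) {
            seen.put(typeName, fetched);   // misses are not memoized
        }
        return fetched;
    }

    int providerHits() { return providerHits; }
}
```

With this shape, a second lookup of the same type within one request never reaches the provider, which is where the saving would come from if the provider is remote. A Redis-backed provider would implement the same interface and be selected through the configuration value.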



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)