Posted to dev@atlas.apache.org by "Sharmadha Sainath (JIRA)" <ji...@apache.org> on 2016/05/10 06:59:12 UTC

[jira] [Created] (ATLAS-769) Queries on CTAS tables complete faster than regular tables.

Sharmadha Sainath created ATLAS-769:
---------------------------------------

             Summary: Queries on CTAS tables complete faster than regular tables.
                 Key: ATLAS-769
                 URL: https://issues.apache.org/jira/browse/ATLAS-769
             Project: Atlas
          Issue Type: Bug
         Environment: Cluster Setup :
Machine 1 : Atlas Server, Solr
Machine 2 : HBase , External Kafka 
Machine 3 : Client


Atlas : c69df40f7c069646b613ebb58739f6be47ea0f89 
with patch ATLAS-690-4.PATCH 
            Reporter: Sharmadha Sainath


Atlas is populated with 10,000 tables (6,000 tables with 10 columns, 3,000 tables with 50 columns, and 1,000 tables with 100 columns). Read load testing is done by simulating 30 users, each running 5 queries one after another, 80 times. This is simulated with Apache JMeter by creating 5 samplers (queries) in a thread group and setting number of users = 30 and number of loops = 80.
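
For reference, the size of the run described above can be worked out with a short sketch. It assumes each JMeter sampler issues exactly one HTTP request per iteration, which the report does not state explicitly; the function and constant names are illustrative only.

```python
# Load shape as described in the report (JMeter thread group settings).
USERS = 30             # number of threads (simulated users)
QUERIES_PER_LOOP = 5   # samplers in the thread group
LOOPS = 80             # loop count per user


def total_requests(users: int, queries_per_loop: int, loops: int) -> int:
    """Total requests for the run, assuming one request per sampler execution."""
    return users * queries_per_loop * loops


requests_per_user = QUERIES_PER_LOOP * LOOPS
total = total_requests(USERS, QUERIES_PER_LOOP, LOOPS)

print(requests_per_user)  # 400
print(total)              # 12000
```

Under that assumption, each run issues 12,000 requests in total, 400 per simulated user.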

5 queries:
1. Get table given name
/api/atlas/discovery/search/dsl?query=hive_table+where+name%3D%27database.table@cluster%27
2. Get details of a table
/api/atlas/entities/guid
3. Get schema of table
/api/atlas/lineage/hive/table/"table"/schema
4. Lineage input graph
/api/atlas/lineage/hive/table/"table"/inputs/graph
5. Lineage output graph
/api/atlas/lineage/hive/table/"table"/outputs/graph
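
The five requests listed above could be assembled roughly as follows. This is only a sketch: the base URL, the helper name build_requests, and its parameters are hypothetical, and the report does not say what the "table" placeholder in the lineage paths stands for, so it is passed through as an opaque value.

```python
from urllib.parse import quote, quote_plus

BASE_URL = "http://atlas-host:21000"  # hypothetical host and port, not from the report


def build_requests(base_url: str, qualified_name: str, guid: str, table: str) -> list[str]:
    """Return the five request URLs in the order the test runs them."""
    # DSL query from the report; '@' is left unescaped to match the report's URL.
    dsl = f"hive_table where name='{qualified_name}'"
    query = quote_plus(dsl, safe="@")
    # Value substituted for the "table" placeholder (its meaning is an assumption).
    table_part = quote(table, safe="@")
    lineage = f"{base_url}/api/atlas/lineage/hive/table/{table_part}"
    return [
        f"{base_url}/api/atlas/discovery/search/dsl?query={query}",  # 1. table by name
        f"{base_url}/api/atlas/entities/{guid}",                     # 2. table details
        f"{lineage}/schema",                                         # 3. schema
        f"{lineage}/inputs/graph",                                   # 4. lineage inputs
        f"{lineage}/outputs/graph",                                  # 5. lineage outputs
    ]


urls = build_requests(BASE_URL, "database.table@cluster", "guid", "table")
for url in urls:
    print(url)
```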

Then 2,000 CTAS tables were created (20% of 10,000: 1,200 small tables, 600 medium tables, and 200 large tables), and the same test was run on the CTAS tables.

The test run on regular tables takes longer to complete than the run on CTAS tables.

Regular tables run : 20 mins
CTAS tables run : 17 mins
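
To put the two durations in perspective, here is a rough sketch of the implied throughput. It assumes both runs issued the same 30 x 5 x 80 requests and that the reported times cover the whole run; neither is confirmed in the report, and the names are illustrative.

```python
TOTAL_REQUESTS = 30 * 5 * 80  # users x queries x loops, assuming one request per sampler

REGULAR_MINUTES = 20  # reported duration, regular tables
CTAS_MINUTES = 17     # reported duration, CTAS tables


def throughput(total_requests: int, minutes: float) -> float:
    """Average requests per second over the whole run."""
    return total_requests / (minutes * 60)


regular_rps = throughput(TOTAL_REQUESTS, REGULAR_MINUTES)
ctas_rps = throughput(TOTAL_REQUESTS, CTAS_MINUTES)
time_saved_pct = (REGULAR_MINUTES - CTAS_MINUTES) / REGULAR_MINUTES * 100

print(f"regular: {regular_rps:.2f} req/s")          # regular: 10.00 req/s
print(f"CTAS:    {ctas_rps:.2f} req/s")             # CTAS:    11.76 req/s
print(f"CTAS run is {time_saved_pct:.0f}% shorter")  # CTAS run is 15% shorter
```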

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)