Posted to dev@ambari.apache.org by "Jonathan Hurley (JIRA)" <ji...@apache.org> on 2015/08/07 17:15:46 UTC

[jira] [Reopened] (AMBARI-12657) Cluster creates fail on larger deployments with SQL Azure DB

     [ https://issues.apache.org/jira/browse/AMBARI-12657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Hurley reopened AMBARI-12657:
--------------------------------------

> Cluster creates fail on larger deployments with SQL Azure DB
> ------------------------------------------------------------
>
>                 Key: AMBARI-12657
>                 URL: https://issues.apache.org/jira/browse/AMBARI-12657
>             Project: Ambari
>          Issue Type: Bug
>          Components: ambari-server
>    Affects Versions: 2.0.0
>            Reporter: Jonathan Hurley
>            Assignee: Jonathan Hurley
>            Priority: Critical
>             Fix For: 2.1.1
>
>         Attachments: AMBARI-12657.patch
>
>
> We started doing larger cluster creates (48 worker nodes) with SQL Azure DB as the Ambari DB, and we are seeing HTTP GET requests time out on the client side (even after retries), resulting in cluster create failures (15%). This is a tracking Jira to resolve the CRUD failures.
> In some of my experiments with 48-node clusters, DB CPU usage goes above 50%, which might explain why SQL is running slowly.
> end_time                  avg_cpu_percent   avg_data_io_percent   avg_log_write_percent   avg_memory_usage_percent
> 2015-08-05 18:51:24.153   40.89             0.00                  0.62                    0.67
> 2015-08-05 18:51:09.107   41.86             0.00                  1.49                    0.67
> 2015-08-05 18:50:54.090   24.36             0.00                  0.08                    0.67
> 2015-08-05 18:50:38.763   43.16             0.00                  0.57                    0.67
> 2015-08-05 18:50:23.700   65.03             0.00                  0.51                    0.67
> 2015-08-05 18:50:07.840   28.57             0.00                  0.45                    0.67
> 2015-08-05 18:49:49.480   39.78             0.00                  0.42                    0.67
> 2015-08-05 18:49:34.383   28.14             0.00                  0.43                    0.67
> The most expensive queries in terms of CPU time are below.
> Essentially, a single query consumes most of the CPU. Its query plan is also attached.
> {code}
> SELECT DISTINCT t0.request_id FROM host_role_command t0 WHERE NOT EXISTS (SELECT @P0 FROM host_role_command t1 WHERE (t1.status IN (@P1,@P2,@P3,@P4,@P5,@P6,@P7,@P8,@P9)))  ORDER BY t0.request_id ASC
> {code}
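
As a rough cross-check of the CPU figures quoted in the description, the eight samples can be summarized with a short script. This is only a sketch: the values are copied by hand from the table above, the 50% threshold comes from the wording of the report, and the names SAMPLES, THRESHOLD and summarize are hypothetical helpers, not part of Ambari.

```python
# CPU samples copied from the quoted report: (end_time, avg_cpu_percent).
SAMPLES = [
    ("2015-08-05 18:51:24.153", 40.89),
    ("2015-08-05 18:51:09.107", 41.86),
    ("2015-08-05 18:50:54.090", 24.36),
    ("2015-08-05 18:50:38.763", 43.16),
    ("2015-08-05 18:50:23.700", 65.03),
    ("2015-08-05 18:50:07.840", 28.57),
    ("2015-08-05 18:49:49.480", 39.78),
    ("2015-08-05 18:49:34.383", 28.14),
]

# Assumed threshold, taken from "above 50%" in the report.
THRESHOLD = 50.0


def summarize(samples, threshold=THRESHOLD):
    """Return the mean, the peak, and the samples strictly above the threshold."""
    cpu = [value for _, value in samples]
    over = [(ts, value) for ts, value in samples if value > threshold]
    return {
        "mean": sum(cpu) / len(cpu),
        "peak": max(cpu),
        "over_threshold": over,
    }


if __name__ == "__main__":
    summary = summarize(SAMPLES)
    print("mean CPU %: {:.2f}".format(summary["mean"]))
    print("peak CPU %: {:.2f}".format(summary["peak"]))
    for ts, value in summary["over_threshold"]:
        print("above threshold at {}: {:.2f}".format(ts, value))
```

For this particular window, one of the eight samples exceeds 50% (65.03 at 18:50:23.700) and the mean is about 39%, which is consistent with the report's wording that CPU goes above 50% only in some experiments.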


