Posted to dev@ambari.apache.org by Thomas Friedrich <tf...@yahoo.com> on 2015/10/09 00:20:43 UTC

Review Request 39146: Make mysql service check more robust

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/
-----------------------------------------------------------

Review request for Ambari and Andrew Onischuk.


Bugs: AMBARI-13238
    https://issues.apache.org/jira/browse/AMBARI-13238


Repository: ambari


Description
-------

The MySQL service check in mysql_service.py simply checks for a process named mysqld. In our environment, a different service ran another MySQL instance on that node. As a result, the status of the MySQL service in Hive showed green (because a mysqld process was found) even though the instance used by Hive wasn't started. That also made the "Start Service" action for Hive fail, because the metastore service couldn't connect to the MySQL database.

The proposed fix makes the service check more robust by retrieving the pid_file of the MySQL instance first by running "mysqladmin variables" and parsing out the pid_file. Then it checks if the process exists. 
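
A minimal sketch of the parsing step, assuming "mysqladmin variables" prints its usual pipe-delimited table. The helper name parse_pid_file and the sample text are made up for illustration; this is not the code under review, which builds the lookup as a shell command.

```python
import re

# Hypothetical helper: pulls the pid_file value out of text shaped like
# the table printed by "mysqladmin variables". Not the code under review.
_PID_FILE_ROW = re.compile(
    r"^\|\s*pid_file\s*\|\s*(.*?)\s*\|\s*$", re.MULTILINE)


def parse_pid_file(variables_output):
    """Return the pid_file path found in the output, or None if absent."""
    match = _PID_FILE_ROW.search(variables_output)
    if match is None or not match.group(1):
        return None
    return match.group(1)


# Sample text invented for this example only.
SAMPLE = """\
+---------------+----------------------------+
| Variable_name | Value                      |
+---------------+----------------------------+
| old_passwords | 0                          |
| pid_file      | /var/run/mysqld/mysqld.pid |
| port          | 3306                       |
+---------------+----------------------------+
"""

print(parse_pid_file(SAMPLE))
```

Matching the whole row rather than grepping for "pid_file" keeps the lookup from picking up other variables whose names or values happen to contain that string.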

As a side note: the initial fix tried to get the pid_file from /etc/my.cnf, but I found that while this is one default location for the MySQL configuration, there are other places where the file can be located. Since the MySQL instance launched by Hive is not given any parameters for the config file or pid_file, MySQL must read them from one of the default locations, and running "mysqladmin variables" must return the config values for the running instance used by Hive. The other running instance needs to set its config file and pid file explicitly, because otherwise it would collide with the default instance.


Diffs
-----

  ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py c48c1ef 
  ambari-server/src/test/python/stacks/2.0.6/HIVE/test_mysql_server.py 1155e9f 

Diff: https://reviews.apache.org/r/39146/diff/


Testing
-------

Updated the unit tests in test_mysql_server.py for the new status command and ran them successfully. Also tested manually on an installed cluster.


Thanks,

Thomas Friedrich


Re: Review Request 39146: Make mysql service check more robust

Posted by Andrew Onischuk <ao...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104590
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py (line 26)
<https://reviews.apache.org/r/39146/#comment162822>

    The MySQL packages differ between OSes.
    Please make sure you have tested this on:
    - CentOS
    - SLES or openSUSE
    - Ubuntu or Debian


- Andrew Onischuk




Re: Review Request 39146: Make mysql service check more robust

Posted by Andrew Onischuk <ao...@hortonworks.com>.

> On Oct. 29, 2015, 8:25 p.m., Dmitro Lisnichenko wrote:
> > ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py, line 26
> > <https://reviews.apache.org/r/39146/diff/1/?file=1093356#file1093356line26>
> >
> >     Also, not sure how it works with a non-root agent (ps of a process run by another user)

Nice catch. With umask 027 and a non-root agent, reading the pid file can fail with 'permission denied'.
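
To make the concern concrete, here is a rough sketch of reading a pid file so that "permission denied" is reported separately from "file missing". The function name read_pid_file and its status strings are hypothetical, and it assumes the pid file holds a single decimal number.

```python
import os
import tempfile


def read_pid_file(path):
    """Return (status, pid); status is 'ok', 'missing', 'unreadable' or 'invalid'."""
    try:
        with open(path) as handle:
            text = handle.read().strip()
    except FileNotFoundError:
        return ("missing", None)
    except PermissionError:
        # Typical for a non-root agent when the file or its directory
        # was created with a restrictive umask.
        return ("unreadable", None)
    if not text.isdigit():
        return ("invalid", None)
    return ("ok", int(text))


# Demo with a throwaway file; the pid value is arbitrary.
with tempfile.NamedTemporaryFile("w", suffix=".pid", delete=False) as tmp:
    tmp.write("4242\n")
print(read_pid_file(tmp.name))
os.remove(tmp.name)
```

Returning a distinct 'unreadable' status lets the caller report a permissions problem instead of claiming that MySQL is stopped.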


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104468
-----------------------------------------------------------




Re: Review Request 39146: Make mysql service check more robust

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104468
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py (line 26)
<https://reviews.apache.org/r/39146/#comment162726>

    Also, not sure how it works with a non-root agent (ps of a process run by another user)


- Dmitro Lisnichenko




Re: Review Request 39146: Make mysql service check more robust

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.

> On Oct. 29, 2015, 10:24 p.m., Dmitro Lisnichenko wrote:
> > ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py, line 25
> > <https://reviews.apache.org/r/39146/diff/1/?file=1093356#file1093356line25>
> >
> >     Also, not sure how it works with a non-root agent (ps of a process run by another user)

I was attempting to point to another line, see below


- Dmitro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104467
-----------------------------------------------------------




Re: Review Request 39146: Make mysql service check more robust

Posted by Dmitro Lisnichenko <dl...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104467
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py (line 25)
<https://reviews.apache.org/r/39146/#comment162725>

    Also, not sure how it works with a non-root agent (ps of a process run by another user)


- Dmitro Lisnichenko




Re: Review Request 39146: Make mysql service check more robust

Posted by Andrew Onischuk <ao...@hortonworks.com>.

> On Oct. 29, 2015, 6:51 p.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py, line 25
> > <https://reviews.apache.org/r/39146/diff/1/?file=1093356#file1093356line25>
> >
> >     Putting too many commands inside each other is risky, especially if someone else in the future modifies the pid_file_cmd to contain an illegal character. It also makes this harder to read.
> >     
> >     Can we rely more on python to parse the pid file name and check if the process is up?

In that case I would be very careful to throw good, understandable exceptions if anything goes wrong. Execute just pipes the bash output and fails if the command went wrong; here we would have to implement that logic ourselves.
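
As an illustration of what that logic might look like, here is a sketch that runs a command with the standard subprocess module and turns every failure into one readable exception. StatusCheckError and run_checked are hypothetical names, and the sketch deliberately does not use Ambari's Execute resource.

```python
import subprocess
import sys


class StatusCheckError(Exception):
    """Hypothetical exception type for a failed status probe."""


def run_checked(cmd):
    """Run cmd (a list of arguments) and return its stdout as text.

    Raises StatusCheckError naming the command, exit code and stderr,
    so the failure is readable in a log.
    """
    try:
        proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
    except OSError as err:
        raise StatusCheckError("could not start %r: %s" % (cmd, err))
    out, err_text = proc.communicate()
    if proc.returncode != 0:
        raise StatusCheckError(
            "%r exited with code %d: %s"
            % (cmd, proc.returncode,
               err_text.decode("utf-8", "replace").strip()))
    return out.decode("utf-8", "replace")


# Stand-in command so the example runs anywhere Python does.
print(run_checked([sys.executable, "-c", "print('ok')"]))
```

Passing the command as an argument list also avoids the shell quoting problems that come with nesting one command string inside another.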


- Andrew


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104459
-----------------------------------------------------------




Re: Review Request 39146: Make mysql service check more robust

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39146/#review104459
-----------------------------------------------------------



ambari-server/src/main/resources/common-services/HIVE/0.12.0.2.0/package/scripts/mysql_service.py (line 25)
<https://reviews.apache.org/r/39146/#comment162721>

    Putting too many commands inside each other is risky, especially if someone else in the future modifies the pid_file_cmd to contain an illegal character. It also makes this harder to read.
    
    Can we rely more on python to parse the pid file name and check if the process is up?
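
One possible shape for the "is the process up" half of this in plain Python, as a sketch only: it assumes a POSIX system, and is_process_running is a hypothetical name. Signal 0 performs the existence and permission checks without delivering a signal.

```python
import errno
import os


def is_process_running(pid):
    """Best-effort liveness check for a single pid on POSIX systems."""
    if pid <= 0:
        # 0 and negative values address process groups, not one process.
        return False
    try:
        os.kill(pid, 0)  # signal 0: checks only, nothing is delivered
    except OSError as err:
        if err.errno == errno.ESRCH:
            return False  # no such process
        if err.errno == errno.EPERM:
            return True   # process exists but belongs to another user
        raise
    return True


print(is_process_running(os.getpid()))
```

Treating EPERM as "running" matters when the checker does not run as root, since the probed process may be owned by a different user.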


- Alejandro Fernandez

