Posted to dev@ambari.apache.org by Di Li <di...@ca.ibm.com> on 2015/09/25 15:16:13 UTC

Review Request 38761: AMBARI-13233: Error message for ambari agent install failure when the ping port is taken by old agent process should state port and old process

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38761/
-----------------------------------------------------------

Review request for Ambari and Alejandro Fernandez.


Bugs: AMBARI-13233
    https://issues.apache.org/jira/browse/AMBARI-13233


Repository: ambari


Description
-------

When the Ambari agent's ping port is held by an old, leftover Ambari agent process, the Ambari agent install fails during cluster installation. The error message simply says "Address already in use", as shown below.
"ERROR 2015-08-14 17:11:26,557 main.py:272 - Failed to start ping port listener of: [Errno 98] Address already in use"

The error message could have been more specific about which port was in use and which process was holding it.

In fact, a more detailed error message already exists in ambari-agent/src/main/python/ambari_agent/PingPortListener.py. It was never used because of the way the awk command parses the PID.

This JIRA attempts to fix the awk parse command so that the Ambari agent Python script throws the exception with a detailed error message.
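
The intended behaviour can be illustrated with a minimal Python sketch. It is not the code in PingPortListener.py: the function name, the message wording, and the use of RuntimeError are all assumptions made for illustration. It only shows why an empty PID string silently skips the detailed error, while a non-empty one triggers it.

```python
# Hypothetical sketch; the names and message wording are illustrative, not Ambari's.
PORT_IN_USE_TEMPLATE = "Could not open port {0}: already in use by process {1}"


def raise_if_port_taken(port, pid_output):
    """Raise a descriptive error when the PID lookup produced a PID.

    pid_output is the raw text returned by the port-owner lookup; an empty
    or whitespace-only string is treated as "no owner found".
    """
    pid = pid_output.strip()
    if pid:
        raise RuntimeError(PORT_IN_USE_TEMPLATE.format(port, pid))


# When the parse yields only whitespace, nothing is raised and the
# detailed message is never produced.
raise_if_port_taken(8670, "\n")

# With a PID available, the failure names both the port and the process.
try:
    raise_if_port_taken(8670, "39248\n")
except RuntimeError as err:
    message = str(err)
    print(message)
```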


Diffs
-----

  ambari-agent/src/main/python/ambari_agent/PingPortListener.py 46be26b 

Diff: https://reviews.apache.org/r/38761/diff/


Testing
-------

Issue:

The logic in PingPortListener.py first uses fuser to check the port, then awk to parse the result.

The fuser command is "fuser {0}/tcp 2>/dev/null", where {0} is replaced by the ping port.
The output is:
[root@test1 tmp]# fuser 8670/tcp 2>/dev/null
 39248[root@test1 tmp]#
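
As a rough illustration of how such a command string might be assembled in Python, the sketch below fills the {0} placeholder with the port and passes the awk program as a second argument. The template name, the function name, and the second placeholder are assumptions for this example, not taken from the patch. Passing the awk program as an argument avoids a common pitfall: a literal `{print $1}` written directly inside a str.format() template would be read as a replacement field unless its braces were doubled.

```python
# Illustrative template: {0} is the port, {1} is the awk program text.
# Substituted values are not re-parsed, so the braces in the awk program are safe.
FUSER_AWK_TEMPLATE = "fuser {0}/tcp 2>/dev/null | awk '{1}'"


def build_port_check_command(port, awk_program="{print $1}"):
    """Return the shell pipeline that reports the PID holding a TCP port."""
    return FUSER_AWK_TEMPLATE.format(port, awk_program)


command = build_port_check_command(8670)
print(command)
```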

The old code's awk parser returns an empty string for the output above, as shown below:
[root@test1 tmp]# fuser 8670/tcp 2>/dev/null | awk '{print $2}'

[root@test1 tmp]#

The fix is to update the awk parser so that it prints $1 instead of $2:
[root@test1 tmp]# fuser 8670/tcp 2>/dev/null | awk '{print $1}'
39248

Now that awk returns a number, the existing error handling logic can print the correct error message.
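
The reason $2 was empty can be reproduced without fuser. The sketch below approximates awk's default field splitting with Python's str.split(); the helper name awk_field is hypothetical. Leading whitespace is ignored, so the fuser output contains exactly one field, and asking for a second field yields an empty string.

```python
def awk_field(line, n):
    """Approximate awk's default field splitting and return field n (1-based).

    As in awk, asking for a field past the last one yields an empty string.
    """
    fields = line.split()  # splits on whitespace runs, ignoring leading/trailing
    return fields[n - 1] if 1 <= n <= len(fields) else ""


fuser_stdout = " 39248"  # the fuser output captured in the transcript above

print(repr(awk_field(fuser_stdout, 2)))  # field selected by the old parser
print(repr(awk_field(fuser_stdout, 1)))  # field selected by the fixed parser
```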

Test:
Installed the Ambari agent on a node where ping port 8670 was held by a leftover Ambari agent process. Verified the error message displayed on the Host registration page in the Ambari web UI.


Thanks,

Di Li


Re: Review Request 38761: AMBARI-13233: Error message for ambari agent install failure when the ping port is taken by old agent process should state port and old process

Posted by Di Li <di...@ca.ibm.com>.

> On Sept. 25, 2015, 5:15 p.m., Alejandro Fernandez wrote:
> > Ship It!

Hello Alejandro,

Could you please help push the change to trunk?

Thank you.

Di


- Di


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38761/#review100617
-----------------------------------------------------------




Re: Review Request 38761: AMBARI-13233: Error message for ambari agent install failure when the ping port is taken by old agent process should state port and old process

Posted by Alejandro Fernandez <af...@hortonworks.com>.

> On Sept. 25, 2015, 5:15 p.m., Alejandro Fernandez wrote:
> > Ship It!
> 
> Di Li wrote:
>     Hello Alejandro,
>     
>     Could you please help push the change to trunk?
>     
>     Thank you.
>     
>     Di

Pushed to trunk,
commit 2b3401640473c0b0988f52dff5683820107d3ecc

Thanks


- Alejandro


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38761/#review100617
-----------------------------------------------------------




Re: Review Request 38761: AMBARI-13233: Error message for ambari agent install failure when the ping port is taken by old agent process should state port and old process

Posted by Alejandro Fernandez <af...@hortonworks.com>.
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/38761/#review100617
-----------------------------------------------------------

Ship It!

- Alejandro Fernandez

