Posted to dev@cloudstack.apache.org by Chiradeep Vittal <Ch...@citrix.com> on 2014/03/03 08:46:20 UTC

Re: [Proposal] Virtual Router service failure alerting

Hi,
What happens if there is an extended disconnect from MS? What is the
rollover strategy for the new log files? What is the format of the alarm
sent to the MS? How does the MS deduplicate alarms from the VR? How does
the admin see these?

On 2/17/14, 1:57 AM, "Harikrishna Patnala"
<ha...@citrix.com> wrote:

>Hi Sheng,
>Thank you for the corrections and suggestions. My comments inline.
>
>On 14-Feb-2014, at 1:53 am, Sheng Yang
><sh...@yasker.org> wrote:
>
>Hi Hari,
>
>+1. Getting log from VR is a long awaited feature.
>
>One correction: VR does have a mgmt network nic, but it doesn't have direct
>communication with the mgmt server (in most cases, except VMware). So polling
>is still needed.
>[Hari] I did not see the management nic on virtual router in case of
>Xenserver. Anyway we need polling :)
>
>
>And another comment: you can reuse CheckRouterTask() for the purpose. It
>already has been used for s2s vpn connection status update and redundant
>router checking, and "router.check.interval" would be used as interval for
>checking. You can improve and reuse that rather than introduce another
>polling thread for VR.
>[Hari] Initially I also thought of piggybacking on the existing requests
>to the VR, but I proposed this so as not to change the semantics of
>CheckRouterTask by adding the alert-fetching request to it.
>If this is fine, I'll reuse CheckRouterTask.
>
>
>How do you define a "new alert"? Would the file be deleted/archived
>after the poll? Or are you simply looking after a certain point? I guess a
>diff works better than searching by timestamp. More details on the
>implementation would be helpful.
>[Hari] Yes Sheng, I agree that a diff would be better, but there is a case
>where, after taking the diff, the reply from the router to the MS may fail
>because of a network failure or some other reason. In this case we cannot
>get the diff back, so a timestamp is needed here. So it would be
>better to use both diff and timestamp. If the diff fails to retrieve alerts,
>we also check using the timestamp.
>I will update the FS with some implementation details (using both
>timestamps and diff); CWIKI is down now.
>
>
>--Sheng
>
>
>On Thu, Feb 13, 2014 at 1:56 AM, Harikrishna Patnala
><harikrishna.patnala@citrix.com> wrote:
>
>Hi,
>
>Currently in CS we can monitor the services running on the Virtual Router
>and ensure they stay running through the lifetime of the VR. Upon failure
>of any service in the VR, the monitoring service logs alerts in the VR logs.
>These alerts need to be pushed to the management server to notify the admin.
>
>For this I'd like to introduce the feature Virtual Router service failure
>alerting.
>
>[1] https://issues.apache.org/jira/browse/CLOUDSTACK-6090
>[2] https://cwiki.apache.org/confluence/display/CLOUDSTACK/Virtual+Router+Service+Failure+Alerting
>
>
>Comments/feedback are welcome
>
>Thank you,
>Harikrishna
>
>


Re: [Proposal] Virtual Router service failure alerting

Posted by Harikrishna Patnala <ha...@citrix.com>.
Hi,
My comments inline.

On 03-Mar-2014, at 1:16 pm, Chiradeep Vittal <Ch...@citrix.com> wrote:

>Hi,
>What happens if there is an extended disconnect from MS? What is the
>rollover strategy for the new log files?
In case of an extended disconnect, we record the last received timestamp in our DB. When the MS sends a request, it includes that timestamp, and the VR returns only the alerts logged after it.
The rollover strategy is to keep logs with a max size of 10 MB and a backup count of 5.
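
A minimal sketch of that rollover policy, assuming the VR-side monitor writes alerts through Python's standard logging module. The logger name and the file path passed in are hypothetical and are not taken from the proposal; only the 10 MB size and the backup count of 5 come from the answer above.

```python
import logging
from logging.handlers import RotatingFileHandler

MAX_BYTES = 10 * 1024 * 1024  # 10 MB per log file, as stated above
BACKUP_COUNT = 5              # keep 5 rotated backups, as stated above


def make_alert_logger(path):
    """Return a logger that rotates `path` at 10 MB and keeps 5 backups."""
    logger = logging.getLogger("vr.alerts")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    handler = RotatingFileHandler(path, maxBytes=MAX_BYTES,
                                  backupCount=BACKUP_COUNT)
    # Write the alert text as-is; the alert line carries its own timestamp.
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger
```

With these settings the oldest backup is dropped on each rollover, so at most about 60 MB of alerts (the live file plus 5 backups) would be retained on the VR.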

>What is the format of the alarm
>sent to the MS?
TimeStamp: RouterName: ServiceName: Message:
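
As an illustration only, here is one way such a line could be written and read back. It assumes the four fields are values separated by ": " and that the timestamp is an integer number of epoch seconds; neither detail is confirmed in this thread, and the `Alert` type and both function names are hypothetical.

```python
from collections import namedtuple

# Hypothetical record type mirroring the four fields named above.
Alert = namedtuple("Alert", "timestamp router service message")


def format_alert(alert):
    """Render an alert as 'TimeStamp: RouterName: ServiceName: Message'."""
    return "%d: %s: %s: %s" % (alert.timestamp, alert.router,
                               alert.service, alert.message)


def parse_alert(line):
    """Parse one alert line; the message itself may contain ': '."""
    # Split on the first three separators only, so the message stays whole.
    ts, router, service, message = line.rstrip("\n").split(": ", 3)
    return Alert(int(ts), router, service, message)
```

Limiting the split to three separators keeps a message such as "service down: restarted" intact.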

>How does the MS deduplicate alarms from the VR? How does
>the admin see these?
Since we fetch alerts based on the last received alert timestamp, we do not get duplicates. Upon receiving alerts from the VR, we publish them.
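
The timestamp-based filtering described here can be sketched roughly as follows. This is an assumption-laden illustration, not the actual implementation: alerts are modelled as (timestamp, text) tuples, and `new_alerts` is a hypothetical helper name.

```python
def new_alerts(alerts, last_seen):
    """Return (fresh, new_last_seen) for alerts newer than `last_seen`.

    `alerts` is an iterable of (timestamp, text) tuples as read from the
    VR log; `last_seen` is the timestamp the MS recorded after the
    previous successful poll.
    """
    fresh = [a for a in alerts if a[0] > last_seen]
    if fresh:
        # Advance the marker only when something new was actually received.
        last_seen = max(a[0] for a in fresh)
    return fresh, last_seen
```

Because the MS advances the stored timestamp only after it has received a reply, a reply lost to a network failure would simply be returned again on the next poll. One caveat of the strict comparison in this sketch: an alert logged with exactly the same timestamp as the last one received would be skipped.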


Thanks
Harikrishna

