Posted to dev@trafficserver.apache.org by 永豪 <yo...@taobao.com> on 2013/08/22 02:51:15 UTC

TS-2143 and 'it just works'

I haven't gone through much of the code, but TS-2143 makes me sad. I love ATS because, in most cases, we provide a working solution for your requirements. When you grow to big volumes, you find that ATS still works, and you may discover more features than you imagined. It is a good user experience compared to some other systems, when you find out that the system builder is really 'user oriented'.

While we are still on the way to a code revolution, I really don't want us to break that promise and turn ATS into yet more low-quality code just because we are free as in beer.

Most of our active committers work for big companies with tons of good servers, but we get newbies every day. They need a good start from any point, an easy start. ATS is a big monster compared to the others; why can't we make it nice for newcomers?

Things I'd like to keep:
1, features should be outlined, and any revolution should stay user friendly
2, provide a basic system where 'it just works'
3, user interface changes should get more review before we release them to the public

I need your input.

Thanks

08:11 <@ming_zym> on: TS-2143, I think that is a big break in the alarm feature, and the proxy/example_alarm_bin.sh is totally unusable compared to the original
08:12 <@ming_zym> I think you cannot send out any message using it
08:12 <@ming_zym> right?
08:13 <@jpeach> ming_zym: I think swoop is expecting you to use your own script that has an email address embedded in it
08:14 <@ming_zym> but why is that a step forward for our end users?
08:17 <@jpeach> I think that anyone using this would write their own script anyway
08:17 <@ming_zym> maybe I can find someone to work on the scripts, but the original provided 'it just works'
08:18 <@jpeach> you were using example_alarm_bin.sh script?
08:19 <@ming_zym> I don't think every newcomer will hack on the core TS system. that is why we provide 'it just works' in most cases
08:21 <@jpeach> I thought that it would be unlikely for that example script to work for anyone
08:21 <@ming_zym> I am really sad at every step we move away from newbies
08:21 <@jpeach> production environments tend to be quite different
08:21 <@ming_zym> the original just works
08:21 <@jpeach> we should discuss on dev@
08:21 <@ming_zym> at least on Linux
08:22 <@jpeach> if this config is useful, let's keep it
08:22 <@jpeach> I don't have a strong opinion :)
08:22 <@ming_zym> let us move to dev@
08:26 <@jpeach> I'm not sure that there is any site that could use that email script without changes
08:26 <@ming_zym> but we use it because it provides a 'just working' example there.
08:26 <@ming_zym> I don't think every user will be hosting thousands of servers like us
08:27 <@jpeach> should the "just working" example send email? or do something else?
08:27 <@ming_zym> why should they build a complex system to collect those messages?
08:27 <@jpeach> yes good point

Re: TS-2143 and 'it just works'

Posted by James Peach <jp...@apache.org>.
On Aug 23, 2013, at 8:22 AM, Leif Hedstrom <zw...@apache.org> wrote:

> On Aug 22, 2013, at 6:42 PM, Yongming Zhao <mi...@gmail.com> wrote:
> 
>> the result of the change will be:
>> 1, the email alarm may able to build, but we removed the origin records.config
>> 2, the custom alarm script still need your own work
>> 3, documents still lack, and more documents to be add on the change
> 
> I'm reverting the commit for master and 4.0.x branch.

Blech. This can be fixed in a compatible way and I'll aim to do it for 4.1. That said, I don't think that we are living in a golden age of Traffic Server alarms. No-one who is running distribution versions of ATS is using alarms because the documentation and installed files are not sufficient to do so. The only people using this are advanced users who have the ability to easily adapt to this change.

For 4.1, I will implement my previous proposal, and I will automatically convert legacy configurations to the new configuration (or leave the legacy junk in place). I would like alarms to be useful to the masses (i.e. me); I would like to receive an alarm with a stack trace when ATS crashes.

J

Re: TS-2143 and 'it just works'

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 22, 2013, at 6:42 PM, Yongming Zhao <mi...@gmail.com> wrote:

> the result of the change will be:
> 1, the email alarm may able to build, but we removed the origin records.config
> 2, the custom alarm script still need your own work
> 3, documents still lack, and more documents to be add on the change

I'm reverting the commit for master and 4.0.x branch.

-- leif


Re: TS-2143 and 'it just works'

Posted by Yongming Zhao <mi...@gmail.com>.
The result of the change will be:
1, it may still be possible to build the email alarm, but we removed the original records.config setting
2, the custom alarm script still needs your own work
3, documentation is still lacking, and more documentation has to be added for the change

Back to the real problem here: what do we want to solve in TS-2143? From the feature point of view, is it a good idea to make a big change to the UI (a config file change) when you cannot offer a better one?
I don't think we are improving the alarm feature.

In our new release process:
• A best-effort to keep compatibility should be made; don't break compatibility for no good reason.
• Commits that break compatibility are not allowed on master. Such changes get committed instead on a next-incompatible-rev branch (call it 5.0.x).

I think there is NO GOOD REASON for breaking the alarm feature if it is not an improvement to the feature.

I think we can find a compatible way to clean up the code without breaking the feature.


On 2013-8-23, at 2:36 AM, James Peach <jp...@apache.org> wrote:

> On Aug 22, 2013, at 9:35 AM, Leif Hedstrom <zw...@apache.org> wrote:
> 
>> On Aug 22, 2013, at 10:11 AM, James Peach <jp...@apache.org> wrote:
>> 
>>> On Aug 21, 2013, at 5:51 PM, 永豪 <yo...@taobao.com> wrote:
>>> 
>>>> things I'd like to keep:
>>>> 1, feature should be outlined, and should keep revolution in a user friendly way
>>>> 2, provide basic system 'it just work'
>>>> 3, user interface changing should get more review before we can release into public
>>> 
>>> Yes, I strongly agree with all 3 of these points, though I don't think this particular commit is too problematic, particularly since we never actually installed the example_alarm_bin.sh script :)
>>> 
>>> I looked at the alarm documentation and there's a few things that we can improve:
>>> 
>>> 	- the docs still reference example_alarm_bin.sh though it no longer exists
>>> 	- the docs reference proxy.config.alarm_email, though it's no longer clear what this is for
>> 
>> Yeah, proxy.config.alarm_email is no longer used, and unless we back out this commit, we should remove it. 
>> 
>> So, I'm asking now for consensus, with two options:
>> 
>> 1) We restore the old behavior, which passed the email address on the command line to the alarm script. I'd still argue that this old behavior simply did not "just work", it basically "just failed miserably".

+1, it just works from the ATS point of view. If that may fill the filesystem with emails, that is a problem with the email system.

I think I sent a mail asking you guys about including example_alarm_bin.sh in 'make install' :(
http://mail-archives.apache.org/mod_mbox/trafficserver-dev/201204.mbox/%3C1334756466.21031.5.camel@zym6400%3E

>> 2) We keep the commit, but also remove proxy.config.alarm_email (cause it's unused right now).
> 
> +1
> 
>> The improvements James points out are great, lets file an RFE on those. For example, there's nothing right now preventing someone from contributing a much better alarms.sh script. Or several of them, for different use cases, and something that actually does work.
>> 
>> Please voice your opinions asap, I'd like to get this resolved by tomorrow (Friday) morning.
> 
> Let's not revert, let's improve.
> 
> I will commit a change to add a new configuration option proxy.config.alarm.arguments that contains a string of arguments that get passed to the the alarm script. The final invoked command will be:
> 
>    "%s/%s %s %s %s", proxy.config.alarm.abs_path, proxy.config.alarm.bin, proxy.config.alarm.arguments, description, alarm
> 
> Then I will commit a variation of the original emailing script as a default. IMHO this supports the original use case, actually works out of the box for clean installations, and makes sense for sites that don't want this to be emailed.
> 
> J


Re: TS-2143 and 'it just works'

Posted by Igor Galić <i....@brainsware.org>.
+1 
----- Original Message -----
> On Aug 22, 2013, at 9:35 AM, Leif Hedstrom <zw...@apache.org> wrote:
> 
> > On Aug 22, 2013, at 10:11 AM, James Peach <jp...@apache.org> wrote:
> > 
> >> On Aug 21, 2013, at 5:51 PM, 永豪 <yo...@taobao.com> wrote:
> >> 
> >>> things I'd like to keep:
> >>> 1, feature should be outlined, and should keep revolution in a user
> >>> friendly way
> >>> 2, provide basic system 'it just work'
> >>> 3, user interface changing should get more review before we can release
> >>> into public
> >> 
> >> Yes, I strongly agree with all 3 of these points, though I don't think
> >> this particular commit is too problematic, particularly since we never
> >> actually installed the example_alarm_bin.sh script :)
> >> 
> >> I looked at the alarm documentation and there's a few things that we can
> >> improve:
> >> 
> >> 	- the docs still reference example_alarm_bin.sh though it no longer
> >> 	exists
> >> 	- the docs reference proxy.config.alarm_email, though it's no longer
> >> 	clear what this is for
> > 
> > Yeah, proxy.config.alarm_email is no longer used, and unless we back out
> > this commit, we should remove it.
> > 
> > So, I'm asking now for consensus, with two options:
> > 
> > 1) We restore the old behavior, which passed the email address on the
> > command line to the alarm script. I'd still argue that this old behavior
> > simply did not "just work", it basically "just failed miserably".
> > 
> > 2) We keep the commit, but also remove proxy.config.alarm_email (cause it's
> > unused right now).
> 
> +1
> 
> > The improvements James points out are great, lets file an RFE on those. For
> > example, there's nothing right now preventing someone from contributing a
> > much better alarms.sh script. Or several of them, for different use cases,
> > and something that actually does work.
> > 
> > Please voice your opinions asap, I'd like to get this resolved by tomorrow
> > (Friday) morning.
> 
> Let's not revert, let's improve.
> 
> I will commit a change to add a new configuration option
> proxy.config.alarm.arguments that contains a string of arguments that get
> passed to the the alarm script. The final invoked command will be:
> 
>     "%s/%s %s %s %s", proxy.config.alarm.abs_path, proxy.config.alarm.bin,
>     proxy.config.alarm.arguments, description, alarm
> 
> Then I will commit a variation of the original emailing script as a default.
> IMHO this supports the original use case, actually works out of the box for
> clean installations, and makes sense for sites that don't want this to be
> emailed.
> 
> J

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE


Re: TS-2143 and 'it just works'

Posted by James Peach <jp...@apache.org>.
On Aug 22, 2013, at 9:35 AM, Leif Hedstrom <zw...@apache.org> wrote:

> On Aug 22, 2013, at 10:11 AM, James Peach <jp...@apache.org> wrote:
> 
>> On Aug 21, 2013, at 5:51 PM, 永豪 <yo...@taobao.com> wrote:
>> 
>>> things I'd like to keep:
>>> 1, feature should be outlined, and should keep revolution in a user friendly way
>>> 2, provide basic system 'it just work'
>>> 3, user interface changing should get more review before we can release into public
>> 
>> Yes, I strongly agree with all 3 of these points, though I don't think this particular commit is too problematic, particularly since we never actually installed the example_alarm_bin.sh script :)
>> 
>> I looked at the alarm documentation and there's a few things that we can improve:
>> 
>> 	- the docs still reference example_alarm_bin.sh though it no longer exists
>> 	- the docs reference proxy.config.alarm_email, though it's no longer clear what this is for
> 
> Yeah, proxy.config.alarm_email is no longer used, and unless we back out this commit, we should remove it. 
> 
> So, I'm asking now for consensus, with two options:
> 
> 1) We restore the old behavior, which passed the email address on the command line to the alarm script. I'd still argue that this old behavior simply did not "just work", it basically "just failed miserably".
> 
> 2) We keep the commit, but also remove proxy.config.alarm_email (cause it's unused right now).

+1

> The improvements James points out are great, lets file an RFE on those. For example, there's nothing right now preventing someone from contributing a much better alarms.sh script. Or several of them, for different use cases, and something that actually does work.
> 
> Please voice your opinions asap, I'd like to get this resolved by tomorrow (Friday) morning.

Let's not revert, let's improve.

I will commit a change to add a new configuration option proxy.config.alarm.arguments that contains a string of arguments that get passed to the alarm script. The final invoked command will be:

    "%s/%s %s %s %s", proxy.config.alarm.abs_path, proxy.config.alarm.bin, proxy.config.alarm.arguments, description, alarm

Then I will commit a variation of the original emailing script as a default. IMHO this supports the original use case, actually works out of the box for clean installations, and makes sense for sites that don't want this to be emailed.
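
As a rough illustration of the format string above, here is a minimal shell sketch of how the final command line would be put together. The function name build_alarm_command and all sample values are made up for this example; the proposal does not say whether or how the individual arguments get quoted, so the sketch simply joins them with spaces.

```shell
#!/bin/sh
# Sketch only: mimics the "%s/%s %s %s %s" format string from the proposal.
# build_alarm_command and every value passed to it are hypothetical.

build_alarm_command() {
  abs_path=$1     # proxy.config.alarm.abs_path
  bin=$2          # proxy.config.alarm.bin
  arguments=$3    # proxy.config.alarm.arguments (the proposed option)
  description=$4  # alarm description text
  alarm=$5        # alarm identifier
  printf '%s/%s %s %s %s\n' "$abs_path" "$bin" "$arguments" "$description" "$alarm"
}

# Print the command line that would result from some sample settings.
build_alarm_command /opt/ts/bin alarm.sh "--to ops@example.com" \
  "Traffic Server process was reset." 1
```

With an empty arguments setting the script would still see the description and the alarm as its trailing arguments, which is what keeps scripts that ignore the new option working.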

J

Re: TS-2143 and 'it just works'

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 22, 2013, at 10:11 AM, James Peach <jp...@apache.org> wrote:

> On Aug 21, 2013, at 5:51 PM, 永豪 <yo...@taobao.com> wrote:
> 
>> things I'd like to keep:
>> 1, feature should be outlined, and should keep revolution in a user friendly way
>> 2, provide basic system 'it just work'
>> 3, user interface changing should get more review before we can release into public
> 
> Yes, I strongly agree with all 3 of these points, though I don't think this particular commit is too problematic, particularly since we never actually installed the example_alarm_bin.sh script :)
> 
> I looked at the alarm documentation and there's a few things that we can improve:
> 
> 	- the docs still reference example_alarm_bin.sh though it no longer exists
> 	- the docs reference proxy.config.alarm_email, though it's no longer clear what this is for

Yeah, proxy.config.alarm_email is no longer used, and unless we back out this commit, we should remove it. 

So, I'm asking now for consensus, with two options:

1) We restore the old behavior, which passed the email address on the command line to the alarm script. I'd still argue that this old behavior simply did not "just work", it basically "just failed miserably".

2) We keep the commit, but also remove proxy.config.alarm_email (cause it's unused right now).

The improvements James points out are great, let's file an RFE on those. For example, there's nothing right now preventing someone from contributing a much better alarms.sh script. Or several of them, for different use cases, and something that actually does work.

Please voice your opinions asap, I'd like to get this resolved by tomorrow (Friday) morning.

-- leif


Re: TS-2143 and 'it just works'

Posted by James Peach <jp...@apache.org>.
On Aug 21, 2013, at 5:51 PM, 永豪 <yo...@taobao.com> wrote:

> I haven't go through many codes indeed, but TS-2143 make me sad, I love ATS because in most case, we can provide a working solution for your requirement. when you grow up, with big volumes, you will find out that ATS still work, and you may discover more features than your imagine. it is a good user experience compare to some other system, when you find out that the system builder is really 'user oriented'.
> 
> while we are still on the way to codes revolution, I really don't want we break that promise, turns ATS into another low quality codes just because we are free as in beer.
> 
> most of our active committers are working for big company, with tons of good servers, but we have newbies everyday, they need a good start from any point, one easy start. ATS is a big monster compare to the others, why can't we make it nice for the freshmen?  
> 
> things I'd like to keep:
> 1, feature should be outlined, and should keep revolution in a user friendly way
> 2, provide basic system 'it just work'
> 3, user interface changing should get more review before we can release into public

Yes, I strongly agree with all 3 of these points, though I don't think this particular commit is too problematic, particularly since we never actually installed the example_alarm_bin.sh script :)

I looked at the alarm documentation and there's a few things that we can improve:

	- the docs still reference example_alarm_bin.sh though it no longer exists
	- the docs reference proxy.config.alarm_email, though it's no longer clear what this is for
	- there's no documentation of how to configure ATS to invoke your custom alarm script
	- we don't provide any real-world sample scripts or real advice on how to write one
	- we don't provide any documentation on what an alarm is or what kind of events cause alarms

All these issues can (and should) be fixed. There's quite a bit of alarms material in the original Inktomi documentation that is not present in our current docs.

J



Re: TS-2143 and 'it just works'

Posted by Yongming Zhao <mi...@gmail.com>.
Yeah, we have gone in some wrong directions here. Let us try to get back to the real problem:

From ATS you get an 'alarm' feature, and you get this in the documentation:

http://trafficserver.apache.org/docs/trunk/admin/configuration-files/records.config#proxy.config.alarm_email
proxy.config.alarm_email
STRING
Default: (none)
Reloadable.
The email address to which Traffic Server sends alarm messages. During a custom Traffic Server installation, you can specify the email address; otherwise, Traffic Server uses the Traffic Server user account name as the default value for this variable.

and


Alarm Configuration

proxy.config.alarm.bin
STRING
Default: example_alarm_bin.sh
Name of the script file that can execute certain actions when an alarm is signaled. The default file is a sample script named example_alarm_bin.sh located in the bin directory. You must edit the script to suit your needs.
proxy.config.alarm.abs_path
STRING
Default: NULL
The full path to the script file that sends email to alert someone about Traffic Server problems.


And for sure, you can use that documentation to collect alarms, by mail or by a custom script (which still lacks official documentation).


Now, let me ask: how can we deliver the 'Alarm' feature to our users?

In our original situation, we had a working mail solution for sending alarms, for the newbies, and we had a way to customize proxy.config.alarm.bin to your own needs, as documented at https://blog.zymlinux.net/index.php/archives/378 .  What do we do with the whole change, then?


Let us get back to the change we made. We may use the following steps to get email alarms:
1, set up a working mail system to receive mail, for example an alias and postfix setup.
2, copy example_alarm_bin.sh into the bin directory and make it executable (mostly done by the packaging system)

And here is my system and the test; you will see why I claim that we broke it:

zymtest1 trafficserver # grep alarm /etc/trafficserver/records.config
CONFIG proxy.config.alarm_email STRING nobody
   # execute alarm as "<abs_path>/<bin> "<MSG_STRING_FROM_PROXY>""
CONFIG proxy.config.alarm.bin STRING example_alarm_bin.sh
CONFIG proxy.config.alarm.abs_path STRING NULL
zymtest1 trafficserver # tail /etc/mail/aliases
noc:                root
security:           root
usenet:             root
uucp:               root
webmaster:          root
www:                webmaster

# trap decode to catch security attacks
# decode:           /dev/null
root: ming.zym@gmail.com
zymtest1 trafficserver # which sendmail
/usr/sbin/sendmail
zymtest1 trafficserver # ps aux | grep traffic
root      1448  0.0  0.0  59624  3116 ?        Ss   22:54   0:00 /usr/bin/traffic_cop
nobody    1454  0.1  0.4 468044 16500 ?        Sl   22:54   0:01 /usr/bin/traffic_manager
nobody    1464 12.9  4.5 1368068 184056 ?      Sl   22:54   2:08 /usr/bin/traffic_server -M --httpport 8080:fd=7
root      1665  0.0  0.0  17764   936 pts/0    S+   23:10   0:00 grep --colour=auto traffic
zymtest1 trafficserver # kill -9 1464
zymtest1 trafficserver # ps aux | grep traffic
root      1448  0.0  0.0  59624  3116 ?        Ss   22:54   0:00 /usr/bin/traffic_cop
nobody    1454  0.1  0.4 467888 16524 ?        Sl   22:54   0:01 /usr/bin/traffic_manager
nobody    1682 18.0  4.4 841524 180520 ?       Sl   23:11   0:00 /usr/bin/traffic_server -M --httpport 8080:fd=7
root      1726  0.0  0.0  17764   936 pts/0    S+   23:11   0:00 grep --colour=auto traffic
zymtest1 trafficserver # tail /var/log/trafficserver/traffic.out
[Aug 22 23:11:04.805] Manager {0x7ff74a608740} NOTE: [Alarms::signalAlarm] Server Process born
[Aug 22 23:11:05.825] {0x2aaaaab18bc0} STATUS: opened /var/log/trafficserver/diags.log
[Aug 22 23:11:05.825] {0x2aaaaab18bc0} NOTE: updated diags config
[Aug 22 23:11:05.831] Server {0x2aaaaab18bc0} NOTE: cache clustering disabled
[Aug 22 23:11:05.833] Server {0x2aaaaab18bc0} WARNING: no ssd disks specified in proxy.config.cache.ssd.storage:
[Aug 22 23:11:05.939] Server {0x2aaaaab18bc0} NOTE: cache clustering disabled
[Aug 22 23:11:05.939] Server {0x2aaaaab18bc0} WARNING: unable to open cache disk(s): SSD Cache Disabled
[Aug 22 23:11:05.949] Server {0x2aaaaab18bc0} NOTE: logging initialized[15], logging_mode = 3
[Aug 22 23:11:05.972] Server {0x2aaaaab18bc0} NOTE: traffic server running
[Aug 22 23:11:05.999] Server {0x2aaab4f0e700} NOTE: cache enabled
zymtest1 trafficserver # tail /var/log/messages
Aug 22 23:11:03 zymtest1 postfix/cleanup[1676]: EE0EB140514: message-id=<20...@zymtest1.corp.aliyk.com>
Aug 22 23:11:03 zymtest1 postfix/local[1679]: C2738141499: to=<no...@zymtest1.corp.aliyk.com>, orig_to=<nobody>, relay=local, delay=0.21, delays=0.03/0.14/0/0.04, dsn=2.0.0, status=sent (forwarded as EE0EB140514)
Aug 22 23:11:03 zymtest1 postfix/qmgr[3968]: EE0EB140514: from=<ro...@zymtest1.corp.aliyk.com>, size=601, nrcpt=1 (queue active)
Aug 22 23:11:03 zymtest1 postfix/qmgr[3968]: C2738141499: removed
Aug 22 23:11:04 zymtest1 postfix/smtp[1680]: connect to gmail-smtp-in.l.google.com[2a00:1450:4010:c04::1b]:25: Network is unreachable
Aug 22 23:11:05 zymtest1 traffic_server[1682]: NOTE: --- Server Starting ---
Aug 22 23:11:05 zymtest1 traffic_server[1682]: NOTE: Server Version: Apache Traffic Server - traffic_server - 3.2.0 - (build # 6815 on Jul  8 2013 at 15:06:15)
Aug 22 23:11:05 zymtest1 traffic_server[1682]: {0x2aaaaab18bc0} STATUS: opened /var/log/trafficserver/diags.log
Aug 22 23:11:09 zymtest1 postfix/smtp[1680]: EE0EB140514: to=<mi...@gmail.com>, orig_to=<nobody>, relay=gmail-smtp-in.l.google.com[74.125.129.27]:25, delay=5.5, delays=0.02/0.01/1.4/4.1, dsn=2.0.0, status=sent (250 2.0.0 OK 1377184269 yk3si10312244pac.41 - gsmtp)
Aug 22 23:11:09 zymtest1 postfix/qmgr[3968]: EE0EB140514: removed

And I get a mail with the subject:
zymtest1.corp.aliyk.com [TrafficManager] Traffic Server process was reset.


But with the git master code, I just get something in traffic.out.
With the old script that can send mail:
[TrafficServer] using root directory '/usr'
Usage: example_alarm_bin.sh <message> [<email_from_name> <email_from_addr> <email_to_addr>]
Usage: example_alarm_bin.sh <message> [<email_from_name> <email_from_addr> <email_to_addr>]

With the new one:
[TrafficServer] using root directory '/usr'
bin/example_alarm_bin.sh: desc=[TrafficManager] Traffic Server process was reset. alarm=1
[TrafficServer] using root directory '/usr'


You may ask: why do you care about email alarms?
1, because they are useful when you have a few boxes, not tons of boxes, especially when you are not a full-time ATS admin
2, we provide it as a feature

This is a really long mail; thanks for your patience.

On 2013-8-22, at 10:27 PM, Igor Galić <i....@brainsware.org> wrote:

> 
> 
> ----- Original Message -----
>> Thursday, August 22, 2013, 2:16:13 AM, you wrote:
>>> btw, the only reason I see right now for having an option for passing an
>>> additional, configurable parameter to a script, is to use it for sending
>>> "instance name" in a multi-instance setup.
>>> Question does anyone in our community run multiple instances of ATS on the
>>> the same node?
>> 
>> I have to disagree here - if you run multiple instances of ATS on different
>> machines it is *very* useful to be able to use the same script everywhere
>> and have it vary based on instance data. Having to maintain per machine
>> customized scripts is a real pain.
> 
>> _o
> 
> What, exactly, is the difference between having a configuration file (that
> you create and roll out with your configuration management) for Traffic
> Server, and an configuration file (that you create and roll out with your
> configuration management) for a (monitoring) script?
> 
> scripts, just because they are scripts, don't have to munge data and action.
> They can be configurable just like everything else. But of course you'd
> have to write them as such.
> 
> Anyway. This is starting to dissolve into a bikeshed issue, so I'll excuse
> myself from this discussion.
> 
> -- i
> Igor Galić
> 
> Tel: +43 (0) 664 886 22 883
> Mail: i.galic@brainsware.org
> URL: http://brainsware.org/
> GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE
> 


Re: TS-2143 and 'it just works'

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> Thursday, August 22, 2013, 2:16:13 AM, you wrote:
> > btw, the only reason I see right now for having an option for passing an
> > additional, configurable parameter to a script, is to use it for sending
> > "instance name" in a multi-instance setup.
> > Question does anyone in our community run multiple instances of ATS on the
> > the same node?
> 
> I have to disagree here - if you run multiple instances of ATS on different
> machines it is *very* useful to be able to use the same script everywhere
> and have it vary based on instance data. Having to maintain per machine
> customized scripts is a real pain.

>_o

What, exactly, is the difference between having a configuration file (that
you create and roll out with your configuration management) for Traffic
Server, and an configuration file (that you create and roll out with your
configuration management) for a (monitoring) script?

scripts, just because they are scripts, don't have to munge data and action.
They can be configurable just like everything else. But of course you'd
have to write them as such.

Anyway. This is starting to dissolve into a bikeshed issue, so I'll excuse
myself from this discussion.

-- i
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE


Re: TS-2143 and 'it just works'

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 22, 2013, at 7:31 AM, Alan M. Carroll <am...@network-geographics.com> wrote:

> Thursday, August 22, 2013, 2:16:13 AM, you wrote:
>> btw, the only reason I see right now for having an option for passing an
>> additional, configurable parameter to a script, is to use it for sending
>> "instance name" in a multi-instance setup.
>> Question does anyone in our community run multiple instances of ATS on the
>> the same node?
> 
> I have to disagree here - if you run multiple instances of ATS on different machines it is *very* useful to be able to use the same script everywhere and have it vary based on instance data. Having to maintain per machine customized scripts is a real pain.
> 


The old script provided no such details, did it? The only records.config configurable portion was the Web UI admin user  ("admin"), which is not even a valid email address on almost every system. That configuration is in fact a legacy of a feature that was long ago removed.

Having the installed script include a $(hostname) in whatever it sends to the monitoring service (be it email, SNMP, whatever), seems completely trivial. The *only* thing that has changed really is that the script doesn't receive the "admin" string as a command line argument.
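
To make the $(hostname) point concrete, a minimal sketch of the message-building part of such a script could look like this. The function name format_alarm and the optional second parameter are made up for the example; it assumes the script receives the alarm description as its first argument.

```shell
#!/bin/sh
# Sketch only: prefix an alarm message with the local host name so the same
# script can be deployed unchanged on every machine.
# format_alarm is a hypothetical helper, not part of ATS.

format_alarm() {
  # $1 = alarm description handed to the script (assumed)
  # $2 = host name override; defaults to the output of hostname(1)
  host=${2:-$(hostname)}
  printf '%s %s\n' "$host" "$1"
}

# The result can then be handed to email, SNMP, or whatever is in use.
format_alarm "[TrafficManager] Traffic Server process was reset." zymtest1
```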

I'm with Igor, this is total bike shedding. This damn script is an *example*, it never worked for 99% of the users, and that one user it did work for can contact me personally, and I'll fix it for her, for free, when she upgrades to v4.0.0.

Ciao,

-- leif

Re: TS-2143 and 'it just works'

Posted by "Alan M. Carroll" <am...@network-geographics.com>.
Thursday, August 22, 2013, 2:16:13 AM, you wrote:
> btw, the only reason I see right now for having an option for passing an
> additional, configurable parameter to a script, is to use it for sending
> "instance name" in a multi-instance setup.
> Question does anyone in our community run multiple instances of ATS on the
> the same node?

I have to disagree here - if you run multiple instances of ATS on different machines it is *very* useful to be able to use the same script everywhere and have it vary based on instance data. Having to maintain per machine customized scripts is a real pain.


Re: TS-2143 and 'it just works'

Posted by Igor Galić <i....@brainsware.org>.

----- Original Message -----
> On Aug 21, 2013, at 6:51 PM, 永豪 <yo...@taobao.com> wrote:
> 
> > I haven't go through many codes indeed, but TS-2143 make me sad, I love ATS
> > because in most case, we can provide a working solution for your
> > requirement. when you grow up, with big volumes, you will find out that
> > ATS still work, and you may discover more features than your imagine. it
> > is a good user experience compare to some other system, when you find out
> > that the system builder is really 'user oriented'.
> 
> 
> Hmmmm, this seems incredibly harsh. Attached is an alarms_example.sh script,
> that works exactly as the code used to do, except, it is less confusing. If
> you like, we can put that back in there.
> 
> Fwiw, I think this is much worse for the "newbies". If you don't know what
> you are doing, you will be sending emails from "nobody" to "admin", and you
> have no idea why. Best case scenario, the email vanish. Worst case scenario,
> they will eventually fill up /var/log/mail.

+1

I was the one driving that Alarms change, here's my take on it:

the interface was confusing and diffuse. You have configuration on one side
but the actual execution was out-sourced to a script. I found this dissatisfying
on one hand, and limiting. It wasn't left to anyone's imagination what the
script should do, it was set in records.config and our documentation, it should
email alarms.

Now, we've removed the email config, we've removed allocation and deallocation
of memory, which I would assume in a scenario where things are worth *alarming*
not to be helpful, and we've simplified the example. We should update the
documentation saying how you can do whatever you want with this script now.

You can email, yes. Or you can send a message to an SMS gateway. Or send an
snmp trap. Or put the message on a queue, where it will be picked up, etc..

That script can be as simple or complex as necessary, and even more complex
than a single parameter would allow. But its configuration should be drawn
from your infrastructure, not from ATS.
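
A minimal sketch of that idea, assuming the script takes the alarm message as its first argument and that the site's configuration management exports two environment variables. ALARM_SINK, ALARM_TO and dispatch_alarm are names invented for this example; none of them are ATS settings.

```shell
#!/bin/sh
# Sketch only: pick the delivery channel from site configuration instead of
# records.config. ALARM_SINK, ALARM_TO and dispatch_alarm are hypothetical.

dispatch_alarm() {
  msg=$1
  case "${ALARM_SINK:-stdout}" in
    email)
      # Hand a minimal message to the local MTA.
      to=${ALARM_TO:-root}
      printf 'To: %s\nSubject: %s\n\n%s\n' "$to" "$msg" "$(date)" |
        /usr/sbin/sendmail "$to" ;;
    syslog)
      # Let the system logger forward it wherever the site wants.
      logger -t trafficserver-alarm "$msg" ;;
    stdout)
      # Default: just print, so the message ends up in traffic.out.
      printf 'ALARM: %s\n' "$msg" ;;
    *)
      echo "unknown ALARM_SINK: $ALARM_SINK" >&2
      return 1 ;;
  esac
}

dispatch_alarm "Traffic Server process was reset."
```

An SMS gateway or a message queue would just be another branch of the case statement; the ATS side stays the same.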



btw, the only reason I see right now for having an option for passing an
additional, configurable parameter to a script, is to use it for sending
"instance name" in a multi-instance setup.
Question: does anyone in our community run multiple instances of ATS on the
same node?

> I do agree with the review concerns, so if the consensus is to roll this
> back, lets do it fast.
> 
> -- leif
> 
> ostype=`(uname -s) 2>/dev/null`
> if [ "$ostype" = "Linux" ]; then
>   SENDMAIL="/usr/sbin/sendmail"
> else
>   SENDMAIL="/usr/lib/sendmail"
> fi
> 
> if [ ! -x $SENDMAIL ]; then
>     echo "$0: Could not find $SENDMAIL program"
>     exit 1
> fi
> 
> if [ $# -eq 2 ]; then
>   msg="`hostname` $1"
>   email_from_name="traffic server"
>   email_from_addr="nobody"
>   email_to_addr="admin"
> 
>   result=`(echo "From: $email_from_name <$email_from_addr>"; echo "To:
>   $email_to_addr"; echo "Subject: $msg"; echo; date) | $SENDMAIL -bm
>   $email_to_addr`
>   if [ "$result" = "" ]; then
>     echo
>     echo "[example_alarm_bin.sh] sent alarm: $msg";
>     echo
>     exit 0
>   else
>     echo
>     echo "[example_alarm_bin.sh] sendmail failed"
>     echo
>     exit 1
>   fi
> else
>   # give a little help
>   echo "Usage: example_alarm_bin.sh <message> [<email_from_name>
>   <email_from_addr> <email_to_addr>]"
>   exit
> 
> fi
> 
> 

-- 
Igor Galić

Tel: +43 (0) 664 886 22 883
Mail: i.galic@brainsware.org
URL: http://brainsware.org/
GPG: 6880 4155 74BD FD7C B515  2EA5 4B1D 9E08 A097 C9AE


Re: TS-2143 and 'it just works'

Posted by Leif Hedstrom <zw...@apache.org>.
On Aug 21, 2013, at 6:51 PM, 永豪 <yo...@taobao.com> wrote:

> I haven't go through many codes indeed, but TS-2143 make me sad, I love ATS because in most case, we can provide a working solution for your requirement. when you grow up, with big volumes, you will find out that ATS still work, and you may discover more features than your imagine. it is a good user experience compare to some other system, when you find out that the system builder is really 'user oriented'.


Hmmmm, this seems incredibly harsh. Attached is an alarms_example.sh script, that works exactly as the code used to do, except, it is less confusing. If you like, we can put that back in there.

Fwiw, I think this is much worse for the "newbies". If you don't know what you are doing, you will be sending emails from "nobody" to "admin", and you have no idea why. Best case scenario, the emails vanish. Worst case scenario, they will eventually fill up /var/log/mail.

I do agree with the review concerns, so if the consensus is to roll this back, let's do it fast.

-- leif

# Locate sendmail: /usr/sbin on Linux, /usr/lib elsewhere.
ostype=$(uname -s 2>/dev/null)
if [ "$ostype" = "Linux" ]; then
  SENDMAIL="/usr/sbin/sendmail"
else
  SENDMAIL="/usr/lib/sendmail"
fi

if [ ! -x "$SENDMAIL" ]; then
    echo "$0: Could not find $SENDMAIL program"
    exit 1
fi

# Exactly two arguments are expected; only the first (the message) is used.
if [ $# -eq 2 ]; then
  msg="$(hostname) $1"
  email_from_name="traffic server"
  email_from_addr="nobody"
  email_to_addr="admin"

  # Any output from sendmail is treated as a failure.
  result=$( (echo "From: $email_from_name <$email_from_addr>"; echo "To: $email_to_addr"; echo "Subject: $msg"; echo; date) | "$SENDMAIL" -bm "$email_to_addr")
  if [ "$result" = "" ]; then
    echo
    echo "[example_alarm_bin.sh] sent alarm: $msg";
    echo
    exit 0
  else
    echo
    echo "[example_alarm_bin.sh] sendmail failed"
    echo
    exit 1
  fi
else
  # give a little help, and report the usage error to the caller
  echo "Usage: example_alarm_bin.sh <message> [<email_from_name> <email_from_addr> <email_to_addr>]"
  exit 1

fi