Posted to dev@aurora.apache.org by ASF IRC Bot <as...@urd.zones.apache.org> on 2015/12/14 20:40:41 UTC

Summary of IRC Meeting in #aurora

Summary of IRC Meeting in #aurora at Mon Dec 14 19:02:40 2015:

Attendees: jsirois, jfarrell, jcohen, wfarner, Yasumoto, mkhutornenko, zmanji

- Preface
- Executor settings validation
  - Action: jcohen to file ticket to track executor input validation
- website fixes
- Opting out of health checks
- 0.11.0 release
- 0.10.0 deb vote
- mesos.executor python egg


IRC log follows:

## Preface ##
[Mon Dec 14 19:03:05 2015] <jcohen>: Hello everyone. It’s time for our weekly community meeting!
[Mon Dec 14 19:03:19 2015] <jcohen>: As a reminder: everyone is welcome (and encouraged) to participate
[Mon Dec 14 19:03:28 2015] <jcohen>: Let’s start off with our standard roll call…
[Mon Dec 14 19:03:32 2015] <jcohen>: here :)
[Mon Dec 14 19:03:36 2015] <wfarner>: Here
[Mon Dec 14 19:03:45 2015] <zmanji>: here
[Mon Dec 14 19:03:52 2015] <mkhutornenko>: here
[Mon Dec 14 19:04:02 2015] <Yasumoto>: howdy
[Mon Dec 14 19:04:13 2015] <jsirois>: here
## Executor settings validation ##
[Mon Dec 14 19:05:17 2015] <jcohen>: This came up as part of https://reviews.apache.org/r/41154/
[Mon Dec 14 19:05:54 2015] <jcohen>: We validate executor settings in the client when we submit jobs, but the executor itself does not validate those settings which could cause unexpected behavior if someone were to submit a job directly against the API.
[Mon Dec 14 19:06:09 2015] <jcohen>: Anyone have any thoughts on whether we should have the executor be more defensive in this regard?
[Mon Dec 14 19:06:30 2015] <wfarner>: Can you give an example?
[Mon Dec 14 19:06:53 2015] <jcohen>: An example of what? how things can break, or how we’d be more defensive?
[Mon Dec 14 19:07:05 2015] <wfarner>: Breakage
[Mon Dec 14 19:07:40 2015] <jcohen>: Sure. After the change in that review, you could configure a job with a shell healthcheck with no shell command specified.
[Mon Dec 14 19:08:14 2015] <jcohen>: so the first time the health checker would run, we’d crash with a message like ‘NoneType’ has no attribute split
[Mon Dec 14 19:08:28 2015] <jcohen>: ideally instead we’d just fail the task right away in that scenario
[Mon Dec 14 19:09:28 2015] <wfarner>: General +1 to executor doing input validation when input is first received
[Mon Dec 14 19:09:44 2015] <jcohen>: I can file a ticket to track that work
[Mon Dec 14 19:09:53 2015] <wfarner>: Thanks
[Mon Dec 14 19:10:48 2015] <jcohen>: #action jcohen to file ticket to track executor input validation
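
The failure mode discussed above (a shell health check with no command, crashing later on a `split` call against `None`) could be caught when the input is first received. The following is only a rough sketch of that idea, not the executor's actual code: the function name, the exception class, the plain-dict config, and the `shell_command` key are all hypothetical stand-ins chosen for illustration.

```python
import shlex


class InvalidHealthCheckConfig(ValueError):
    """Raised when a health check config is unusable (hypothetical name)."""


def validate_shell_health_check(config):
    """Validate a shell health check config up front and return its argv.

    `config` is a plain dict standing in for whatever the executor receives;
    the `shell_command` key is an assumption made for this sketch.
    """
    command = config.get("shell_command")
    # Reject a missing, non-string, or blank command before any check runs,
    # instead of hitting an AttributeError on None during the first check.
    if not isinstance(command, str) or not command.strip():
        raise InvalidHealthCheckConfig(
            "shell health check configured without a shell command")
    return shlex.split(command)
```

A caller would run this once at task start and fail the task immediately on `InvalidHealthCheckConfig`, rather than discovering the problem on the first health check interval.
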
## website fixes ##
[Mon Dec 14 19:11:54 2015] <wfarner>: Echoing my email to the dev list, I fixed up a bunch of old issues on the website.  Please cease being tolerant of broken links and the like!
[Mon Dec 14 19:12:17 2015] <jcohen>: Thanks wfarner!
[Mon Dec 14 19:12:20 2015] <wfarner>: We also have versions docs up
[Mon Dec 14 19:12:22 2015] <jfarrell>: +1, thanks wfarner
[Mon Dec 14 19:12:29 2015] <wfarner>: Versioned*
## Opting out of health checks ##
[Mon Dec 14 19:13:13 2015] <jcohen>: Another question that shook out of that same review…
[Mon Dec 14 19:13:38 2015] <jcohen>: It’s currently possible to set up health check config that is not used if your task does not bind a health port
[Mon Dec 14 19:13:47 2015] <jcohen>: With the new shell health checker, this will not be possible
[Mon Dec 14 19:13:55 2015] <jcohen>: Is this behavior that we’d like to codify?
[Mon Dec 14 19:14:06 2015] <jcohen>: Or should it just remain an undocumented side effect of http health checks?
[Mon Dec 14 19:15:16 2015] <mkhutornenko>: I think this will be easily fixable with the schema change I proposed there
[Mon Dec 14 19:15:17 2015] <wfarner>: Seems okay to me - giving a health check command and then disabling it seems weird
[Mon Dec 14 19:15:33 2015] <wfarner>: Or maybe I am misunderstanding
[Mon Dec 14 19:16:01 2015] <jcohen>: wfarner: from my perspective it’s nice to be able to easily opt out of health checks, e.g. for a devel version of a task.
[Mon Dec 14 19:17:01 2015] <wfarner>: So remove the health check command from that one, right?  Am I missing something?
[Mon Dec 14 19:17:04 2015] <jcohen>: A case can be made that the correct way to do that would simply be to not configure health checks on the devel task at all though. The only downside is that it leads to slightly more complex configuration.
[Mon Dec 14 19:17:12 2015] <jsirois>: `true` would accomplish this in the shell health check
[Mon Dec 14 19:17:17 2015] <jcohen>: It’s certainly more clear though
[Mon Dec 14 19:17:44 2015] <jcohen>: I just wanted to raise it as an inconsistency versus the current http health checker implementation.
[Mon Dec 14 19:18:12 2015] <wfarner>: I'd argue that's a bug in the http check handling
[Mon Dec 14 19:18:19 2015] <jcohen>: Arguably the correct solution is to have the executor fail if an http health checker is configured, but no health port is bound
[Mon Dec 14 19:18:40 2015] <wfarner>: Indeed
[Mon Dec 14 19:19:18 2015] <jcohen>: Do you think we should file a ticket and fix? Or just leave things as they are?
[Mon Dec 14 19:19:29 2015] <wfarner>: Anything else on that?
[Mon Dec 14 19:20:00 2015] <wfarner>: A ticket to keep it in mind sounds good
[Mon Dec 14 19:20:03 2015] <jcohen>: k
[Mon Dec 14 19:20:05 2015] <jcohen>: that’s all I’ve got
[Mon Dec 14 19:20:10 2015] <wfarner>: Thanks
[Mon Dec 14 19:20:12 2015] <jcohen>: any other topics?
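
The stricter behavior suggested in this topic (fail when an http health checker is configured but no health port is bound, and opt out by configuring no health check at all) might look roughly like the sketch below. It is an illustration under stated assumptions, not Aurora's implementation: the function name, the dict-shaped config with a `type` key, and the port name `health` are hypothetical.

```python
def check_http_health_config(health_check_config, bound_ports):
    """Return an error message for an unusable HTTP check, else None.

    Assumptions for this sketch: `health_check_config` is None or a dict
    with a `type` key, and `bound_ports` maps port names to port numbers.
    """
    if health_check_config is None:
        # No health check configured at all: the explicit way to opt out.
        return None
    if health_check_config.get("type") != "http":
        # Other checker types (e.g. shell) do not need a health port.
        return None
    if "health" not in bound_ports:
        return "http health check configured but no health port is bound"
    return None
```

Under this sketch a devel task opts out by omitting the health check config entirely, instead of relying on the missing port to silently disable it.
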
## 0.11.0 release ##
[Mon Dec 14 19:21:08 2015] <wfarner>: I'd like us to get 0.11.0 out ASAP.  Just a heads up that the big change is removing the client side updater
[Mon Dec 14 19:21:28 2015] <wfarner>: I plan to tackle that this week.
[Mon Dec 14 19:21:52 2015] <mkhutornenko>: wfarner: is this the only reason to release 0.11.0
[Mon Dec 14 19:21:57 2015] <wfarner>: There is also a mesos dep upgrade
[Mon Dec 14 19:22:06 2015] <zmanji>: the mesos dep upgrade is important
[Mon Dec 14 19:22:22 2015] <wfarner>: We are nearly 3 versions behind, catching up is important
[Mon Dec 14 19:23:36 2015] <mkhutornenko>: wfarner: why not just ship mesos dep upgrade by itself to catch up a bit?
[Mon Dec 14 19:24:35 2015] <wfarner>: mkhutornenko: the client updater has been kicked out 2 times now, we should make good on the deprecation and remove it
[Mon Dec 14 19:26:05 2015] <wfarner>: Anything else on this?  Happy to move to dev list if something is contentious here
## 0.10.0 deb vote ##
[Mon Dec 14 19:27:44 2015] <wfarner>: We have an outstanding package vote without enough binding votes to pass
[Mon Dec 14 19:28:17 2015] <wfarner>: I would greatly appreciate if folks could take a look and report back.  Packages are a boon to the project, but only if we release them
[Mon Dec 14 19:31:53 2015] <jcohen>: Any other topics?
## mesos.executor python egg ##
[Mon Dec 14 19:32:05 2015] <jcohen>: (I encourage everyone to vote btw!)
[Mon Dec 14 19:32:42 2015] <wfarner>: t3hSteve has done some great work to slim the system deps of the python egg used by the executor
[Mon Dec 14 19:33:20 2015] <wfarner>: I would like to suggest we explore ways to use this ASAP
[Mon Dec 14 19:33:46 2015] <wfarner>: This would mean rolling custom mesos eggs that we distribute
[Mon Dec 14 19:34:02 2015] <jcohen>: Is there any chance of getting Mesos to support this directly?
[Mon Dec 14 19:34:26 2015] <wfarner>: Likely, but it probably wouldn't land until 0.27
[Mon Dec 14 19:34:40 2015] <wfarner>: And that means 4 more releases for us
[Mon Dec 14 19:34:52 2015] <jcohen>: nod
[Mon Dec 14 19:35:18 2015] <wfarner>: For users, it means the slave machines require fewer deps, and the Aurora-docker story is significantly better
[Mon Dec 14 19:36:48 2015] <jcohen>: yep, sounds like a win overall. Would prefer we didn’t have to manage the eggs, but waiting 4 releases to consume seems like a long time.
[Mon Dec 14 19:37:22 2015] <wfarner>: I see the egg thing as near-noop since we already build and distribute them
[Mon Dec 14 19:38:25 2015] <wfarner>: Here is the patch on the mesos side, which we would need to back port: https://reviews.apache.org/r/41049/
[Mon Dec 14 19:39:25 2015] <wfarner>: That's all for me
[Mon Dec 14 19:39:33 2015] <jcohen>: Thanks wfarner
[Mon Dec 14 19:39:39 2015] <jcohen>: Anyone else have any topics?
[Mon Dec 14 19:40:11 2015] <jcohen>: Alright then, thanks everyone, have a great week!
[Mon Dec 14 19:40:20 2015] <wfarner>: ASFBot: meeting stop


Meeting ended at Mon Dec 14 19:40:20 2015