Posted to dev@brooklyn.apache.org by Alex Heneveld <al...@cloudsoft.io> on 2022/08/24 15:44:17 UTC

Brooklyn Feature Proposal - Declarative and Retryable Workflow

Hi folks,

I'd like Apache Brooklyn to allow more sophisticated workflow to be written
in YAML.

As many of you know, we have a powerful task framework in java, but only a
very limited subset is currently exposed via YAML.  I think we could
generalize this without a mammoth effort, and get a very nice way for users
to write complex effectors, sensor feeds, etc, directly in YAML.

At [1] please find details of the proposal.

This includes the ability to branch and retry on error.  It can also give
us the ability to retry/resume on an Apache Brooklyn server failover.

Comments welcome!

Best
Alex


[1]
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

it looks interesting indeed.

Thanks !
Regards
JB

On Thu, Aug 25, 2022 at 11:34 AM Duncan Grant <du...@gmail.com> wrote:
>
> Alex,
>
> +1
>
> I think these changes, combined with the recent docker sensor/effector
> changes from Iuliana Cosmina, massively reduce the need to
> drop out of yaml into Java.  This is a win a) by reducing the barrier to
> entry for the average sys admin who is used to just getting things
> done without the need to compile code and b) by capturing the logic of a
> blueprint more clearly in the blueprint itself.
>
> I've seen Apache Brooklyn users attempt to implement some of these features
> within their local organisation a few times.  So it really makes
> sense that we make it a core, fully developed, feature.  I think
> replayability and retryability are really nice additions as well.
>
> thanks
>
> Duncan
>
> On Wed, 24 Aug 2022 at 18:57, Geoff Macartney <ge...@apache.org> wrote:
>
> > Hi Alex,
> >
> > This looks very interesting - I've just glanced through it so far but will
> > try to read it in detail soon. I'll certainly be very interested to hear
> > what everyone thinks.
> >
> > Cheers
> > Geoff
> >
> >
> >
> > On Wed, 24 Aug 2022 at 16:44, Alex Heneveld <al...@cloudsoft.io> wrote:
> >
> > > Hi folks,
> > >
> > > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> > written
> > > in YAML.
> > >
> > > As many of you know, we have a powerful task framework in java, but only
> > a
> > > very limited subset is currently exposed via YAML.  I think we could
> > > generalize this without a mammoth effort, and get a very nice way for
> > users
> > > to write complex effectors, sensor feeds, etc, directly in YAML.
> > >
> > > At [1] please find details of the proposal.
> > >
> > > This includes the ability to branch and retry on error.  It can also give
> > > us the ability to retry/resume on an Apache Brooklyn server failover.
> > >
> > > Comments welcome!
> > >
> > > Best
> > > Alex
> > >
> > >
> > > [1]
> > >
> > >
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> > >
> >

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Duncan Grant <du...@gmail.com>.
Alex,

+1

I think these changes, combined with the recent docker sensor/effector
changes from Iuliana Cosmina, massively reduce the need to
drop out of yaml into Java.  This is a win a) by reducing the barrier to
entry for the average sys admin who is used to just getting things
done without the need to compile code and b) by capturing the logic of a
blueprint more clearly in the blueprint itself.

I've seen Apache Brooklyn users attempt to implement some of these features
within their local organisation a few times.  So it really makes
sense that we make it a core, fully developed, feature.  I think
replayability and retryability are really nice additions as well.

thanks

Duncan

On Wed, 24 Aug 2022 at 18:57, Geoff Macartney <ge...@apache.org> wrote:

> Hi Alex,
>
> This looks very interesting - I've just glanced through it so far but will
> try to read it in detail soon. I'll certainly be very interested to hear
> what everyone thinks.
>
> Cheers
> Geoff
>
>
>
> On Wed, 24 Aug 2022 at 16:44, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> > Hi folks,
> >
> > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> written
> > in YAML.
> >
> > As many of you know, we have a powerful task framework in java, but only
> a
> > very limited subset is currently exposed via YAML.  I think we could
> > generalize this without a mammoth effort, and get a very nice way for
> users
> > to write complex effectors, sensor feeds, etc, directly in YAML.
> >
> > At [1] please find details of the proposal.
> >
> > This includes the ability to branch and retry on error.  It can also give
> > us the ability to retry/resume on an Apache Brooklyn server failover.
> >
> > Comments welcome!
> >
> > Best
> > Alex
> >
> >
> > [1]
> >
> >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> >
>

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex,

This looks very interesting - I've just glanced through it so far but will
try to read it in detail soon. I'll certainly be very interested to hear
what everyone thinks.

Cheers
Geoff



On Wed, 24 Aug 2022 at 16:44, Alex Heneveld <al...@cloudsoft.io> wrote:

> Hi folks,
>
> I'd like Apache Brooklyn to allow more sophisticated workflow to be written
> in YAML.
>
> As many of you know, we have a powerful task framework in java, but only a
> very limited subset is currently exposed via YAML.  I think we could
> generalize this without a mammoth effort, and get a very nice way for users
> to write complex effectors, sensor feeds, etc, directly in YAML.
>
> At [1] please find details of the proposal.
>
> This includes the ability to branch and retry on error.  It can also give
> us the ability to retry/resume on an Apache Brooklyn server failover.
>
> Comments welcome!
>
> Best
> Alex
>
>
> [1]
>
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
>

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex,

Thanks for the detailed response. Mine in return:

> DSL:  I like this idea.  I think it could be built on incremental ...
> So I would suggest people can evolve the DSL in parallel and this shouldn't
> block implementation of the YAML proposal -- assuming we are agreed on the
> functionality and flow.

Sounds good to me.

> STEPS:  I went back and forth between the 3 options you list, (a) a list of
> ... I settled on (c) because it has the
> conciseness of (a) plus enforced labelling for readability and options for
> flow-control (next) as well as extensibility, without the cumbersomeness of
> (b) and the restrictions on extensibility it implies. ...
> including TOSCA ... However it did not have the "next" or "condition"
> options that you raise questions about.
> ... etc.

Fair enough, if you have tried out these options then you'll know what
feels most natural at the moment anyway. Your later comment "The
proposed model sees the workflow as a flat set of steps" helps
conceptualise the model (let's highlight that in docs), and in this
flat case I suppose (c) strikes a balance between the endpoints (a)
and (b). But see below on Next.

> CONDITION:  Again I played with `if: { ..., then: { ... } }` and reached
> the conclusion it's not a nice YAML experience

Agree, particularly with the flat model of workflow.

> NEXT:  This is probably the aspect I'm least confident of.  But in my
> exploration it feels good, IMO better than the alternatives.  Geoff you
> bring Dijkstra's "GOTO Considered Harmful" statement to mind -- the
> argument being that goto can encourage developers to make a messy program,
> (ab)using it when there are other better control structures...

yes, I do still have concerns about this - fewer, now that I'm
considering the workflow as a flat model, but nevertheless still some.
I do think that Next aka GOTO could be (ab)used to create a spaghetti
bowl of control flow which could make life difficult for users. But
likely anyone writing workflows will be aware of this and will
exercise a bit of judgement accordingly as they design the workflows,
so it may not be a problem too often in practice.

Even then there is still a related concern that it could complicate
the implementation of workflow support, e.g. it might make it
difficult to implement the workflow tracking sensors to provide good
support for the arbitrary control "flow" that Next permits.
"Hopscotch" more than "flow", perhaps.

But let's see how it goes. I'm sure we can publicise workflows as a
"beta" feature at first and reserve for ourselves the right to come
back and revisit these design decisions if needed after we've had some
experience with how it works.

> LET:  +1 to this suggestion as a shorthand for the cumbersome

grand

> LOOPS:  There are three main ways loops could be done; one is with
> ... For the reasons under NEXT, I think a "while" statement at this level is not wanted

That's fine. The examples added in the document are helpful.
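
For anyone reading without the document to hand, the first of those three
approaches (an explicit nested workflow running over a target) might look
roughly like the following.  This is an illustrative sketch only: the step
IDs and the `${target}` reference are assumptions, and the exact keys are
for the document to define.

```yaml
steps:
  1-for-each:
    # explicit nested workflow, run once per value of the target 1..10
    type: workflow
    target: 1..10
    steps:
      1-report:
        # ${target} is a hypothetical way to refer to the current value
        ssh: echo processing item ${target}
```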

> REUSE:  As for adding a workflow to the catalog, I hadn't fleshed this out
> ... I've added a sketch of this ...

That sketch looks good. I'm sure this will work and be a useful way
(probably the "standard" way) to do workflows.
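
A rough picture of the catalog side, for illustration only:
`brooklyn.catalog`, `items`, `id` and `item` are the usual catalog.bom
keys, while the `workflow` type, the `parameters` block on it, the
`${bucket}` and `${1-submit.stdout}` references and the step IDs are
assumptions based on the description in this thread.

```yaml
brooklyn.catalog:
  version: "1.0.0"
  items:
    # a reusable workflow type with one declared input and a default value
    - id: my-task
      item:
        type: workflow
        parameters:
          - name: bucket
            default: my-bucket
        steps:
          1-submit:
            type: container
            image: my/google-cloud
            command: gcloud dataproc jobs submit spark --BUCKET=gs://${bucket}
            on-error: retry
    # an extension of that type; its steps are merged into the inherited
    # ones by ID (this ID is new, so the step is added after 1-submit)
    - id: my-extended-task
      item:
        type: my-task
        steps:
          2-record:
            set-sensor: spark-output=${1-submit.stdout}
```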

> TESTS:  Excellent observation.  This would be absolutely transformative for
> the brooklyn tests mechanism.

It will be good to watch this evolve.

> INCREMENTAL IMPLEMENTATION:  An initial set of workflow using a small
> subset of types would be the starting point,

Sounds good - it will give us the chance to see how the basics of the
idea work. I'm sure it will already be a lot of work as even this will
have to support much of the envisioned underlying mechanism, e.g. the
workflow tracking sensors (or will it? I suppose an alpha release
could be done without persistence).

> WHO:  I'm maybe best placed to start with the initial set of workflow,

I would say so :-)

> and then I'd welcome collaboration on everything else, much of which is parallelizable.

sounds good.

OTHER THOUGHTS

A copy/paste of the Google Doc would make a great starting point for
the Brooklyn Docs on this topic.

CONCLUSION

I think you've addressed most of my concerns, and in any case I do
think the proposal is a good thing at a high level. +1. Let's go for
it and see where we get to.

Cheers
Geoff

On Wed, 31 Aug 2022 at 12:52, Alex Heneveld <al...@cloudsoft.io> wrote:
>
> Geoff, Peter, all --
>
> Excellent input.  My thoughts:
>
> DSL:  I like this idea.  I think it could be built on incremental
> improvement atop the YAML proposal.  I think by starting with YAML first we
> get some advantages:  we get a model (the YAML maps) that we are used to
> parsing and working with in Apache Brooklyn, including type assist in the
> composer text editor, and a GUI to show and even write workflow wouldn't be
> a huge undertaking (as composer converts between YAML text and graphical
> already).  This is a declarative model that tools can easily use at design
> time, and also a runtime model that we can process and show progress to a
> user.  With a DSL there are some extra steps, as it would have to be
> converted into some form of model to work with (probably with source line
> number mappings so we can map back).  (I would NOT want us to lose the
> ability to reason about and show the execution, so am reluctant to lean too
> much on the DSL execution of eg things like Groovy.)  And I think a DSL is
> additive -- eg the following based on the first example in the doc:
>
>    container image my/google-cloud
>
>       command "gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}"
>
>       env { BUCKET: $brooklyn:config("bucket") }
>
>       on-error retry;
>
>     set-sensor spark-output=${1.stdout}
>
>
> would get processed into something in order to execute it ... and the model
> that corresponds to the YAML seems as good a candidate as any.  (I'm not
> sure I like that DSL, but the point applies to most any DSL.)
>
> So I would suggest people can evolve the DSL in parallel and this shouldn't
> block implementation of the YAML proposal -- assuming we are agreed on the
> functionality and flow.
>
>
> STEPS:  I went back and forth between the 3 options you list, (a) a list of
> steps with no IDs, (b) a map of IDs where every step must say what to do
> next, and (c) the current proposal which is (b) plus a default
> numero-alphabetic ordering.  I settled on (c) because it has the
> conciseness of (a) plus enforced labelling for readability and options for
> flow-control (next) as well as extensibility, without the cumbersomeness of
> (b) and the restrictions on extensibility it implies.  In terms of related
> work the "init.d" mechanism seemed a good one because of the readability
> and extensibility it gives (in my experience).  And I've actually used
> something similar in some AB projects including TOSCA and it has worked
> well in practice.  (However it did not have the "next" or "condition"
> options that you raise questions about.)
>
> Aside:  if we're asking people to write and read workflow in YAML, even
> with plans for graphical tools and/or a DSL atop it, then I think the
> emphasis should be on making it as natural as possible.  To Peter's point
> there will always be a bit of a disconnect but I've looked at places where
> it has been done well -- eg CircleCI -- and places where it hasn't been (no
> names, but you know extremely verbose YAML) and tried to emulate the
> former.  Explicit IDs force the user to do a bit of documentation, and make
> it easier to correlate subsequent behaviour or graphical components.
> Having good default behaviour so keys can be omitted, and having a
> shorthand for common tasks, these also make a big difference to authoring.
> With (b), requiring a "next" all the time, we need an extra line in many
> steps, and we lose the ability to support the shorthand.
>
> CONDITION:  Again I played with `if: { ..., then: { ... } }` and reached
> the conclusion it's not a nice YAML experience (and gets worse with
> "else"!).  Ugly to write and ugly to work with.  I've used the `condition`
> block in CircleCI and elsewhere to achieve this and thought it was a more
> pleasant experience -- the best I came across.  Pleasant is relative of
> course -- and this I think is an area where DSL and visual will in time
> help even more.  A DSL could express `if ... then ... elseif ... else ... `
> in a more natural way, and easily generate the YAML.  Visually, for any
> step with a condition I would see the route(s) in to it branching, with
> the condition shown on the branch leading to the step, the else branch
> leading to the next step in the ordering, and a line after the conditional
> step going to the next step (or to a different step if `next` is explicitly
> indicated).  But if we're starting with YAML then a condition keeps the
> model flat and simpler.
>
> NEXT:  This is probably the aspect I'm least confident of.  But in my
> exploration it feels good, IMO better than the alternatives.  Geoff you
> bring Dijkstra's "GOTO Considered Harmful" statement to mind -- the
> argument being that goto can encourage developers to make a messy program,
> (ab)using it when there are other better control structures.  Except when
> coming from YAML most of the possible better control structures give a
> worse UX to write, in my experience -- we end up with the `if: { ..., then:
> ... }` or introduce a similar `while: { condition: { ... }, do: { ... } }`
> -- both of which I think introduce more cognitive load.
>
> Aside:  "NESTING considered harmful".  In my experience, requiring a lot of
> nesting within commands also makes the UX unpleasant -- speaking from a
> YAML perspective -- and makes the workflow harder to reason about.  As soon
> as things are nested, either in `if` or `while` or nested `workflow` then
> I've got something more complicated to work with -- visually it is hard to
> read (note I am speaking explicitly about YAML; if we layer a DSL on top I
> would embrace nesting and eschew goto in that), with implicit context/depth to
> consider when resolving labels and execution flow.  The proposed model sees
> the workflow as a flat set of steps, so code and user can know exactly
> where in execution something is occurring, based on its ID.  The only
> nesting is an explicit nested workflow which establishes a sub-context.  If
> we have lightweight nesting a la if or while then we lose this runtime
> clarity.  (To be clear, for a DSL written on top of this, I would take a
> very different view of this!)
>
> LET:  +1 to this suggestion as a shorthand for the cumbersome
> `set-workflow-variable`.  Improves readability.  I could see some simple
> maths being supported as well here.
>
> LOOPS:  There are three main ways loops could be done; one is with
> explicit nested workflow running over a target eg 1..10; another is
> combining a condition with an increment step and a "next" entry to repeat
> the loop while the condition is satisfied; and the final one is using the
> special "retry" type (especially suited to time-based retries, waiting for
> some event or correcting errors).  For the reasons under NEXT, I think a
> "while" statement at this level is not wanted -- but I think it would be
> good at the DSL level.  I've added some worked examples in the document at
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.1af8dhexg3n5
> .
>
> REUSE:  As for adding a workflow to the catalog, I hadn't fleshed this out
> except to satisfy myself that we could declare a new type "my-task" in a
> catalog.bom, have it extend { type: workflow } and declare { steps: { ... }
> }.  The type "my-task" could then be used in any workflow step or in an
> effector definition which wants steps.  The new type could itself be
> extended, with Map.putAll done to merge steps (inserting new and
> overwriting existing labels).  I think there will be a need to declare
> inputs and to specify inputs as part of this, and that side of things I
> hadn't thought about but assumed we'd be able to use the `parameters`
> syntax we use elsewhere along with default values.  I've added a sketch of
> this at
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.rl70czdae7la
> .
>
> TESTS:  Excellent observation.  This would be absolutely transformative for
> the brooklyn tests mechanism.  Currently that allows test scripts to be
> written in a DSL a bit like the one described here.  The existing
> assertions and methods there could be implemented as workflow steps so that
> writing tests and viewing and re-running tests all become much nicer.
>
>
> As for how to proceed:
>
> INCREMENTAL IMPLEMENTATION:  An initial set of workflow using a small
> subset of types would be the starting point, and allowing this to be used
> to declare sensors and effectors and policies.  I would then see most of
> the other tasks able to proceed in parallel, with the exception of the UI
> which will need the workflow metadata-as-sensors fleshed out a bit more.
>
> WHO:  I'm maybe best placed to start with the initial set of workflow, but
> I'd aim to do that pretty quickly keeping it minimal per ^, and then I'd
> welcome collaboration on everything else, much of which is parallelizable.
> I will update to this list when there is something to look at for the first
> set.  You're right I'm raring to go but I'm also very happy to have others
> do big swathes of it.  I'm also really grateful for the comments, so keep
> them coming.  It'll be a few days before I start, so please LMK if ^^^
> makes sense!
>
>
> Best
> Alex
>
>
>
>
>
>
> On Tue, 30 Aug 2022 at 23:22, Geoff Macartney <ge...@apache.org> wrote:
>
> > Hi Alex et al.
> >
> > Here are some thoughts on the proposal.
> >
> > Cheers
> > Geoff
> >
> > # General thoughts
> >
> > 1. Adding a procedural "sub-language" like this to Brooklyn could give
> > it a whole new level of capability, which is very exciting.
> > 2. At the same time this capability could add a whole new dimension of
> > complexity and difficulty, so I think it will be very important to
> > make sure the implementation and tooling give users lots of support
> > for traceability and debugging. It's good to see the emphasis put on
> > this in "The Other Big Idea: Traceability and Recoverability",
> > hopefully the implementation will also emphasise it.
> > 3. "A graphical workflow visualiser is not planned at this time" is
> > understandable but I feel probably the more support we can add for
> > this in the UI the better the chances of it succeeding.
> > 4. Peter's suggestion of a DSL could potentially simplify some of the
> > considerations below, what do you think of that suggestion?
> >
> > # About the workflow language
> >
> > 1. I find the document currently confusing in terms of what sort of
> > language it is describing - is it a workflow, typically expressed as a
> > directed graph of nodes (tasks) or a sequence of tasks (which would be
> > rather more like a procedural language)?
> > 2. if it is a workflow then I'd have thought there shouldn't be any
> > notion of ordering ("Step References, Numero-Alphabetic Ordering and
> > Extensibility"). Ordering should be defined only by the graph (value
> > of "Next" field, and the ids) in this case, don't you think?
> > 3. if it is a sequence of tasks then I think the current mechanism of
> > ids (the map keys) and Next is awkward. `1-2-http-request` and Next
> > puts me in mind of BASIC, with Next playing the role of GOTO. In this
> > case I think it has the potential of introducing the same problems as
> > GOTO, and might better be done without. Rather it might be preferable
> > to express the workflow as an array of steps (sequencing), with
> > support for selection (`condition`/`if`), iteration (maybe consider
> > introducing `while`? see question below about iteration), and
> > "functions" (independently defined named workflows, either in the
> > catalog or elsewhere in the blueprint). Ids would be optional, only
> > required for steps whose results are referenced elsewhere, not as
> > sequencers or labels for Next:
> >
> > ```yaml
> > steps:
> >   - id: spark-job
> >     type: container
> >     image: my/google-cloud
> >     command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
> >     env:
> >       BUCKET: $brooklyn:config("bucket")
> >     on-error: retry
> >   - set-sensor: spark-output=${spark-job.stdout}
> > ```
> >
> > 4. This might also work well with a slightly different markup for
> > `condition` (which might be nicer as `if`?), adding a `then` defining
> > a (sub) sequence of steps:
> >
> > ```yaml
> > steps:
> >   - if:
> >       target: ${scratch.skip_date}
> >       not: { equals: true }
> >       then:
> >         - ssh: echo today is `date`
> >         - <other commands in here...>
> > ```
> > we could also add `else`.
> >
> > 5. Can you add some more details about how you see iteration in the
> > document - there is that section "Multiple targets and looping" but it
> > doesn't really have examples of the latter. Could we introduce a
> > `while` construct?
> >
> > 6. How about renaming `set-workflow-variable` to a simpler `let` or
> > `set`? You don't actually give an example of its use, is it something
> > like the following? (I'm imagining a convention for a one line
> > definition for convenience, with variable name followed by keyword
> > `be` to indicate assignment to what follows):
> >
> > ```yaml
> > steps:
> >   - let: my-scratch be false
> >   - if:
> >       target: whatever
> >       equals: something
> >       then:
> >         - let: my-scratch be true
> > ```
> >
> > 7. Could you add some more detail about how independent workflows can
> > be defined, e.g. in the catalog (or as a separate workflow in the
> > blueprint?) Can these be parameterised, like function definitions? The
> > use of `parameters` in the example in section "Request/State Unique
> > Identifiers" isn't clear to me.
> >
> > 8. Would you foresee adding any support for testing all this to the
> > Brooklyn tests mechanism? Might be valuable.
> >
> > # About how to organise implementing all this
> >
> > 1. Can we sequence work on this to do the simpler bits first and get
> > experience with how it works before proceeding to more advanced things
> > like nested workflows with per-target conditions ("Multiple Targets
> > and looping"). I would particularly like to hope that the UI side of
> > things can progress in step with the workflow definitions, rather than
> > leave all the UI work to the end.
> > 2. (How) could the work for this be spread across the community? I've
> > no doubt you're raring to go on this but it would be good if it didn't
> > all fall on your shoulders!
> >
> >
> > On Mon, 29 Aug 2022 at 23:51, Geoff Macartney <ge...@gmail.com>
> > wrote:
> > >
> > > Hi Alex,
> > >
> > > I've done a first pass on the document, and it's very impressive. Adding
> > a procedural "sub-language" like this to Brooklyn could give it a whole new
> > level of capability, which is very exciting. I have some thoughts on some
> > of the details proposed which I will try to write up this week.
> > >
> > > I share the concerns about YAML which I think Peter expressed very well.
> > His suggestion of a DSL instead of YAML is interesting and I think would be
> > worth considering. I also have some reservations about some of the
> > constructs you're proposing (well, at least one of them) and some perhaps
> > relatively minor suggestions for changes in structure. My bigger concern is
> > that adding a new programming language within Blueprints like this could
> > add a whole new dimension of complexity. I'm asking myself, "how would I
> > debug this" when things go wrong. I think that's worth some discussion as
> > much as the details of the language. There are also points where I simply
> > have questions and would like some more detail.
> > >
> > > I'll try to get more detailed thoughts written up this week.
> > >
> > > Cheers
> > > Geoff
> > >
> > >
> > >
> > > On Sat, 27 Aug 2022 at 00:05, Peter Abramowitsch <
> > pabramowitsch@gmail.com> wrote:
> > >>
> > >> Hi Alex,
> > >> I haven't been involved with the Brooklyn team for a long while so take
> > >> this suggestion with as little or as much importance as you see at face
> > >> value.   Your proposal for a richer specification language to guide
> > >> realtime behavior is much appreciated and I think it is a great idea.
> > >> You've obviously thought very deeply as to how it could be applied in
> > >> different areas of a blueprint.
> > >>
> > >> My one comment is whether going for a declarative solution, especially
> > one
> > >> based on YAML is optimal.  Sure Yaml is well known, easy to eyeball,
> > but it
> > >> has two drawbacks that make me wonder if it is the best platform for
> > your
> > >> idea.  The first is that it is a format-based language.  Working in
> > large
> > >> infrastructure projects, small errors can have disastrous consequences,
> > so
> > >> as little as a missing or extra tab could result in destroying a data
> > >> resource or bringing down a complex system.   The other, more
> > philosophical
> > >> comment has to do with the clumsiness of describing procedural concepts
> > in
> > >> a declarative language.  (anyone have fun with XSL doing anything
> > >> significant?)
> > >>
> > >> So my suggestion would be to look into DSLs instead of Yaml.  Very nice
> > >> ones can be created with little effort in Ruby, Python, JS - and even
> > Java.
> > >> In addition to having the language's own interpreter check the syntax
> > for
> > >> you, you get lots of freebies such as being able to do line by line
> > >> debugging - and of course the obvious advantage that there is no code
> > layer
> > >> between the DSL and its implementation, whereas with Yaml, someone
> > needs to
> > >> write the code that converts the grammar into behavior, catch errors
> > etc.
> > >>
> > >> What do you think?
> > >>
> > >> Peter
> > >>
> > >> On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <al...@cloudsoft.io>
> > wrote:
> > >>
> > >> > Hi folks,
> > >> >
> > >> > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> > written
> > >> > in YAML.
> > >> >
> > >> > As many of you know, we have a powerful task framework in java, but
> > only a
> > >> > very limited subset is currently exposed via YAML.  I think we could
> > >> > generalize this without a mammoth effort, and get a very nice way for
> > users
> > >> > to write complex effectors, sensor feeds, etc, directly in YAML.
> > >> >
> > >> > At [1] please find details of the proposal.
> > >> >
> > >> > This includes the ability to branch and retry on error.  It can also
> > give
> > >> > us the ability to retry/resume on an Apache Brooklyn server failover.
> > >> >
> > >> > Comments welcome!
> > >> >
> > >> > Best
> > >> > Alex
> > >> >
> > >> >
> > >> > [1]
> > >> >
> > >> >
> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> > >> >
> >

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Alex Heneveld <al...@cloudsoft.io>.
Geoff, Peter, all --

Excellent input.  My thoughts:

DSL:  I like this idea.  I think it could be built on incremental
improvement atop the YAML proposal.  I think by starting with YAML first we
get some advantages:  we get a model (the YAML maps) that we are used to
parsing and working with in Apache Brooklyn, including type assist in the
composer text editor, and a GUI to show and even write workflow wouldn't be
a huge undertaking (as composer converts between YAML text and graphical
already).  This is a declarative model that tools can easily use at design
time, and also a runtime model that we can process and show progress to a
user.  With a DSL there are some extra steps, as it would have to be
converted into some form of model to work with (probably with source line
number mappings so we can map back).  (I would NOT want us to lose the
ability to reason about and show the execution, so am reluctant to lean too
much on the DSL execution of eg things like Groovy.)  And I think a DSL is
additive -- eg the following based on the first example in the doc:

   container image my/google-cloud

      command "gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}"

      env { BUCKET: $brooklyn:config("bucket") }

      on-error retry;

    set-sensor spark-output=${1.stdout}


would get processed into something in order to execute it ... and the model
that corresponds to the YAML seems as good a candidate as any.  (I'm not
sure I like that DSL, but the point applies to most any DSL.)

So I would suggest people can evolve the DSL in parallel and this shouldn't
block implementation of the YAML proposal -- assuming we are agreed on the
functionality and flow.
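
As a rough illustration of that target model (a sketch, not taken from the
proposal document), the DSL fragment above might be processed into a step
map along these lines.  The keys are the ones used elsewhere in this
thread; the bare numeric step IDs are an assumption, chosen so that
`${1.stdout}` refers to the output of the first step.

```yaml
steps:
  # step 1: run the job in a container, retrying if it fails
  1:
    type: container
    image: my/google-cloud
    command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
    env:
      BUCKET: $brooklyn:config("bucket")
    on-error: retry
  # step 2: publish the output of step 1 as a sensor
  2:
    set-sensor: spark-output=${1.stdout}
```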


STEPS:  I went back and forth between the 3 options you list, (a) a list of
steps with no IDs, (b) a map of IDs where every step must say what to do
next, and (c) the current proposal which is (b) plus a default
numero-alphabetic ordering.  I settled on (c) because it has the
conciseness of (a) plus enforced labelling for readability and options for
flow-control (next) as well as extensibility, without the cumbersomeness of
(b) and the restrictions on extensibility it implies.  In terms of related
work the "init.d" mechanism seemed a good one because of the readability
and extensibility it gives (in my experience).  And I've actually used
something similar in some AB projects including TOSCA and it has worked
well in practice.  (However it did not have the "next" or "condition"
options that you raise questions about.)
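
To make option (c) concrete, here is a rough sketch of a step map relying on
the default ordering.  The numeric prefixes, the `sensor` and `value` keys and
the `${10-submit-job.stdout}` reference form are illustrative assumptions, not
syntax fixed by the proposal; only the container step fields come from the
first example in the doc.

```yaml
# Hypothetical sketch: steps keyed by ID and run in key order by default.
steps:
  10-submit-job:
    type: container
    image: my/google-cloud
    command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
    env:
      BUCKET: $brooklyn:config("bucket")
    on-error: retry
  20-publish-output:
    type: set-sensor
    sensor: spark-output                 # assumed key name
    value: ${10-submit-job.stdout}       # assumed reference form
```

Because the keys are ordered, an extension could add a step keyed eg
`15-check-output` and have it run between the two, without either existing
step needing a "next".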

Aside:  if we're asking people to write and read workflow in YAML, even
with plans for graphical tools and/or a DSL atop it, then I think the
emphasis should be on making it as natural as possible.  To Peter's point
there will always be a bit of a disconnect but I've looked at places where
it has been done well -- eg CircleCI -- and places where it hasn't been (no
names, but you know extremely verbose YAML) and tried to emulate the
former.  Explicit IDs force the user to do a bit of documentation, and make
it easier to correlate subsequent behaviour or graphical components.
Having good default behaviour so keys can be omitted, and having a
shorthand for common tasks, these also make a big difference to authoring.
With (b), requiring a "next" all the time, we need an extra line in many
steps, and we lose the ability to support the shorthand.

CONDITION:  Again I played with `if: { ..., then: { ... } }` and reached
the conclusion it's not a nice YAML experience (and gets worse with
"else"!).  Ugly to write and ugly to work with.  I've used a `condition`
block in CircleCI and elsewhere to achieve this and found it a more
pleasant experience -- the best I came across.  Pleasant is relative of
course -- and this I think is an area where DSL and visual will in time
help even more.  A DSL could express `if ... then ... elseif ... else ...`
in a more natural way, and easily generate the YAML.  Visually, for any step
with a condition I would expect to see the route(s) into it branching, with
the condition shown on the branch leading to the step, the else branch
leading to the next step in the ordering, and a line after the conditional
step going to the next step (or to a different step if `next` is explicitly
indicated).  But if we're starting with YAML then a condition keeps the
model flat and simpler.
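
For comparison with the `if`/`then` form in Geoff's point 4, a flat version
might look roughly like this.  The step IDs and the `command` key are
assumptions for illustration; the `target` / `not` / `equals` shape is copied
from Geoff's example rather than from any settled syntax.

```yaml
# Hypothetical sketch: the condition is a key on the step itself, no nesting.
steps:
  10-print-date:
    type: ssh
    command: echo today is `date`
    condition:                      # step is skipped when this is not met
      target: ${scratch.skip_date}
      not: { equals: true }
  20-carry-on:
    type: ssh
    command: echo continuing        # runs either way, as the next step in order
```

Both steps stay at the same depth, so "where are we" at runtime is always
just a step ID.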

NEXT:  This is probably the aspect I'm least confident of.  But in my
exploration it feels good, IMO better than the alternatives.  Geoff you
bring Dijkstra's "GOTO Considered Harmful" statement to mind -- the
argument being that goto can encourage developers to make a messy program,
(ab)using it when there are other better control structures.  Except when
coming from YAML most of the possible better control structures give a
worse UX to write, in my experience -- we end up with the `if: { ..., then:
... }` or introduce a similar `while: { condition: { ... }, do: { ... } }`
-- both of which I think introduce more cognitive load.

Aside:  "NESTING considered harmful".  In my experience, requiring a lot of
nesting within commands also makes the UX unpleasant -- speaking from a
YAML perspective -- and makes the workflow harder to reason about.  As soon
as things are nested, either in `if` or `while` or nested `workflow` then
I've got something more complicated to work with -- visually it is hard to
read (note I am speaking explicitly about YAML; if we layer a DSL on top I
would embrace nesting and eschew goto in that), and there is implicit
context/depth to consider when resolving labels and execution flow.  The proposed model sees
the workflow as a flat set of steps, so code and user can know exactly
where in execution something is occurring, based on its ID.  The only
nesting is an explicit nested workflow which establishes a sub-context.  If
we have lightweight nesting a la if or while then we lose this runtime
clarity.  (To be clear, for a DSL written on top of this, I would take a
very different view of this!)

LET:  +1 to this suggestion as a shorthand for the cumbersome
`set-workflow-variable`.  Improves readability.  I could see some simple
maths being supported as well here.

LOOPS:  There are three main ways loops could be done; one is with
explicit nested workflow running over a target eg 1..10; another is
combining a condition with an increment step and a "next" entry to repeat
the loop while the condition is satisfied; and the final one is using the
special "retry" type (especially suited to time-based retries, waiting for
some event or correcting errors).  For the reasons under NEXT, I think a
"while" statement at this level is not wanted -- but I think it would be
good at the DSL level.  I've added some worked examples in the document at
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.1af8dhexg3n5
.
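
As a rough illustration of the second approach (condition plus increment plus
"next"), and not a copy of the worked examples in the document: the
`variable` / `value` keys, the `no-op` step type, the `less-than` predicate
and the arithmetic in `value` are all assumptions made up for this sketch.

```yaml
# Hypothetical sketch: a counted loop built from flat steps.
steps:
  10-init:
    type: set-workflow-variable
    variable: count
    value: 0
  20-work:
    type: ssh
    command: echo attempt ${count}
  30-increment:
    type: set-workflow-variable
    variable: count
    value: ${count} + 1             # assumes simple maths is supported
  40-repeat:
    type: no-op                     # assumed placeholder step type
    condition:
      target: ${count}
      less-than: 3                  # assumed predicate name
    next: 20-work                   # jump back while the condition holds
```

Under these assumed semantics `20-work` would run three times; once the
condition fails, `40-repeat` is skipped and the workflow falls off the end.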

REUSE:  As for adding a workflow to the catalog, I hadn't fleshed this out
except to satisfy myself that we could declare a new type "my-task" in a
catalog.bom, have it extend { type: workflow } and declare { steps: { ... }
}.  The type "my-task" could then be used in any workflow step or in an
effector definition which wants steps.  The new type could itself be
extended, with Map.putAll done to merge steps (inserting new and
overwriting existing labels).  I think there will be a need to declare
inputs and to specify inputs as part of this, and that side of things I
hadn't thought about but assumed we'd be able to use the `parameters`
syntax we use elsewhere along with default values.  I've added a sketch of
this at
https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit#heading=h.rl70czdae7la
.
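
Independently of the sketch in the document, the idea might look something
like the following in a catalog.bom.  The item IDs, step IDs, step bodies and
the `parameters` entry are hypothetical, and the exact catalog keys are an
assumption; the point is only the extend-and-merge behaviour.

```yaml
# Hypothetical sketch: a reusable workflow type and an extension of it.
brooklyn.catalog:
  items:
    - id: my-task
      item:
        type: workflow
        parameters:
          - name: bucket
            default: my-default-bucket
        steps:
          10-submit: { type: ssh, command: echo submit }
          20-publish: { type: ssh, command: echo publish }
    - id: my-extended-task
      item:
        type: my-task
        steps:
          15-audit: { type: ssh, command: echo audit }           # new label, inserted
          20-publish: { type: ssh, command: echo publish again } # existing label, overwritten
```

With Map.putAll-style merging, `my-extended-task` would end up with steps
10-submit, 15-audit and the replaced 20-publish, in that order.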

TESTS:  Excellent observation.  This would be absolutely transformative for
the brooklyn tests mechanism.  Currently that allows test scripts to be
written in a DSL a bit like the one described here.  The existing
assertions and methods there could be implemented as workflow steps so that
writing tests and viewing and re-running tests all become much nicer.


As for how to proceed:

INCREMENTAL IMPLEMENTATION:  The starting point would be an initial set of
workflow using a small subset of types, usable to declare sensors,
effectors and policies.  I would then see most of
the other tasks able to proceed in parallel, with the exception of the UI
which will need the workflow metadata-as-sensors fleshed out a bit more.

WHO:  I'm maybe best placed to start with the initial set of workflow, but
I'd aim to do that pretty quickly keeping it minimal per ^, and then I'd
welcome collaboration on everything else, much of which is parallelizable.
I will post an update to this list when there is something to look at for the first
set.  You're right I'm raring to go but I'm also very happy to have others
do big swathes of it.  I'm also really grateful for the comments, so keep
them coming.  It'll be a few days before I start, so please LMK if ^^^
makes sense!


Best
Alex






On Tue, 30 Aug 2022 at 23:22, Geoff Macartney <ge...@apache.org> wrote:

> Hi Alex et al.
>
> Here are some thoughts on the proposal.
>
> Cheers
> Geoff
>
> # General thoughts
>
> 1. Adding a procedural "sub-language" like this to Brooklyn could give
> it a whole new level of capability, which is very exciting.
> 2. At the same time this capability could add a whole new dimension of
> complexity and difficulty, so I think it will be very important to
> make sure the implementation and tooling give users lots of support
> for traceability and debugging. It's good to see the emphasis put on
> this in "The Other Big Idea: Traceability and Recoverability",
> hopefully the implementation will also emphasise it.
> 3. "A graphical workflow visualiser is not planned at this time" is
> understandable but I feel probably the more support we can add for
> this in the UI the better the chances of it succeeding.
> 4. Peter's suggestion of a DSL could potentially simplify some of the
> considerations below, what do you think of that suggestion?
>
> # About the workflow language
>
> 1. I find the document currently confusing in terms of what sort of
> language it is describing - is it a workflow, typically expressed as a
> directed graph of nodes (tasks) or a sequence of tasks (which would be
> rather more like a procedural language)?
> 2. if it is a workflow then I'd have thought there shouldn't be any
> notion of ordering ("Step References, Numero-Alphabetic Ordering and
> Extensibility"). Ordering should be defined only by the graph (value
> of "Next" field, and the ids) in this case, don't you think?
> 3. if it is a sequence of tasks then I think the current mechanism of
> ids (the map keys) and Next is awkward. `1-2-http-request` and Next
> puts me in mind of BASIC, with Next playing the role of GOTO. In this
> case I think it has the potential of introducing the same problems as
> GOTO, and might better be done without. Rather it might be preferable
> to express the workflow as an array of steps (sequencing), with
> support for selection (`condition`/`if`), iteration (maybe consider
> introducing `while`? see question below about iteration), and
> "functions" (independently defined named workflows, either in the
> catalog or elsewhere in the blueprint). Ids would be optional, only
> required for steps whose results are referenced elsewhere, not as
> sequencers or labels for Next:
>
> ```yaml
> steps:
>   - id: sparc-job
>     type: container
>     image: my/google-cloud
>     command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
>     env:
>       BUCKET: $brooklyn:config("bucket")
>     on-error: retry
>   - set-sensor: spark-output=${sparc-job.stdout}
> ```
>
> 4. This might also work well with a slightly different markup for
> `condition` (which might be nicer as `if`?), adding a `then` defining
> a (sub) sequence of steps:
>
> ```yaml
> steps:
>   - if:
>       target: ${scratch.skip_date}
>       not: { equals: true }
>       then:
>         - ssh: echo today is `DATE`
>         - <other commands in here...>
> ```
> we could also add `else`.
>
> 5. Can you add some more details about how you see iteration in the
> document - there is that section "Multiple targets and looping" but it
> doesn't really have examples of the latter. Could we introduce a
> `while` construct?
>
> 6. How about renaming `set-workflow-variable` to a simpler `let` or
> `set`? You don't actually give an example of its use, is it something
> like the following? (I'm imagining a convention for a one line
> definition for convenience, with variable name followed by keyword
> `be` to indicate assignment to what follows):
>
> ```yaml
> steps:
>   - let: my-scratch be false
>   - if:
>       target: whatever
>       equals: something
>       then:
>         - let: my-scratch be true
> ```
>
> 7. Could you add some more detail about how independent workflows can
> be defined, e.g. in the catalog (or as a separate workflow in the
> blueprint?) Can these be parameterised, like function definitions? The
> use of `parameters` in the example in section "Request/State Unique
> Identifiers" isn't clear to me.
>
> 8. Would you foresee adding any support for testing all this to the
> Brooklyn tests mechanism? Might be valuable.
>
> # About how to organise implementing all this
>
> 1. Can we sequence work on this to do the simpler bits first and get
> experience with how it works before proceeding to more advanced things
> like nested workflows with per-target conditions ("Multiple Targets
> and looping"). I would particularly like to hope that the UI side of
> things can progress in step with the workflow definitions, rather than
> leave all the UI work to the end.
> 2. (How) could the work for this be spread across the community? I've
> no doubt you're raring to go on this but it would be good if it didn't
> all fall on your shoulders!
>
>
> On Mon, 29 Aug 2022 at 23:51, Geoff Macartney <ge...@gmail.com>
> wrote:
> >
> > Hi Alex,
> >
> > I've done a first pass on the document, and it's very impressive. Adding
> a procedural "sub-language" like this to Brooklyn could give it a whole new
> level of capability, which is very exciting. I have some thoughts on some
> of the details proposed which I will try to write up this week.
> >
> > I share the concerns about YAML which I think Peter expressed very well.
> His suggestion of a DSL instead of YAML is interesting and I think would be
> worth considering. I also have some reservations about some of the
> constructs you're proposing (well, at least one of them) and some perhaps
> relatively minor suggestions for changes in structure. My bigger concern is
> that adding a new programming language within Blueprints like this could
> add a whole new dimension of complexity. I'm asking myself, "how would I
> debug this" when things go wrong. I think that's worth some discussion as
> much as the details of the language. There are also points where I simply
> have questions and would like some more detail.
> >
> > I'll try to get more detailed thoughts written up this week.
> >
> > Cheers
> > Geoff
> >
> >
> >
> > On Sat, 27 Aug 2022 at 00:05, Peter Abramowitsch <
> pabramowitsch@gmail.com> wrote:
> >>
> >> Hi Alex,
> >> I haven't been involved with the Brooklyn team for a long while so take
> >> this suggestion with as little or as much importance as you see at face
> >> value.   Your proposal for a richer specification language to guide
> >> realtime behavior is much appreciated and I think it is a great idea.
> >> You've obviously thought very deeply as to how it could be applied in
> >> different areas of a blueprint.
> >>
> >> My one comment is whether going for a declarative solution, especially
> one
> >> based on YAML is optimal.  Sure Yaml is well known, easy to eyeball,
> but it
> >> has two drawbacks that make me wonder if it is the best platform for
> your
> >> idea.  The first is that it is a format-based language.  Working in
> large
> >> infrastructure projects, small errors can have disastrous consequences,
> so
> >> as little as a missing or extra tab could result in destroying a data
> >> resource or bringing down a complex system.   The other, more
> philosophical
> >> comment has to do with the clumsiness of describing procedural concepts
> in
> >> a declarative language.  (anyone have fun with XSL doing anything
> >> significant?)
> >>
> >> So my suggestion would be to look into DSLs instead of Yaml.  Very nice
> >> ones can be created with little effort in Ruby Python, JS - and even
> Java.
> >> In addition to having the language's own interpreter check the syntax
> for
> >> you, you get lots of freebies such as being able to do line by line
> >> debugging - and of course the obvious advantage that there is no code
> layer
> >> between the DSL and its implementation, whereas with Yaml, someone
> needs to
> >> write the code that converts the grammar into behavior, catch errors
> etc.
> >>
> >> What do you think?
> >>
> >> Peter
> >>
> >> On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <al...@cloudsoft.io>
> wrote:
> >>
> >> > Hi folks,
> >> >
> >> > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> written
> >> > in YAML.
> >> >
> >> > As many of you know, we have a powerful task framework in java, but
> only a
> >> > very limited subset is currently exposed via YAML.  I think we could
> >> > generalize this without a mammoth effort, and get a very nice way for
> users
> >> > to write complex effectors, sensor feeds, etc, directly in YAML.
> >> >
> >> > At [1] please find details of the proposal.
> >> >
> >> > This includes the ability to branch and retry on error.  It can also
> give
> >> > us the ability to retry/resume on an Apache Brooklyn server failover.
> >> >
> >> > Comments welcome!
> >> >
> >> > Best
> >> > Alex
> >> >
> >> >
> >> > [1]
> >> >
> >> >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> >> >
>

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Geoff Macartney <ge...@apache.org>.
Hi Alex et al.

Here are some thoughts on the proposal.

Cheers
Geoff

# General thoughts

1. Adding a procedural "sub-language" like this to Brooklyn could give
it a whole new level of capability, which is very exciting.
2. At the same time this capability could add a whole new dimension of
complexity and difficulty, so I think it will be very important to
make sure the implementation and tooling give users lots of support
for traceability and debugging. It's good to see the emphasis put on
this in "The Other Big Idea: Traceability and Recoverability",
hopefully the implementation will also emphasise it.
3. "A graphical workflow visualiser is not planned at this time" is
understandable but I feel probably the more support we can add for
this in the UI the better the chances of it succeeding.
4. Peter's suggestion of a DSL could potentially simplify some of the
considerations below, what do you think of that suggestion?

# About the workflow language

1. I find the document currently confusing in terms of what sort of
language it is describing - is it a workflow, typically expressed as a
directed graph of nodes (tasks) or a sequence of tasks (which would be
rather more like a procedural language)?
2. if it is a workflow then I'd have thought there shouldn't be any
notion of ordering ("Step References, Numero-Alphabetic Ordering and
Extensibility"). Ordering should be defined only by the graph (value
of "Next" field, and the ids) in this case, don't you think?
3. if it is a sequence of tasks then I think the current mechanism of
ids (the map keys) and Next is awkward. `1-2-http-request` and Next
puts me in mind of BASIC, with Next playing the role of GOTO. In this
case I think it has the potential of introducing the same problems as
GOTO, and might better be done without. Rather it might be preferable
to express the workflow as an array of steps (sequencing), with
support for selection (`condition`/`if`), iteration (maybe consider
introducing `while`? see question below about iteration), and
"functions" (independently defined named workflows, either in the
catalog or elsewhere in the blueprint). Ids would be optional, only
required for steps whose results are referenced elsewhere, not as
sequencers or labels for Next:

```yaml
steps:
  - id: sparc-job
    type: container
    image: my/google-cloud
    command: gcloud dataproc jobs submit spark --BUCKET=gs://${BUCKET}
    env:
      BUCKET: $brooklyn:config("bucket")
    on-error: retry
  - set-sensor: spark-output=${sparc-job.stdout}
```

4. This might also work well with a slightly different markup for
`condition` (which might be nicer as `if`?), adding a `then` defining
a (sub) sequence of steps:

```yaml
steps:
  - if:
      target: ${scratch.skip_date}
      not: { equals: true }
      then:
        - ssh: echo today is `date`
        - <other commands in here...>
```
we could also add `else`.

5. Can you add some more details about how you see iteration in the
document - there is that section "Multiple targets and looping" but it
doesn't really have examples of the latter. Could we introduce a
`while` construct?

6. How about renaming `set-workflow-variable` to a simpler `let` or
`set`? You don't actually give an example of its use, is it something
like the following? (I'm imagining a convention for a one line
definition for convenience, with variable name followed by keyword
`be` to indicate assignment to what follows):

```yaml
steps:
  - let: my-scratch be false
  - if:
      target: whatever
      equals: something
      then:
        - let: my-scratch be true
```

7. Could you add some more detail about how independent workflows can
be defined, e.g. in the catalog (or as a separate workflow in the
blueprint?) Can these be parameterised, like function definitions? The
use of `parameters` in the example in section "Request/State Unique
Identifiers" isn't clear to me.

8. Would you foresee adding any support for testing all this to the
Brooklyn tests mechanism? Might be valuable.

# About how to organise implementing all this

1. Can we sequence work on this to do the simpler bits first and get
experience with how it works before proceeding to more advanced things
like nested workflows with per-target conditions ("Multiple Targets
and looping"). I would particularly like to hope that the UI side of
things can progress in step with the workflow definitions, rather than
leave all the UI work to the end.
2. (How) could the work for this be spread across the community? I've
no doubt you're raring to go on this but it would be good if it didn't
all fall on your shoulders!


On Mon, 29 Aug 2022 at 23:51, Geoff Macartney <ge...@gmail.com> wrote:
>
> Hi Alex,
>
> I've done a first pass on the document, and it's very impressive. Adding a procedural "sub-language" like this to Brooklyn could give it a whole new level of capability, which is very exciting. I have some thoughts on some of the details proposed which I will try to write up this week.
>
> I share the concerns about YAML which I think Peter expressed very well. His suggestion of a DSL instead of YAML is interesting and I think would be worth considering. I also have some reservations about some of the constructs you're proposing (well, at least one of them) and some perhaps relatively minor suggestions for changes in structure. My bigger concern is that adding a new programming language within Blueprints like this could add a whole new dimension of complexity. I'm asking myself, "how would I debug this" when things go wrong. I think that's worth some discussion as much as the details of the language. There are also points where I simply have questions and would like some more detail.
>
> I'll try to get more detailed thoughts written up this week.
>
> Cheers
> Geoff
>
>
>
> On Sat, 27 Aug 2022 at 00:05, Peter Abramowitsch <pa...@gmail.com> wrote:
>>
>> Hi Alex,
>> I haven't been involved with the Brooklyn team for a long while so take
>> this suggestion with as little or as much importance as you see at face
>> value.   Your proposal for a richer specification language to guide
>> realtime behavior is much appreciated and I think it is a great idea.
>> You've obviously thought very deeply as to how it could be applied in
>> different areas of a blueprint.
>>
>> My one comment is whether going for a declarative solution, especially one
>> based on YAML is optimal.  Sure Yaml is well known, easy to eyeball, but it
>> has two drawbacks that make me wonder if it is the best platform for your
>> idea.  The first is that it is a format-based language.  Working in large
>> infrastructure projects, small errors can have disastrous consequences, so
>> as little as a missing or extra tab could result in destroying a data
>> resource or bringing down a complex system.   The other, more philosophical
>> comment has to do with the clumsiness of describing procedural concepts in
>> a declarative language.  (anyone have fun with XSL doing anything
>> significant?)
>>
>> So my suggestion would be to look into DSLs instead of Yaml.  Very nice
>> ones can be created with little effort in Ruby Python, JS - and even Java.
>> In addition to having the language's own interpreter check the syntax for
>> you, you get lots of freebies such as being able to do line by line
>> debugging - and of course the obvious advantage that there is no code layer
>> between the DSL and its implementation, whereas with Yaml, someone needs to
>> write the code that converts the grammar into behavior, catch errors etc.
>>
>> What do you think?
>>
>> Peter
>>
>> On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <al...@cloudsoft.io> wrote:
>>
>> > Hi folks,
>> >
>> > I'd like Apache Brooklyn to allow more sophisticated workflow to be written
>> > in YAML.
>> >
>> > As many of you know, we have a powerful task framework in java, but only a
>> > very limited subset is currently exposed via YAML.  I think we could
>> > generalize this without a mammoth effort, and get a very nice way for users
>> > to write complex effectors, sensor feeds, etc, directly in YAML.
>> >
>> > At [1] please find details of the proposal.
>> >
>> > This includes the ability to branch and retry on error.  It can also give
>> > us the ability to retry/resume on an Apache Brooklyn server failover.
>> >
>> > Comments welcome!
>> >
>> > Best
>> > Alex
>> >
>> >
>> > [1]
>> >
>> > https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
>> >

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Geoff Macartney <ge...@gmail.com>.
Hi Alex,

I've done a first pass on the document, and it's very impressive. Adding a
procedural "sub-language" like this to Brooklyn could give it a whole new
level of capability, which is very exciting. I have some thoughts on some
of the details proposed which I will try to write up this week.

I share the concerns about YAML which I think Peter expressed very well.
His suggestion of a DSL instead of YAML is interesting and I think would be
worth considering. I also have some reservations about some of the
constructs you're proposing (well, at least one of them) and some perhaps
relatively minor suggestions for changes in structure. My bigger concern is
that adding a new programming language within Blueprints like this could
add a whole new dimension of complexity. I'm asking myself, "how would I
debug this" when things go wrong. I think that's worth some discussion as
much as the details of the language. There are also points where I simply
have questions and would like some more detail.

I'll try to get more detailed thoughts written up this week.

Cheers
Geoff



On Sat, 27 Aug 2022 at 00:05, Peter Abramowitsch <pa...@gmail.com>
wrote:

> Hi Alex,
> I haven't been involved with the Brooklyn team for a long while so take
> this suggestion with as little or as much importance as you see at face
> value.   Your proposal for a richer specification language to guide
> realtime behavior is much appreciated and I think it is a great idea.
> You've obviously thought very deeply as to how it could be applied in
> different areas of a blueprint.
>
> My one comment is whether going for a declarative solution, especially one
> based on YAML is optimal.  Sure Yaml is well known, easy to eyeball, but it
> has two drawbacks that make me wonder if it is the best platform for your
> idea.  The first is that it is a format-based language.  Working in large
> infrastructure projects, small errors can have disastrous consequences, so
> as little as a missing or extra tab could result in destroying a data
> resource or bringing down a complex system.   The other, more philosophical
> comment has to do with the clumsiness of describing procedural concepts in
> a declarative language.  (anyone have fun with XSL doing anything
> significant?)
>
> So my suggestion would be to look into DSLs instead of Yaml.  Very nice
> ones can be created with little effort in Ruby Python, JS - and even Java.
> In addition to having the language's own interpreter check the syntax for
> you, you get lots of freebies such as being able to do line by line
> debugging - and of course the obvious advantage that there is no code layer
> between the DSL and its implementation, whereas with Yaml, someone needs to
> write the code that converts the grammar into behavior, catch errors etc.
>
> What do you think?
>
> Peter
>
> On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <al...@cloudsoft.io> wrote:
>
> > Hi folks,
> >
> > I'd like Apache Brooklyn to allow more sophisticated workflow to be
> written
> > in YAML.
> >
> > As many of you know, we have a powerful task framework in java, but only
> a
> > very limited subset is currently exposed via YAML.  I think we could
> > generalize this without a mammoth effort, and get a very nice way for
> users
> > to write complex effectors, sensor feeds, etc, directly in YAML.
> >
> > At [1] please find details of the proposal.
> >
> > This includes the ability to branch and retry on error.  It can also give
> > us the ability to retry/resume on an Apache Brooklyn server failover.
> >
> > Comments welcome!
> >
> > Best
> > Alex
> >
> >
> > [1]
> >
> >
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
> >
>

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Peter Abramowitsch <pa...@gmail.com>.
Hi Alex,
I haven't been involved with the Brooklyn team for a long while so take
this suggestion with as little or as much importance as you see at face
value.   Your proposal for a richer specification language to guide
realtime behavior is much appreciated and I think it is a great idea.
You've obviously thought very deeply as to how it could be applied in
different areas of a blueprint.

My one comment is whether going for a declarative solution, especially one
based on YAML is optimal.  Sure Yaml is well known, easy to eyeball, but it
has two drawbacks that make me wonder if it is the best platform for your
idea.  The first is that it is a format-based language.  Working in large
infrastructure projects, small errors can have disastrous consequences, so
as little as a missing or extra tab could result in destroying a data
resource or bringing down a complex system.   The other, more philosophical
comment has to do with the clumsiness of describing procedural concepts in
a declarative language.  (anyone have fun with XSL doing anything
significant?)

So my suggestion would be to look into DSLs instead of Yaml.  Very nice
ones can be created with little effort in Ruby, Python, JS - and even Java.
In addition to having the language's own interpreter check the syntax for
you, you get lots of freebies such as being able to do line by line
debugging - and of course the obvious advantage that there is no code layer
between the DSL and its implementation, whereas with Yaml, someone needs to
write the code that converts the grammar into behavior, catch errors etc.

What do you think?

Peter

On Wed, Aug 24, 2022 at 8:44 AM Alex Heneveld <al...@cloudsoft.io> wrote:

> Hi folks,
>
> I'd like Apache Brooklyn to allow more sophisticated workflow to be written
> in YAML.
>
> As many of you know, we have a powerful task framework in java, but only a
> very limited subset is currently exposed via YAML.  I think we could
> generalize this without a mammoth effort, and get a very nice way for users
> to write complex effectors, sensor feeds, etc, directly in YAML.
>
> At [1] please find details of the proposal.
>
> This includes the ability to branch and retry on error.  It can also give
> us the ability to retry/resume on an Apache Brooklyn server failover.
>
> Comments welcome!
>
> Best
> Alex
>
>
> [1]
>
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
>

Re: Brooklyn Feature Proposal - Declarative and Retryable Workflow

Posted by Thomas Bouron <th...@cloudsoft.io>.
Hi Alex.

Thank you for the proposal, it's much more detailed than I expected. This
is a massive +1 from me just for the fact this lowers the barrier to entry
by a long margin.

It's quite ambitious but it seems to be the last piece of the puzzle to
fully leave Java behind for a Brooklyn user!

Best.

On Wed, 24 Aug 2022 at 17:44, Alex Heneveld <al...@cloudsoft.io> wrote:

> Hi folks,
>
> I'd like Apache Brooklyn to allow more sophisticated workflow to be written
> in YAML.
>
> As many of you know, we have a powerful task framework in java, but only a
> very limited subset is currently exposed via YAML.  I think we could
> generalize this without a mammoth effort, and get a very nice way for users
> to write complex effectors, sensor feeds, etc, directly in YAML.
>
> At [1] please find details of the proposal.
>
> This includes the ability to branch and retry on error.  It can also give
> us the ability to retry/resume on an Apache Brooklyn server failover.
>
> Comments welcome!
>
> Best
> Alex
>
>
> [1]
>
> https://docs.google.com/document/d/1u02Bi6sS8Fkf1s7UzRRMnvLhA477bqcyxGa0nJesqkI/edit?usp=sharing
>


-- 
Thomas Bouron
Lead Software Engineer

*Cloudsoft <https://cloudsoft.io/> *| Bringing Business to the Cloud

GitHub: https://github.com/tbouron
Twitter: https://twitter.com/eltibouron